Thursday 12 July 2012, 11:38
Hi, I'm Richard Cooper, the BBC's Controller of Digital Distribution for BBC Future Media.
As some of you will have noticed, we suffered a major failure of BBC Online last night. The site started to fail at 20:10, and by 20:25 was completely down. It stayed down until 21:10, when it started to recover, and by 21:30 the site was back. Some of you may then have experienced problems accessing some pages between 21:55 and 22:10 as we restored full resilience, and from 22:10 onwards we were back to full operation.
The problem was caused by a failure of the traffic managers in both our data centres.
These traffic managers are a critical part of our infrastructure, responsible for handling all requests to the site, and routing those requests to the right servers. They are designed to be highly reliable, and have served us very well to date.
We are still investigating the root cause of this incident, and I would like to apologise for any inconvenience that this outage may have caused. We are working hard to make sure that the causes of the issue are addressed, and that this does not happen again. I will keep you updated on this blog in the coming days.
Richard Cooper is Controller of Digital Distribution, BBC Future Media
Comment number 1.
Riz, 12th July 2012 - 12:33
Was that a hardware or a software failure of the traffic managers?
Comment number 2.
Jules, 12th July 2012 - 12:43
Your post suggests there are two physical systems, so presumably there are separate traffic managers with redundancy in each data centre (or is that not correct)?
If not why not, and if so, what took both out at the same time - if indeed that is what happened, because one can only assume some kind of attack . . .
Comment number 3.
Dr Prithwiraj Das, 12th July 2012 - 12:43
At one point (~2130) BBC iPlayer told me I needed to be in the UK (I am in the UK and TV-license paying), which made me check my broadband connection (which turned out fine).
This post makes it clearer that it probably was the traffic managers at the BBC data centres then trying to reroute me while they were sputtering back to life. Thanks.
Comment number 4.
seeyouonthewaydown, 12th July 2012 - 12:58
"website_outage_june_11_2012.html" or July 11th?
Comment number 5.
Ian McDonald, 12th July 2012 - 13:10
@SeeYouOnTheWayDown
Well spotted. Since changing the web address would break any existing links, I won't change it; but as the blog post says, the outage was on July 11th.