Thursday 12 July 2012, 11:38
Hi, I'm Richard Cooper, the BBC's Controller of Digital Distribution for BBC Future Media.
As some of you will have noticed, we suffered a major failure of BBC Online last night. The site started to fail at 20:10, and by 20:25 was completely down. It stayed down until 21:10, when it started to recover, and by 21:30 the site was back. Some of you may then have experienced problems accessing some pages between 21:55 and 22:10 as we restored full resilience, and from 22:10 onwards we were back to full operation.
The problem was caused by a failure of the traffic managers in both our data centres.
These traffic managers are a critical part of our infrastructure: they handle every request to the site and route each one to the right servers. They are designed to be highly reliable, and have served us very well to date.
We are still investigating the root cause of this incident, and I would like to apologise for any inconvenience that this outage may have caused. We are working hard to make sure that the causes of the issue are addressed, and that this does not happen again. I will keep you updated on this blog in the coming days.
Richard Cooper is Controller of Digital Distribution, BBC Future Media