
BBC Online Outage on Tuesday 29 March 2011

Richard Cooper | 09:47 UK time, Wednesday, 30 March 2011

As many of you will have noticed (and reported on Twitter), the whole of BBC Online was down last night for an hour from 22:40 due to a major network incident. We would like to apologise to everyone who was unable to access BBC Online during this outage.

Our systems are designed to be sufficiently resilient (multiple systems, and multiple data centres) to make an outage like this extremely unlikely.  However, I'm afraid that last night we suffered multiple failures, with the result that the whole site went down. Enough of the systems were restored to bring BBC Online pretty well back to normal by 23:45, and we were fully resilient again by 04:00 this morning.

For the more technically minded, this was a failure in the systems that perform two functions.  The first is the aggregation of network traffic from the BBC's hosting centres to the internet.  The second is the announcement of 'routes' onto the internet that allows BBC Online to be 'found.'  With both of these having failed, we really were down!
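As a rough illustration of why losing route announcements makes a site unreachable, here is a minimal Python sketch. It is a toy model, not the BBC's actual setup: the `RouteTable` class, the `bbc-hosting` label and the `212.58.224.0/19` prefix are assumptions made up for the example. It only shows that once a prefix is withdrawn, other networks have no path to any address inside it.

```python
import ipaddress


class RouteTable:
    """Toy model of the routes the rest of the internet has learned (hypothetical)."""

    def __init__(self):
        self.routes = {}  # maps an announced prefix to a label for its origin

    def announce(self, prefix, origin):
        # An announcement tells other networks where to send traffic for this prefix.
        self.routes[ipaddress.ip_network(prefix)] = origin

    def withdraw(self, prefix):
        # A withdrawal (or a failed announcer) removes the prefix again.
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        # Longest-prefix match: the most specific covering route wins.
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None  # no route, so the destination cannot be "found"
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


table = RouteTable()
table.announce("212.58.224.0/19", "bbc-hosting")  # assumed prefix, for illustration only
before = table.lookup("212.58.224.21")

table.withdraw("212.58.224.0/19")
after = table.lookup("212.58.224.21")

print(before, after)
```

In this sketch the servers themselves never change; only the announcement does, which is enough to make them unreachable from outside.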

We'll be taking a very hard look at what we need to do to make sure that this doesn't happen again.

Richard Cooper is Controller, Digital Distribution, BBC Future Media.

Comments

  • Comment number 1.

    Outside attack?

  • Comment number 2.

    Are you also looking at having secondary offsite DNS servers?

    I couldn't even get a non-authoritative reply to a `dig bbc.co.uk`. Your DNS was offline (probably due to your routing failure).

    You've got four name servers
    ns1.bbc.co.uk. 710 IN A 132.185.132.21
    ns1.thdo.bbc.co.uk. 49979 IN A 212.58.224.21
    ns1.thls.bbc.co.uk. 49979 IN A 132.185.240.21
    ns1.rbsov.bbc.co.uk. 49979 IN A 212.58.227.48

    but only two networks hosting them.
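How many "networks" the four addresses above sit on depends on where the network boundary is drawn. The Python sketch below groups them by an assumed prefix length; the `/16` and `/24` boundaries and the `group_by_prefix` helper are illustrative choices, not the BBC's real allocations.

```python
import ipaddress
from collections import defaultdict

# The four name servers and addresses quoted in the comment above.
NAME_SERVERS = {
    "ns1.bbc.co.uk.": "132.185.132.21",
    "ns1.thdo.bbc.co.uk.": "212.58.224.21",
    "ns1.thls.bbc.co.uk.": "132.185.240.21",
    "ns1.rbsov.bbc.co.uk.": "212.58.227.48",
}


def group_by_prefix(servers, prefix_len):
    """Group server names by the network containing each address (hypothetical helper)."""
    groups = defaultdict(list)
    for name, addr in servers.items():
        # strict=False lets us derive the enclosing network from a host address.
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[str(net)].append(name)
    return dict(groups)


by_16 = group_by_prefix(NAME_SERVERS, 16)  # assumed coarse boundary
by_24 = group_by_prefix(NAME_SERVERS, 24)  # assumed fine boundary

for label, groups in (("/16", by_16), ("/24", by_24)):
    print(label, len(groups), "network(s)")
    for net, names in sorted(groups.items()):
        print("  ", net, names)
```

Neither grouping says anything about which Autonomous System announces the addresses; that would need routing data rather than the addresses alone.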

  • Comment number 3.

    *29th* March...

  • Comment number 4.

    Do you mean the headline "BBC Online Outage on Tuesday 23rd March 2011" when the article was written on the 30th and mentions "last night" and "yesterday"? Maybe the BBC Online Server(s)'s clock was also reset?

  • Comment number 5.

    @dr_salleee, @Richard: give the guy a break, I don't think he got much sleep last night :-/

  • Comment number 6.

    @brendan :) I won't mention that the article initially even missed out the word March, then!

  • Comment number 7.

    Thanks for your support Brendan. It was of course the 29th (i.e. last night) - my error which I am now endeavouring to correct.

    Apologies

  • Comment number 8.

    Beneath this calm blog post, I'm picturing someone receiving a proper Malcolm Tucker style bollocking. Please tell me that's the case?! Hearing everything's fine, all smiles and chamomile tea would be somewhat disappointing...

  • Comment number 9.

    The BBC News report on this ( http://www.bbc.co.uk/news/technology-12904586 ) keeps being updated, which is nice. I rather liked the first version I read, which hinted that the technical solution deployed was to switch the BBC off, wait 10 seconds, then switch it back on again. This colourful little detail vanished from later versions.

    Curiously however, the 'Last updated' time remains ever fixed at 08:51 through all the several updates to the report.

  • Comment number 10.

    Synchronium - I agree. I'm going to keep that image in my mind no matter where the truth lies. I still appreciate an honest blog post with comments like this.

  • Comment number 11.

    My first port of call when it all went down was to check News 24. I thought there might be an explanation there, or at least an acknowledgment on the on-screen ticker.

    I don't know if this is true but I would expect that the BBC's online news reaches far more millions of people than the BBC's TV news these days.

    Have we not reached a point where TV should in some instances serve the web? What I mean is, if BBC TV went down, there would be an instant announcement on the website. Why not the other way round??

  • Comment number 12.

    Let's all get over it.

    It's a website. It went down for a couple of hours. Bad things happen. The world keeps turning and, thankfully, the BBC isn't as important as it thinks it is.

  • Comment number 13.

    have to laugh at the dip stick below giving you technical advice. LOL.

  • Comment number 14.

    Besides which, he is wrong: the nameservers are on four different networks, but they are being announced from the same Autonomous System.

  • Comment number 15.

    iPlayer is looking different and behaving oddly (as in showing as coming up next programmes that have already been broadcast). Either it's a deliberate revamp (inc glitch) or a spin-off from the outage?

  • Comment number 16.

    "announcement of 'routes' onto the internet that allows BBC Online to be 'found'." What is that? how about in interweb speak, does it mean DNS? I fear the BBC is heading for 'series of tubes' moment (google it).

    Say what it is! Don't explain it with a personal understanding. If you say what it is and people don't understand, they can research and learn, or at least type it into Google; if you write nonsense like this, people just have to guess what you mean.

    You (BBC) wouldn't replace the word 'orange' with 'a colour half way between yellow and red', so please don't do it with technology. (My point is, it's easier to look up 'orange' than 'a colour half way between yellow and red', especially so if your first language is not English.)

  • Comment number 17.

    @Guy (#15) I *think* the outage may have had a knock-on effect upon some of the processes which feed data into the systems which drive iPlayer; it should catch up after a while.

  • Comment number 18.

    Is this related to the fact that After Midnight with Linley Hamilton which went out at 12:05 am Monday, 28th March is silent on the iPlayer?

    Or was this another 'extremely unlikely' event?

  • Comment number 19.

    Would these problems have anything to do with the fact that at 7.30 Tuesday evening any recording we tried to set up showed as 'Sunday 27th March'? If we linked our recorder to BBC 1 that was the default date, whereas BBC 2 showed the date and time correctly.

  • Comment number 20.

    In Reply to : "announcement of 'routes' onto the internet that allows BBC Online to be 'found'." What is that? how about in interweb speak, does it mean DNS? I fear the BBC is heading for 'series of tubes' moment (google it) "

    This is a common term. "Announcing routes" is used globally by members of the technical community, as noted in the blog for the "technically minded".

    To expand on the blog posting, for announcing / advertising routes have a look at: http://en.wikipedia.org/wiki/Border_Gateway_Protocol


    Human error is a common cause of BGP faults, as in the cases of Facebook occasionally being blocked in many parts of the Middle East, or when AT&T traffic ended up being routed via China. Sometimes naughty people also attempt to poison and disrupt routing by publishing incorrect routes into the upstream networks.

    Seeing a couple of IP addresses for DNS servers doesn't mean they only have 4. IPs can be BGP anycasted to anywhere in the world, e.g. Google's DNS server, 8.8.8.8, translates into many addresses hidden behind the gateways.
    http://en.wikipedia.org/wiki/Anycast

    If you'd like to see further, search for looking glass servers, e.g. http://www.bgp4.as/looking-glasses .

  • Comment number 21.

    Just to point out, the hyperlink within "Topical posts on this blog" still states "Tuesday 23rd 2011".

  • Comment number 22.

    Verbal Spillage - this is a bug which we are aware of. Thanks.

  • Comment number 23.

    Oh dear! He has almost written the two things which indicate that such and such will certainly happen again:

    'lessons have been learned' and 'robust systems are now in place to ensure that this does not happen again'. If you see that coming from the NHS, a recurrence is inevitable.

  • Comment number 24.

    William - Richard has not said either of these things. The NHS is off topic.

    Thanks

  • Comment number 25.

    Stuff happens. No one died, move on.

  • Comment number 26.

    7. At 10:55am on 30th Mar 2011, Nick Reynolds wrote:
    Thanks for your support Brendan. It was of course the 29th (i.e. last night) - my error which I am now endeavouring to correct.

    Apologies


    Your error? So who wrote this blog? What part did you play in this blog?

  • Comment number 27.

    OfficerDibble - as the blog editor I am currently doing all the work required to input finished copy and publish it. The text of the blog was written by Richard. I put in the wrong date in the title by accident (as I've already explained).

    Thanks

  • Comment number 28.

    Thanks Mo (#17). The glitches on iPlayer seem to be sorted out but it looks like there has been (another) revamp in design - the change looks v permanent - even though I haven't seen anything said or blogged about it.

    A pity as I don't find it as good as it was. Not only is it more clunky but also more black and white and info when playing, especially if buffering, doesn't stand out so much. Better as it was with playing and info below programme screen rather above (IMHO).

  • Comment number 29.

    Synchronium: I can absolutely guarantee that there are people getting a massive Malcolm Tucker style bollocking over this! Having worked at the BBC and Siemens for some years in the past, I know this wouldn't have been taken lightly. Rest assured that some people will have been dragged over some serious coals for this. It won't have been pretty. In fact, I imagine the aftermath will still be going on for at least another week from now.

  • Comment number 30.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 31.

    On the bright side, since both went down this time, the epic work to overturn the outages should mean that there will be less chance of one or both happening again.

  • Comment number 32.

    I know the BBC is making cuts, but don't u think u should have a backup team of hamsters for when the first team hamsters get tired out? :)

  • Comment number 33.

    On the Newsworthiness of the BBC's outage....

    I am a little concerned that I failed to hear any mention of the BBC's internet being down on any of the BBC's news services on TV or Radio.

    Was this because the BBC did not think it was newsworthy, or because a tragedy at home is best hushed up? The twitterati went into overdrive with conspiracy theories which could have been scotched if only the BBC radio or TV news had made mention of the domestic incident in a timely manner. The lesson is: it is better to own up quickly, as this reduces the collateral damage.

  • Comment number 34.

    Huh, just an hour. I'm in Thailand and my local online English newspaper has been on the blink for over a week. You guys should try living in the 3rd world for a bit :D

  • Comment number 35.

    Richard, do you know if the outage affected the commenting system as well... seems to be hanging on occasion despite trying the BBC site on a number of different computers/ browsers? Hopefully this goes through!

  • Comment number 36.

    What a shambles: BBC News is still offline, most annoying, as at 14.10hrs Tuesday 12th April.

  • Comment number 37.

    I visited this page first time and found it Very Good Job of acknowledgment and a marvelous source of info.........Thanks Admin!

    [Unsuitable/Broken URL removed by Moderator]

  • Comment number 38.

    Unbelievable!? Is this the best you can come out with to earn our millions in TV licence? The BBC does not have the most intelligent on its staff; they are out here watching, reading and listening. Unless your purchase department is corrupt, any credible IT supplier and designer would have ensured that "THE BBC" was well backed up to do the job they did for almost a century. I suspect the BBC needed to get rid of much unbearable criticism from their audience and its handling of news and politics. Shake up! Clean your act!

 
