Will You Comment On This Post?
On the morning of Thursday October 9th, there was a major code release - three months' worth of updates and fixes were scheduled to go live after extensive testing on the staging server. Usually, releases happen monthly, but because of the Olympics, we didn't release over the summer.
The DNA team had planned on rolling out between 7 and 10am, but because of the popularity of Peston's Picks, the blog team felt we couldn't take any chances.
So the developers came in early and took down DNA between 4 and 6am. By 7am, everything was done, the software was back up and everything was working fine. All was still well at 10am, until the central communities team picked up a small problem: some threads on a single message board were being automatically closed. It wasn't a major problem, but it was one that needed an urgent fix.
Paul and Jay from our communities team ran between their desks and the developer team to try to identify and isolate the problem. Working closely with Mark and Mark on Martin's DNA developer team, they quickly identified the offending line of code.
Because DNA is used in so many different ways on so many different parts of the BBC website, it's very difficult to set up tests that exactly mimic every possible permutation. Very occasionally, a problem will slip through the net.
In less than an hour, the problem was identified and the one line of code was fixed. Fewer than 500 threads were affected, but they still needed to be reopened. Once again, Paul and the DNA team jumped in and put together a script implementing some of our moderation rules, sort of in reverse, to fix the problem. Shortly after 11am, this was run, and we were back to normal.
Or so we thought.
Sport contacted us saying that a lot of their threads were still closed - the fix had apparently not worked for them. The 606 service is by far the biggest single platform using DNA. Luckily, this turned out to be a caching issue: the 606 boards are so big that we have to cache them. The cache was cleared and 606 was back to its wild ways.
Not everyone was happy, though, as one 606 user posted:
Why am I getting this message, I've been a member since 2006 not yesterday, how is that new 606 has been a complete mess today, sort it out
At the end of business on Thursday, it was also clear that we had another problem: the so-called "new user hole". Some users were identified as brand new, despite having been around for a while. This also meant their comments didn't show up immediately, as they were pre-modded in accordance with our house rules.
However, this turned out to affect only 24 users out of tens of thousands, and was fixed by close of business on Friday.
By noon on Thursday, members of the DNA team had already put in more than eight hours at the office. The central communities team had helped them to fix all the problems less than two hours after the first issues cropped up.
There was no data loss at any stage, everything was back to normal and the central communities team was just in time for a boring weekly team meeting.
Join in - leave a comment and become a DNA user yourself.
Tom Van Aardt is Communities Editor, bbc.co.uk.