Thursday 16 August 2012, 16:10
The BBC Olympics architecture overview, showing just how many components are involved.
Hi, I'm Matthew Clark, the Senior Technical Architect for BBC Online's Olympic website and apps.
Alongside colleagues Mike Brown and David Holroyd, it's been my responsibility to create the technical strategy that has allowed us to produce successful online Olympic products.
We've focused on the design and development to make sure the site and apps stay reliable and can handle high traffic loads, whilst offering more content than ever before. In this technical blog post I'll be looking at some of the challenges we faced and how we overcame them.
More traffic: Handling unprecedented audience levels
We expected the Olympics would drive far more traffic to our site than ever before, and it did. Planning for this load was not easy. There are over 60,000 dynamically generated pages, many with a significant amount of content on them, so efficient page generation is vital.
Content needs to be as 'live' as possible, so long-term caching is not an option. We use a range of caches (including Content Delivery Networks, Varnish, and mod_cache) to offload the bulk of the traffic from our Apache web servers. For content that's dynamic, cache lifespan (max-age) varies between a few seconds and a few minutes, depending on the context. This is particularly true for the new video player, which needs the latest data every few seconds to complement the live video stream.
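To make the idea concrete, here is a minimal sketch of that kind of cache-lifetime policy: very short max-age values for live video-player data, slightly longer ones for near-live data, and a few minutes for general pages. The content types and the specific numbers are illustrative assumptions, not the BBC's actual configuration.

```python
# Illustrative cache-lifetime policy: shorter max-age for more 'live' content.
# Content-type names and values are hypothetical examples.

def cache_max_age(content_type: str, event_is_live: bool) -> int:
    """Return a Cache-Control max-age (in seconds) for a piece of content."""
    if content_type == "video-player-data" and event_is_live:
        return 5            # latest data every few seconds alongside the stream
    if content_type in ("scores", "schedule"):
        return 30           # near-live data, cached briefly
    return 300              # general pages: a few minutes is acceptable

def cache_control_header(content_type: str, event_is_live: bool = False) -> str:
    """Build the Cache-Control header value the caches would honour."""
    return f"public, max-age={cache_max_age(content_type, event_is_live)}"
```

A CDN, Varnish, and mod_cache can all honour the same `Cache-Control` header, which is what lets one policy offload traffic at every layer at once.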
Page generation is done using PHP, which is stateless and receives all of its data through calls to a RESTful API. This API is the Java application layer that retrieves its data from a range of data stores (including MySQL databases, triple stores, and XML content stores). It's the most critical layer from a performance point-of-view: load is high, calls can involve significant processing and multiple data store calls, and there are limits to what can be parallelised. Caching (mod_cache and Memcached) is again used to absorb the bulk of the traffic.
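The caching pattern in that application layer is the classic cache-aside one: check the cache before doing expensive data-store work, and store the result under a key derived from the request. Here is a small sketch of the pattern in Python (the real layer is Java, and a plain dict stands in for Memcached); all names are hypothetical.

```python
import hashlib
import time

# Cache-aside sketch: an in-process dict stands in for Memcached, and
# fetch_from_stores stands in for the expensive MySQL/triple-store work.

class CachingApi:
    def __init__(self, fetch_from_stores, ttl: int = 60):
        self._fetch = fetch_from_stores   # expensive call to the data stores
        self._ttl = ttl                   # cache lifespan in seconds
        self._cache = {}                  # stand-in for Memcached

    def get(self, resource: str):
        key = hashlib.md5(resource.encode()).hexdigest()
        hit = self._cache.get(key)
        if hit and hit[0] > time.time():
            return hit[1]                 # cache hit: no data-store call
        value = self._fetch(resource)     # cache miss: do the expensive work
        self._cache[key] = (time.time() + self._ttl, value)
        return value
```

With a short TTL, repeated requests within the cache lifespan hit the cache and never reach the data stores, which is what keeps the critical layer's load manageable.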
From data stores to screens, sites and apps
We spent considerable time modelling how traffic creates load on the whole stack. This was first done theoretically (through modelling of user behaviour, load balancing, and caching). We then did it for real - we used data centres around the world to place load equivalent to over a million concurrent users on the site, to confirm everything worked during busy Games load.
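A back-of-envelope version of that theoretical modelling can be sketched as follows: given concurrent users, per-user request rates, and per-layer cache hit ratios, estimate how many requests per second actually reach the origin servers. All the numbers here are illustrative assumptions, not the BBC's real figures.

```python
# Capacity-planning sketch: how much traffic survives each cache layer
# and lands on the origin. All parameters are illustrative.

def origin_rps(concurrent_users: int,
               requests_per_user_per_min: float,
               cdn_hit_ratio: float,
               varnish_hit_ratio: float) -> float:
    """Estimate requests/second reaching the origin after cache offload."""
    total_rps = concurrent_users * requests_per_user_per_min / 60.0
    past_cdn = total_rps * (1.0 - cdn_hit_ratio)          # CDN misses
    past_varnish = past_cdn * (1.0 - varnish_hit_ratio)   # Varnish misses
    return past_varnish

# e.g. a million concurrent users making 6 requests/minute, with 90% of
# traffic absorbed by the CDN and 80% of the remainder by Varnish, leaves
# roughly 2,000 requests/second for the origin to handle.
```

The value of a model like this is that it shows how sensitive the origin is to cache hit ratios, which is exactly what the real-world load tests then confirmed.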
More video: handling 2,500 hours of coverage
A moment when all 24 streams were running at the same time. Can you identify what they are?
There's been plenty of discussion about the BBC's 24 video streams, and the challenges of creating them at the International Broadcast Centre by the Olympic Park. This is the equivalent of 24 new channels, offered over cable and satellite as well as via IP.
Once the channels are created, the challenge is to direct viewers to the right content at the right times.
Sport schedules have a habit of changing - extra time, delays due to rain, and so on - and the sites and apps need to reflect this. When an event starts, the tools used by (human) controllers to manage the video also log the start in our XML content store. This metadata is then picked up by pages and apps (via an API) so that, within a minute of the event starting, there are links to the content throughout the site, app, and red button. A similar process happens when coverage finishes. Olympic sessions can be as short as 45 minutes, so the faster a video stream can be made available, the better.
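The flow above can be sketched as a tiny event-lifecycle model: the control tooling writes state changes into a store, and clients derive links from whatever is currently live. A dict stands in for the XML content store, and all field names are hypothetical.

```python
# Event-lifecycle sketch: controllers log state changes; pages and apps
# derive live links from the stored state. The dict stands in for the
# XML content store.

events = {}

def log_event_state(event_id: str, state: str) -> None:
    """Called by the video-control tooling when coverage starts or finishes."""
    assert state in ("upcoming", "live", "finished")
    events[event_id] = state

def live_stream_links() -> dict:
    """What pages and apps pick up via the API: links only for live coverage."""
    return {eid: f"/olympics/live/{eid}"
            for eid, state in events.items() if state == "live"}
```

Because the clients only ever read the current state, a rain delay or a session overrunning into extra time needs no change on the client side at all - the controllers update the state, and the links follow within a refresh.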
More screens: coverage on mobiles, tablets, computers and internet connected TV
The BBC has a four screen strategy where we develop for PCs, tablets, mobiles, and internet connected TV. For the Games we've offered an unprecedented amount of content to all four. In addition, there are Olympic apps for iOS and Android smartphones, a Facebook app, foreign language content for World Service sites, and a red button service for satellite and cable TVs.
Our architecture is the classic multi-tier approach - pushing as much logic as possible into shared components, so that the amount of development for each interface is as low as possible. This is DRY (Don't Repeat Yourself) at a multi-platform level. For web pages, a single PHP codebase creates both the desktop and mobile versions. The same approach extends to the iOS and Android apps, which use PhoneGap to 'wrap' the mobile website for most of their functionality, saving us from rewriting it in native code. Certain other applications, such as the Olympic Facebook app, are different enough to warrant their own codebase, but still make the same API calls to the Java application layer, where most of the 'business' logic is held.
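The multi-tier DRY idea can be illustrated in a few lines: the 'business' logic lives once in the shared layer, and each interface is a thin renderer over the same response. This is a purely illustrative sketch - the function names and data are made up.

```python
# DRY at a multi-platform level: one shared data source, many thin renderers.
# Names and figures are illustrative only.

def medal_table_api() -> list:
    """Shared application layer: the data is computed in exactly one place."""
    return [{"country": "GBR", "gold": 29}, {"country": "USA", "gold": 46}]

def render_desktop(rows: list) -> str:
    """Desktop page: a fuller view of the same data."""
    return "\n".join(f"{r['country']}: {r['gold']} gold" for r in rows)

def render_mobile(rows: list) -> str:
    """Mobile page: a compact view of the same data."""
    return " | ".join(r["country"] for r in rows)
```

Adding a new platform (a Facebook app, a connected-TV view) then means writing only a renderer, never re-implementing the logic that produced the data.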
More content: Data is power
Video aside, there is a wealth of data required to make the Olympic site. The primary source is the Olympic Data Service, which blog posts from Oli Bartlett and Dave Rogers have already covered in depth. In brief, Olympic Broadcasting Services (OBS) provide a comprehensive data feed covering all sports - everything from latest scores to medal tables. This, combined with stories from journalists, and other sources such as Twitter, creates the content for tens of thousands of results, athletes, country, and event pages.
The Dynamic Semantic Publishing (DSP) model, which understands relationships (triples) between all content and concepts, is the process that ensures everything automatically appears in the right place. All created content, including stories, medals, and world records, is tagged (normally automatically) with the appropriate athletes, sports and countries. This causes the content to appear on the appropriate page without human intervention.
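A minimal sketch of the idea: content is linked to concepts as (subject, predicate, object) triples, and a concept's page is built by querying the triples rather than by hand-curation. The real system uses a proper triple store; here a Python set stands in, and the predicate and identifiers are hypothetical.

```python
# DSP in miniature: tag content with concepts as triples, then build each
# concept's page by querying them. A set stands in for the triple store.

triples = set()

def tag(content_id: str, concept: str) -> None:
    """Tag a piece of content (normally done automatically on publish)."""
    triples.add((content_id, "about", concept))

def page_contents(concept: str) -> list:
    """Everything that should automatically appear on a concept's page."""
    return sorted(s for s, p, o in triples if p == "about" and o == concept)

# Tagging one story and one medal record:
tag("story-100m-final", "usain-bolt")
tag("story-100m-final", "athletics")
tag("medal-mens-100m", "usain-bolt")
```

After those three tags, the "usain-bolt" page automatically contains both the story and the medal record, and the "athletics" page contains the story - no editor placed anything by hand.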
In essence, it's this automatic curation of pages that has allowed us to offer such a broad range of products. The automation has kept maintenance to a minimum, freeing journalists to focus on writing content. It's allowed multiple products and thousands of pages to stay up-to-date without a large operational overhead.
More testing: Simulating an entire Olympics
With all this content, data, video, and technology, comes a huge engineering challenge: how do you test it? All development areas follow Test Driven Development (TDD) so there is no shortage of automated unit and component tests. But what happens when you plug everything together? How can you be sure that the right medals go to the right country, or that video works on all devices, or that results appear correctly for all 36 sports? Unlike, say, the football season, the Olympics only happen once every four years, and only last a couple of weeks.
There are no second chances.
We needed to be 100% sure that on day one of the Games, everything would work as expected.
To tackle this we set up an entire team, as big as any development team, with the job of proving everything would work when the Games started. We took a three-pronged approach to that proof.
This testing process lasted several months and caught a significant number of bugs and performance problems. Fortunately it paid off - I was a little nervous on the first day of the Games, but it passed without incident.
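One way to picture an end-to-end check in this spirit: replay simulated Games data through the pipeline and assert the outcome, for instance that a medal lands against the right country. The pipeline here is a deliberately trivial stand-in for the real ingest, and every name in it is hypothetical.

```python
# End-to-end check sketch: feed simulated results through a stand-in for
# the real ingest pipeline and assert the outcome.

def process_result_feed(feed: list) -> dict:
    """Stand-in for the real ingest: turn result events into medal lists."""
    medals = {}
    for event in feed:
        medals.setdefault(event["country"], []).append(event["medal"])
    return medals

def test_medal_goes_to_right_country():
    feed = [{"country": "JAM", "medal": "gold", "event": "men's 100m"}]
    medals = process_result_feed(feed)
    assert medals["JAM"] == ["gold"]     # the right country got the medal
    assert "GBR" not in medals           # and nobody else did
```

Run across a full simulated Games schedule, checks like this are what give confidence that day one will pass without incident.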
More for the future
With the end of the Games fast approaching, attention now turns to other areas of the Sport website.
Some features have already been applied throughout - for example, most live video coverage will be in HD from now on. Other services that we've offered for Olympics aren't yet ready for use elsewhere (mobile apps and video chapter points being two). Over the coming months we'll be working on bringing many of these features to the rest of Sport, and perhaps other parts of BBC Online too.
If you've any questions about the technology we've used during the Olympics, please get in touch using the comments below.
Matthew Clark is Senior Technical Architect, Knowledge & Learning, BBC Future Media