Music Showcase, a development perspective
We've recently released the latest addition to the BBC Music website - The AV Showcase. As a development team we're very proud of the work we've done, and excited about developing the showcase in the future. As we've taken some innovative decisions on the architecture of the Showcase we'd like to share them with you in more detail and welcome comments and ideas from the wider development community.
One of the key design challenges we faced in this project was allowing you to listen to audio or video clips while browsing the Showcase to find more content. There were many ways we could have achieved this. Some sites use a pop-up "player" window, while others embed a player in an iframe on the page. As Sacha discusses in his post on the Showcase's design, we really wanted the playing interface to feel like part of the whole site, and to be seamlessly integrated into the navigation, while still allowing the focus - the music - to be uninterrupted by exploration of other content.
One of the traditional weaknesses of an Ajax application is the lack of web pages to back it up. By checking for the XMLHttpRequest request header, we can vary whether we serve the full page or just the content required by an Ajax request. The result is a full 'static' site that backs up the Ajax version.
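To illustrate the idea, here is a minimal Python sketch of switching between a full page and a content fragment. It is not our production code: the header name X-Requested-With, the function names and the page template are all assumptions made for the example.

```python
# Hypothetical page wrapper; the real site templates are not shown in this post.
PAGE_TEMPLATE = (
    "<html><body>"
    "<div id=\"nav\">navigation</div>"
    "<div id=\"content\">{content}</div>"
    "</body></html>"
)


def is_ajax(headers):
    """Return True if the request looks like it came from XMLHttpRequest.

    Assumption: the client marks Ajax requests with the conventional
    'X-Requested-With: XMLHttpRequest' header. Header names are
    compared case-insensitively.
    """
    normalised = {name.lower(): value for name, value in headers.items()}
    return normalised.get("x-requested-with", "").lower() == "xmlhttprequest"


def render_response(headers, content_html):
    """Serve just the fragment to Ajax callers, the full page to everyone else."""
    if is_ajax(headers):
        return content_html
    return PAGE_TEMPLATE.format(content=content_html)
```

Because both responses are built from the same content, a browser without JavaScript (or a search engine) requesting the same URL still receives a complete page.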
As well as providing an aggregation of all of the exciting music content the BBC has to offer from across our range of services, one of our intentions for the Showcase was to provide interesting onward journeys from individual clips. This is only possible by having a rich set of metadata for each item of content. From a technical point of view this involved looking for identifiers that were reusable and unique and 'tagging' our content with them.
The individual clips are tagged by editorial staff with Musicbrainz identifiers which represent the individual performing artists featured in the clip. Musicbrainz IDs are an example of web-scale identifiers. These are identifiers for things, in this case musicians, that are in common use across the Internet. Using them makes it trivial to, for example, create links to pages on the Music site (our artist page URLs follow the format https://bbc.co.uk/music/artists/) or to retrieve related information for an artist from a 3rd-party API.
We will shortly be adding a "related clips" feature. When viewing a particular clip in our larger "theatre view" we will display clips which relate to the current clip. This is achieved in two ways. Firstly, we simply search the available clips for others tagged with the same artists. Secondly, we send the Musicbrainz identifier to The Echo Nest, a music metadata provider, and use their 'similar' API to obtain a list of artists which are similar to the ones featured in the clip. If we have content for those artists we display this too. Using robust and unique Musicbrainz identifiers makes such API calls easy to implement, and will also allow us in future to integrate with other services that adopt Musicbrainz.
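The two-step lookup described above could be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the Showcase's code: the clip structure (an "id" plus a set of artist identifiers) is invented for the example, and similar_artists is a hypothetical callable standing in for the call to The Echo Nest, whose real API is not reproduced here.

```python
def related_clips(current, all_clips, similar_artists):
    """Return clips related to `current`, same-artist matches first.

    current / all_clips: dicts with an 'id' and an 'artists' set of
    Musicbrainz identifiers (assumed structure).
    similar_artists: hypothetical callable taking one identifier and
    returning an iterable of identifiers for similar artists.
    """
    others = [clip for clip in all_clips if clip["id"] != current["id"]]

    # Step 1: clips tagged with at least one of the same artists.
    same_artist = [clip for clip in others if clip["artists"] & current["artists"]]

    # Step 2: ask the external service for similar artists, then keep
    # only the clips we actually hold for those artists.
    similar_ids = set()
    for mbid in current["artists"]:
        similar_ids.update(similar_artists(mbid))
    similar_ids -= current["artists"]

    already_listed = {clip["id"] for clip in same_artist}
    similar = [
        clip
        for clip in others
        if clip["id"] not in already_listed and clip["artists"] & similar_ids
    ]
    return same_artist + similar
```

Passing the similarity lookup in as a function keeps the matching logic testable without a network call, and means a different provider could be swapped in later.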
Clips in our system are also tagged with dbpedia identifiers. This allows more general metadata to be applied to a clip, for example locations or events. While we're not currently exposing this information through the Showcase, in future it could allow further aggregations of Music content as well as interesting horizontal journeys to the rest of the BBC's on-line content.
To allow finding content by your favourite artists we have a "quickfind" box in the corner of the Showcase.
For such quickfind tools to be useful, the response times have to be fast. We use the Glow Autosuggest widget to provide the interaction. Powering the search itself we use Apache Solr, a Lucene-backed search engine that runs on Tomcat. Periodically, triggered by Quartz, we fetch the full list of clips available in the Showcase and a list of all the artist pages we publish on /music. We weight the search results for artist pages slightly to reflect the amount of content we have for each artist. When a search request is made, its parameters are proxied via a small PHP application which sorts and formats the results from Solr, and also provides a layer of caching for common search terms.
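As a rough picture of what such a proxy does, here is a small sketch. It is written in Python rather than PHP purely for illustration, and the class name, the cache lifetime, the result fields ("title", "url", "score") and the backend callable that stands in for the Solr query are all assumptions.

```python
import time


class CachingSearchProxy:
    """Sort, format and cache results from a search backend (illustrative only)."""

    def __init__(self, backend, ttl_seconds=300, clock=time.monotonic):
        self._backend = backend      # hypothetical callable: term -> list of result dicts
        self._ttl = ttl_seconds      # assumed cache lifetime, not a figure from the post
        self._clock = clock
        self._cache = {}             # normalised term -> (stored_at, formatted results)

    def search(self, term):
        # Normalise so that " Radiohead " and "radiohead" share a cache entry.
        key = term.strip().lower()
        now = self._clock()

        cached = self._cache.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]

        raw = self._backend(key)
        # Highest score first; any weighting is assumed to be in the score already.
        ordered = sorted(raw, key=lambda result: result["score"], reverse=True)
        formatted = [{"title": r["title"], "url": r["url"]} for r in ordered]

        self._cache[key] = (now, formatted)
        return formatted
```

Common search terms are then answered from memory, and only unseen or expired terms reach the search engine.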
Solr and its indexer run on four Tomcat application servers split between two data centres. One is designated the master; it is write-only and is the instance that gets periodically updated. The other three are slaves which poll the master for updates. When a slave detects an update, it updates itself.
The indexer runs on all four of the servers but will only update on the one designated as master. If the master should fail then a new server can be designated as the master and the indexer will switch to updating that one, with the remaining slaves polling this new master.
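The arrangement above can be modelled in a few lines. The sketch below is a toy simulation in Python, not Solr's replication mechanism: the Node class, the version counter and the function names are all invented to show the logic of "the indexer runs everywhere, writes only on the master, and the slaves pull".

```python
class Node:
    """One search server holding a copy of the index (toy model)."""

    def __init__(self, name):
        self.name = name
        self.version = 0      # bumped each time the index is rebuilt
        self.documents = []


def run_indexer(node, master_name, fetch_documents):
    """Run on every node; only the designated master rebuilds its index."""
    if node.name != master_name:
        return False
    node.documents = list(fetch_documents())
    node.version += 1
    return True


def poll_master(slave, master):
    """Copy the master's index if it is newer than the slave's copy."""
    if master.version > slave.version:
        slave.documents = list(master.documents)
        slave.version = master.version
        return True
    return False
```

In this model, failing over is only a matter of changing master_name: the indexer on the newly designated node starts writing, and the remaining slaves call poll_master against it instead.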
We have been extremely pleased with the performance of the system. The average request time is around 18ms on our "as-live" system.
It's great to finally launch something that we've been working on in secret, but for the development team the hard work is just starting. There are still some outstanding performance issues and other bugs to address in the Showcase itself, but on the horizon we have some exciting features planned to improve the integration with other music services, to provide richer journeys to other content around the BBC and to allow you to share the content you find with other people. One of the main things still to be tackled is full accessibility, and we are currently working on implementing full ARIA support. Let us know what you think so far, and feel free to ask questions in the comments. We'll do our best to answer them!
Chris Lowis is Software Engineer, FM&T Programmes & On Demand
...and full credit to the rest of the Music development team.