Treating Radio 4 output as data
Editor's note: BBC techies have been working with their counterparts at The Guardian and elsewhere to build new sources of data - in this case data about the media appearances of our MPs - SB.
At the end of July the Guardian held an internal hackday at their offices in King's Cross. They invited two engineers from BBC Radio's A&Mi department, Chris Lowis and David Rogers. We teamed up with Leigh Dodds and Ian Davis from Semantic Web specialists Talis to produce an 'Interactive-MP-Media-Appearance-Timeline' by mashing up data from BBC Programmes and the Guardian's website.
Before the event Talis extracted data about MPs from the Guardian's Open Platform API and converted it into a Linked Datastore. This store contains data about every British MP, the Guardian articles in which they have appeared, a photo, related links and other data. Talis also provide a SPARQL endpoint to allow searching and extraction of the data from the store.
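To give a flavour of what querying such an endpoint looks like, here is a minimal Python sketch that builds a SPARQL query for one MP and the GET URL to send it with. It is an illustration only, not the code used on the day: the endpoint URL is a placeholder, and the use of foaf:name and foaf:page is an assumption about how the store might describe an MP, since the post does not give its actual vocabulary.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the real address of the Talis store is not given here.
ENDPOINT = "https://example.org/sparql"


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for use in a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_mp_query(name: str) -> str:
    """Build a SELECT query for an MP by name.

    Assumes (hypothetically) that MPs carry a foaf:name and may link
    to related pages via foaf:page.
    """
    safe = escape_literal(name)
    return (
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
        "SELECT ?mp ?page WHERE {\n"
        f'  ?mp foaf:name "{safe}" .\n'
        "  OPTIONAL { ?mp foaf:page ?page . }\n"
        "}"
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as the 'query' parameter of a SPARQL protocol GET request."""
    return endpoint + "?" + urlencode({"query": query})
```

Calling build_request_url(ENDPOINT, build_mp_query("Jane Example")) gives a URL that could be fetched with any HTTP client. The name here is made up for the example.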
Coincidentally, the BBC Programmes data is also available as a linked datastore. By crawling this data using each MP's name as the search key, we were able to extract information about the TV and radio programmes in which a given MP had appeared. We then created a second datastore by combining these two datasets and pulling in some related data from DBpedia. Using this new datastore, we created a web application containing an embedded visualisation of the data.
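The idea of combining the two datasets can be sketched in a few lines of Python. This is a simplified stand-in, not the hackday code: it joins plain (name, title) pairs in memory rather than working with linked data, the function names are invented for the example, and the sample records are made up. Matching on a name alone is also fragile, since two people can share one.

```python
from collections import defaultdict


def normalise(name: str) -> str:
    """Lower-case a name and collapse whitespace so small variations still match."""
    return " ".join(name.lower().split())


def merge_by_name(articles, programmes):
    """Group article and programme titles under each normalised MP name.

    Both arguments are iterables of (mp_name, title) pairs.
    """
    merged = defaultdict(lambda: {"articles": [], "programmes": []})
    for name, title in articles:
        merged[normalise(name)]["articles"].append(title)
    for name, title in programmes:
        merged[normalise(name)]["programmes"].append(title)
    return dict(merged)


# Made-up sample records, purely for illustration.
articles = [("Jane Example", "Article A"), ("John Sample", "Article B")]
programmes = [("jane  example", "Programme X")]

appearances = merge_by_name(articles, programmes)
```

Here appearances["jane example"] holds both the article and the programme, because the two spellings normalise to the same key.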
Continue to read this post and leave comments on the BBC Internet blog, where it originally appeared.