BBC Backstage SPARQL Endpoint

Ian Forrester | 15:13 UK time, Wednesday, 10 June 2009

The Linked Open Data approach to nurturing a next-generation Web has been getting a lot of attention recently. At the BBC, we've been involved in this approach for the past year and a half or so, and it looks very promising indeed.

To explain Linked Open Data, Sir Tim Berners-Lee wrote:

"It is about making links [between datasets], so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related, data."

Over the last few months we've continued to expand the amount of Linked Data that we publish on the BBC /programmes and /music sites, providing additional detail about episodes of radio and TV programmes and more links between the data exposed from each site. So, for example, we're now exposing segment data for Radio 2 & 6 Music programmes that links the artists in each segment to the relevant data on the /music website.

This provides some really nice ways to navigate and mash up the two websites. But we've also been wondering: what else could we do? What if there was a way not only to retrieve the data that underlies each page on the website, but also to run queries across the whole dataset? This would provide a way to do even more with the data, allowing it to be sliced, diced, queried and analysed in all kinds of new ways.

With this in mind, we've asked two companies who specialise in Linked Data technology (OpenLink Software & Talis) to start regularly crawling the BBC /programmes and /music websites to harvest all of the data and load it into their semantic web platforms. Both platforms allow you to search and query the BBC data in a number of different ways, including SPARQL -- the standard query language for semantic web data. If you're not familiar with SPARQL, the Talis folk have published a tutorial that uses some NASA data.
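For readers new to SPARQL, here is a minimal sketch of what composing a query might look like from Python. It is an illustration only: the `po:` prefix is the namespace of the BBC Programmes Ontology, but the assumption that episodes in these stores are typed as `po:Episode` and titled with `dc:title` has not been checked against the hosted data, and `episode_titles_query` is a hypothetical helper name.

```python
# Prefixes used by the query. The po: namespace is the BBC Programmes Ontology;
# that the hosted data uses po:Episode and dc:title is an assumption.
PREFIXES = (
    "PREFIX po: <http://purl.org/ontology/po/>\n"
    "PREFIX dc: <http://purl.org/dc/elements/1.1/>\n"
)


def episode_titles_query(limit: int = 10) -> str:
    """Return a SPARQL SELECT query listing episodes and their titles."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        PREFIXES
        + "SELECT ?episode ?title WHERE {\n"
        + "  ?episode a po:Episode ;\n"
        + "           dc:title ?title .\n"
        + "}\n"
        + f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    print(episode_titles_query(5))
```

The `LIMIT` clause keeps the result set small while you explore; with hundreds of thousands of episodes in the stores, an unbounded query could return a lot of data.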

Talis & OpenLink are regularly crawling and updating the data, and we're working with them on ways to make sure it stays as up to date as possible, but for now expect it to lag a little behind the live data on our sites. But these triplestores already contain metadata for over 300,000 radio and TV episodes, over 6000 series, more than 4000 album reviews, and additional data about thousands of music artists and albums. All of the BBC subject categories and programme genres are also included, so there are plenty of ways to query and slice up the data whether you're interested in a particular type of programme, channel, artist, or person. Where our data links to DBpedia, we can include some additional context -- so for example, all of the music artist information can be queried from one source. And, as we add more data to the /programmes and /music sites, this will all get added.

The Talis Platform

The combined /programmes and /music data is in a store called "bbc-backstage", whose API is available from: http://api.talis.com/stores/bbc-backstage. The Talis developers have already put together a few example queries and demos that query the dataset. These show how to query the data using AJAX, e.g., fetching lists of music reviewers and their reviews, or analysing relationships between categories of TV programmes.
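Whichever client you use, a SPARQL SELECT query can return its answer in the standard SPARQL JSON results format. The sketch below shows one way to flatten such a response into plain rows. The sample document, including its variable names and URIs, is made up for illustration and is not real output from the store.

```python
import json


def rows_from_sparql_json(text: str) -> list[dict[str, str]]:
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    doc = json.loads(text)
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # A variable that is unbound in a solution is simply absent from it.
        rows.append({v: binding[v]["value"] for v in variables if v in binding})
    return rows


# Hypothetical response; the names and URIs are invented for this example.
SAMPLE = json.dumps({
    "head": {"vars": ["reviewer", "name"]},
    "results": {"bindings": [
        {"reviewer": {"type": "uri", "value": "http://example.org/reviewer/1"},
         "name": {"type": "literal", "value": "A. Reviewer"}},
        {"reviewer": {"type": "uri", "value": "http://example.org/reviewer/2"}},
    ]},
})

if __name__ == "__main__":
    for row in rows_from_sparql_json(SAMPLE):
        print(row)
```

Dropping the `type` and datatype information keeps the example short; a fuller client would probably want to keep it so that URIs and literals can be told apart.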

The OpenLink Virtuoso Platform

The Virtuoso-hosted data can be found and queried via http://bbc.openlinksw.com/sparql. In addition, the OpenLink-provided Linked Data space offers a faceted browser interface, engine, and REST API, alongside a collection of sample queries.
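As a rough sketch of how a script might talk to this endpoint, the standard SPARQL protocol passes the query text in a `query` URL parameter and uses the `Accept` header to ask for a result format. The helper name `build_sparql_request` is hypothetical, and the example only builds the request rather than sending it, since we can't promise here how the endpoint will respond.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint URL as given in this post.
ENDPOINT = "http://bbc.openlinksw.com/sparql"


def build_sparql_request(query: str, endpoint: str = ENDPOINT) -> Request:
    """Build (but do not send) a SPARQL protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


if __name__ == "__main__":
    req = build_sparql_request("SELECT * WHERE { ?s ?p ?o } LIMIT 1")
    print(req.get_method(), req.full_url)
```

To actually run the query you would pass the request to `urllib.request.urlopen` and read the response body.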

A richer BBC data API, based on Linked Data

This is a trial project that we're running for six months to explore what the Backstage community can do with BBC data when it's exposed through a richer API than we've been able to provide thus far. We're excited to see what you create and to hear your feedback, so that we can learn what works and what doesn't, and make changes. So please do keep us up to date through the BBC Backstage email list.

Enjoy!
