
Prototyping Weeknotes #96


Andrew Nicolaou | 14:32 UK time, Monday, 20 February 2012

This week's featured project is ABC-IP, a two-year collaborative research project, part-funded by the Technology Strategy Board, examining ways to automatically link together different sources of metadata around large video and audio collections.

One particularly interesting dataset we have access to is the entire World Service audio archive. This spans several decades, and consists of about 26,000 hours (three years in total) of audio content.

This World Service archive has traditionally been very different from other programme datasets at the BBC. The only overlap is with BBC Programmes, for programmes broadcast since May 2008.

The data available is quite patchy: for example, a number of programmes claim to have been first broadcast in the future (e.g. 2099), or before the start of the BBC (e.g. 1900). However, the audio itself is high quality, and the content is the usual excellent World Service mix of education, information and entertainment.
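As a rough illustration of the kind of sanity check this patchy metadata calls for, here is a minimal Python sketch that flags implausible first-broadcast dates. The function name and the sample records are hypothetical, not taken from our pipeline; the lower bound assumes the BBC's founding date of 18 October 1922.

```python
from datetime import date

# Assumption: use the founding of the British Broadcasting Company as the
# earliest plausible broadcast date.
BBC_FOUNDED = date(1922, 10, 18)


def is_plausible_broadcast_date(first_broadcast, today=None):
    """Return True if the date falls between the BBC's founding and today."""
    today = today or date.today()
    return BBC_FOUNDED <= first_broadcast <= today


# Hypothetical sample records: (programme title, claimed first broadcast).
records = [
    ("Programme A", date(1985, 3, 14)),
    ("Programme B", date(2099, 1, 1)),   # claims to be broadcast in the future
    ("Programme C", date(1900, 1, 1)),   # claims to predate the BBC
]

suspect = [
    title
    for title, first_broadcast in records
    if not is_plausible_broadcast_date(first_broadcast, today=date(2012, 2, 20))
]
```

Here `suspect` collects the titles whose dates need a second look, without discarding the programmes themselves.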

We worked on making this archive searchable, and on linking it up with other datasets at the BBC and outside, by analysing the content of the programmes and automatically classifying them with DBpedia URIs.

Kiwi-API web interface

First, we developed an algorithm allowing us to do that with reasonable accuracy. We're working on releasing a Python implementation of it on GitHub, which will be described in further detail on this blog.

The next stage was to apply this algorithm to the whole World Service archive. We developed an API to manage and distribute the processing across a large number of Amazon EC2 instances, and successfully used it for automatically tagging around 27,000 programmes in about a week, for a predictable cost.
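To give a flavour of the fan-out pattern, here is a minimal Python sketch that spreads tagging jobs across a pool of workers. It is not our API: a local thread pool stands in for the EC2 instances, and `tag_programme`, `tag_archive` and the placeholder DBpedia URI are all hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def tag_programme(programme_id):
    """Hypothetical stand-in for the real tagger.

    Returns the programme id with a list of DBpedia URIs. The URI below is
    a fixed placeholder, not an actual classification result.
    """
    return programme_id, ["http://dbpedia.org/resource/BBC_World_Service"]


def tag_archive(programme_ids, workers=4):
    """Distribute tagging jobs across a pool of workers and collect the tags."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order and yields (id, tags) pairs.
        return dict(pool.map(tag_programme, programme_ids))


tags = tag_archive(["prog-1", "prog-2", "prog-3"], workers=2)
```

Because each programme is tagged independently, the work splits cleanly: adding workers shortens the wall-clock time without changing the results.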

We'll present this work, and some applications of the resulting tags (like the TellyTopic prototype our partner MetaBroadcast blogged about), at WWW'2012 in Lyon, next April.

The TellyTopic prototype by MetaBroadcast

Meanwhile, we've started to investigate how to present very large, partially described archives, looking at the design challenges for tag-based navigation and retrieval. We're very interested in how we can engage the audience to improve this data.

In other project news:


The diary study has started and whilst we're collecting data, Joanne and Penny are planning a questionnaire and materials for the lab study. Andrew's been building a dashboard showing the users' programme data using Knockout and d3.

Chris N. has been working with Barbara to discuss and define technical enablers for the project, and to start thinking about the scope of the prototype we'll need to demo at the end of the first phase of the project. This week has been filled with project deadlines, with a deliverable about large-scale testing of use cases. Preparations are underway to welcome all the project partners to London for the next pan-European project meeting, which we're hosting. Akua has been sorting out the important logistical work of getting 30 people here and working out where to put them.

EBU Radio Week

Chris L., Dan, George, Libby and Sean were in Geneva for the EBU Radio Week Summit and RadioHack event. George gave an overview of our recent work on a RadioTAG trial to the assembled Radio Summit delegates (think business attire) and Chris L. gave a talk about RadioTAG aimed at developers, while everyone saw a lot of interesting presentations. Of particular interest were the developments in the RadioEPG specification. It brings programme information to radios, allowing people to listen to their favourite stations even if they have to switch between IP and broadcast streams.

Dan and Chris also enjoyed seeing how open-source software was being used to make community radio affordable in Denmark and France. The open-source DAB transmission stack allows Kanal Plus in Copenhagen to broadcast on behalf of 43 local stations for a hardware outlay of about €500 each.

Libby enjoyed this radio encased in Lego. The cheap DAB radio inside is enabling educational projects. Find more photos from the event on the RadioHack Facebook page.

W3C Audio Working Group

Olivier spent a chunk of the week on W3C Audio Working Group business, especially on editing the document for Use Cases and Requirements and then matching up use cases and requirements.

Meanwhile, Matt joined the working group so he spent some time finding his way around the Chrome and Mozilla audio APIs whilst thinking about use cases and demonstrators.

Interesting links

Finally, here's a round-up of interesting links from the team.


