The BBC has a vast catalogue of old TV and radio programmes that it would like to make searchable. COMMA is a platform for processing these cheaply and at scale.
What we've done
COMMA was a two-year project funded by the Technology Strategy Board (TSB) to develop a prototype platform for extracting metadata from media archives. The project was completed in 2015 and the platform is now in use by the BBC.
It was funded as part of the TSB's Collaborative R&D initiative, which aims to encourage innovation in the digital economy by funding partnerships between the public sector, business and academia. Our commercial partners in the project were Somethin' Else, a London-based content design and creation company, and Kite, an internet development consultancy.
Why it matters
There are many cultural institutions, commercial archives and content creators who have audio, film, photos and video that they would like to put to new uses.
The first step is usually digitisation, but the danger with a big digitisation project is that you simply swap an under-used physical archive for an equally under-used digital one. Without easy ways to navigate the data, there's no way for your users to get to the bits they want.
Luckily, help is at hand in the shape of technologies like speech-to-text, face recognition, speaker recognition and a host of other metadata extraction algorithms. These can help unlock the value in media collections by making specific bits within the video or audio instantly findable.
COMMA is a platform that can help process content through algorithms like these cheaply and at scale. It is easy to use, fault-tolerant and flexible. It is currently being used in-house by the BBC for a variety of metadata processing tasks.
The project is now complete. However, we're interested in the needs of other public sector bodies or commercial companies who think they could benefit from this platform.
If you have any press or business-related questions about the project, feel free to contact the COMMA team.
This project is part of the Internet Research and Future Services section
This project is part of the ABC-IP work stream
This project is part of the Content Analysis Toolkit work stream