Posted by Chrissy Pocock-Nugent on

Latest sprint notes from the IRFS team: Face Detection, Talking with Machines, Atomised Media and more!


Chris continued his work on the Starfruit tagging suggestion tools, and explored the performance of models combining data from BBC News and Sport. He also started looking at several approaches to incremental training. The News tagging team have given feedback that they’re impressed with Starfruit and want to trial it.

Editorial Algorithms

Fionntán deployed his new Salience API, which takes in a creative work's text and entities and predicts each entity's importance to the creative work. This required researching and testing modern frameworks for deploying a machine learning classifier into production. In the end, he used AWS Lambda, which handles issues such as scaling and provisioning automatically and cheaply.
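As a rough illustration of the shape such a service could take, here is a minimal sketch of a Lambda handler that accepts text and entities and returns a score per entity. It is not the Salience API itself: the request fields (`text`, `entities`), the response layout and the `score_salience` function are all assumptions, and the scoring is a toy mention-share heuristic standing in for the trained classifier.

```python
import json


def score_salience(text, entities):
    """Toy stand-in for a trained classifier (an assumption, not the real model):
    each entity's score is its share of all entity mentions in the text."""
    lowered = text.lower()
    counts = {e: lowered.count(e.lower()) for e in entities}
    total = sum(counts.values())
    return {e: (counts[e] / total if total else 0.0) for e in entities}


def lambda_handler(event, context):
    """AWS Lambda entry point, assuming an API Gateway proxy-style event
    whose JSON body carries hypothetical 'text' and 'entities' fields."""
    body = json.loads(event.get("body") or "{}")
    text = body.get("text", "")
    entities = body.get("entities", [])
    scores = score_salience(text, entities)
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"salience": scores}),
    }
```

Because the handler is a plain function, it can be exercised locally by passing it a dictionary with a JSON `body` string before anything is deployed.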

David refined the designs for the stream builder UI, designing a more compact and information-rich view, based on feedback from colleagues in BBC Monitoring.

Kate continued her research on how BBC journalists create live streams for events, sports and local news, and produced a detailed audit of how content is aggregated in several such streams.

Manish worked on the deployment of the new stream builder app into the current production setup.

Salience/Mango Integration

Chris Newell worked with Fionntán to integrate his salience function into the Mango Semantic Tagging system. This will hopefully improve the accuracy of the relevance score, which is used for ordering and filtering the results.

Atomised Media

Tim attended a workshop run by Jon Tutcher in the North Lab, where he discussed the types of metadata needed to produce and deliver object-based / atomised media. Barbara, Joanne and Tim have been analysing the transcripts from the Newsbeat Explains user study and pulling out the main findings: generally, feedback has been positive. To get a fuller picture of how the prototype performed, Barbara has also prepared interview questions to be shared with the Newsbeat journalists. The aim is to gather editorial feedback on the pros, cons and challenges of using our format to create news stories.

Talking with Machines

The team have been working on the spoken radio player, focusing on getting it stable and tidy for a beta release. This included setting up a list of common misheard programme names, which increases the reliability of voice requests for shows. Preparation also began for a short design project with Children’s; we’re going to be looking at use cases for voice interfaces for children and families.
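A misheard-names list like the one described above can be pictured as a lookup from what the speech recogniser returned to the programme the listener most likely meant. The sketch below is only an illustration: the misheard variants in the table, the `resolve_programme` helper and the fuzzy-match fallback with its cutoff are all assumptions, not the team's implementation.

```python
import difflib

# Hypothetical entries: recognised phrase (lower case) -> canonical programme name.
MISHEARD = {
    "the archers": "The Archers",
    "the arches": "The Archers",
    "desert island discs": "Desert Island Discs",
    "desert island disks": "Desert Island Discs",
}


def resolve_programme(heard, cutoff=0.8):
    """Return the canonical programme name for a recognised phrase, or None.

    Tries an exact lookup first, then falls back to the closest known
    phrase if its similarity ratio is at least `cutoff`.
    """
    key = " ".join(heard.lower().split())  # normalise case and whitespace
    if key in MISHEARD:
        return MISHEARD[key]
    close = difflib.get_close_matches(key, list(MISHEARD), n=1, cutoff=cutoff)
    return MISHEARD[close[0]] if close else None
```

For example, `resolve_programme("The Arches")` resolves to "The Archers", while a phrase with no close match returns `None` so the player can ask the listener to repeat the request.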

Face Detection and Recognition

Jana has been working on face recognition, investigating how many images per person are needed to reach an acceptable level of precision and recall. She found that with 500 training images per person she was able to get a 99% precision rate. Ben has used this tool to create a face recognition model to identify BBC Breakfast presenters. He then created a demo of this, which he integrated into the CAT frontend. Ben also worked on integrating two ground truth tools, pybossa and vatic, to make it easier to generate face detection data.
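For readers less familiar with the two measures: for a given person, precision is the share of faces labelled as that person that really are them, and recall is the share of that person's faces the model found. A minimal sketch of the calculation follows; the function name and the label values are made up for illustration and are not taken from the team's tooling.

```python
def precision_recall(true_labels, predicted_labels, person):
    """Per-person precision and recall from parallel lists of labels."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if p == person and t == person)  # correct hits
    fp = sum(1 for t, p in pairs if p == person and t != person)  # wrong hits
    fn = sum(1 for t, p in pairs if p != person and t == person)  # misses
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels for five detected faces.
truth = ["presenter_a", "presenter_a", "presenter_b", "presenter_b", "presenter_a"]
predicted = ["presenter_a", "presenter_b", "presenter_b", "presenter_a", "presenter_a"]
precision, recall = precision_recall(truth, predicted, "presenter_a")
```

Repeating this calculation for models trained on different numbers of images per person is one way to see where extra training data stops improving the scores.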

Denise continued her research into real-time face detection. She has been testing the Dlib and OpenCV face detection algorithms for performance and accuracy.

Speech to Text

Research continued on building a speech-to-text system using TensorFlow; a custom dictionary and a language model were integrated into the system. Ben investigated near-live speech-to-text using kaldi gstreamer and was able to integrate our own custom Kaldi model.


The ability to read and write files to IP Studio was added to COMMA this sprint. More machines were provisioned as COMMA workers, and Forge client certificates were added to the workers so they can talk to servers in Cosmos.

Scene Detection

Craig built a test harness to assess open source shot detectors. He was able to extract average colour information from shots and calculate the performance of each tool for different types of source material, e.g. sport or drama.
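To give a flavour of what such a harness has to compute, here is a small sketch with two helpers: one averaging the colour of a shot, one scoring a detector's cuts against hand-marked ones. Everything in it is an assumption made for illustration (the function names, frames represented as lists of RGB tuples, the two-frame tolerance and the greedy matching); it is not Craig's harness.

```python
def average_colour(frames):
    """Mean (R, G, B) over every pixel in a shot.

    `frames` is a list of frames, each a list of (r, g, b) tuples.
    """
    totals = [0, 0, 0]
    count = 0
    for frame in frames:
        for pixel in frame:
            for i in range(3):
                totals[i] += pixel[i]
            count += 1
    return tuple(t / count for t in totals) if count else (0.0, 0.0, 0.0)


def score_boundaries(detected, truth, tolerance=2):
    """Precision and recall of detected cut positions (frame numbers).

    Greedily pairs each ground-truth cut with at most one unused detected
    cut lying within `tolerance` frames of it.
    """
    unmatched = sorted(detected)
    hits = 0
    for cut in sorted(truth):
        match = next((d for d in unmatched if abs(d - cut) <= tolerance), None)
        if match is not None:
            unmatched.remove(match)
            hits += 1
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall
```

Running `score_boundaries` separately on clips grouped by genre would give per-genre figures of the kind described above.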


Chris continued work on the W3C TV Control API, where the Working Group is changing the structure of the API to better support multiple streams, while allowing implementations to manage resources and add user permissions where required. This should also lead to an API that is simpler for application developers. He has also contributed to the evaluation of the discovery and communication protocols needed to support the Presentation API and Remote Playback API.

Together with this year's group of R&D trainees, Chris and Tim attended the next lecture in the series on DSP, which covered sampling theory.

Lastly, we would like to say goodbye to Thomas P, who left at the end of the year, and welcome Josh, who has joined as a software engineer working with the Discovery Team. We also welcome back Sean from secondment at the EBU.

Interesting Links