Posted by Mike Armstrong
The launch of the Story of Life app marks the first public outing for a piece of BBC R&D technology that aims to automatically recover subtitles for video clips taken from television programmes. Around 850 of the clips in the Story of Life feature subtitles that have been recovered from our archive of television subtitles and repurposed to accompany the clips without human intervention. The remaining 200 or so clips have been subtitled manually, as their subtitles were either not available in the archive or did not match the clips well enough.
Making the app accessible was a key aim for the production team, both to make the app enjoyable for the widest possible audience and to comply with legislative requirements around the world. However, subtitling more than 1,000 video clips from scratch would have been challenging for a team working on a tight budget. So the team were put in touch with me, as I already had some working software: a proof-of-concept demo that could match subtitles to video clips on the BBC's /programmes website. This work was published as a paper which I presented at IBC2016. For me this was an opportunity to collaborate with a production team and prove the value of my research on a public-facing product.
The subtitle recovery process combines speech-to-text technology with some fairly straightforward natural language processing and a bit of data science. The Story of Life team supplied a set of over 1,000 video files and a spreadsheet containing metadata about the clips. The audio was extracted from each clip and passed through a speech-to-text engine to create a transcript. This transcript was then turned into a series of search strings that were combined with the clip metadata to locate the subtitle file for the television programme that best matched the clip. Further processing then worked out whereabouts in the programme the clip came from and extracted the subtitles for that section. A final process checked whether the match was sufficiently accurate and, if so, created a subtitle file for the clip.
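The matching stage described above could be sketched roughly as follows. This is purely an illustration of the idea, assuming a simple approach of turning the transcript into overlapping word windows and scoring candidate subtitle files by how many windows they contain; the function names, window size and accuracy threshold are my own, not the production code's.

```python
# Illustrative sketch of matching an ASR transcript to candidate subtitle files.
# All names and thresholds here are hypothetical, not the production system's.

def transcript_to_search_strings(transcript, window=5):
    """Split a transcript into overlapping word windows to use as search strings."""
    words = transcript.lower().split()
    return [" ".join(words[i:i + window])
            for i in range(max(1, len(words) - window + 1))]

def score_subtitle_file(search_strings, subtitle_text):
    """Fraction of search strings found verbatim in a candidate subtitle file."""
    text = subtitle_text.lower()
    hits = sum(1 for s in search_strings if s in text)
    return hits / len(search_strings)

def best_match(transcript, candidates, threshold=0.5):
    """Pick the candidate subtitle file that best matches the clip transcript,
    or None if nothing clears the accuracy threshold (the final check)."""
    queries = transcript_to_search_strings(transcript)
    scored = [(score_subtitle_file(queries, text), name)
              for name, text in candidates.items()]
    score, name = max(scored)
    return name if score >= threshold else None
```

A real system would also need the alignment step (finding whereabouts in the programme the matched subtitles sit), which this sketch leaves out.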
The task threw up some interesting challenges. Because the clip metadata had been entered manually, it contained a number of minor errors and inconsistencies. This led to a quick lesson from colleagues in calculating the Levenshtein distance between strings to find the closest match. It also became clear that I was trying to match subtitle data for the UK versions of the programmes to clips taken from the international versions. Some of the international versions had been edited to make room for advertising, leading to mismatches; in these cases the subtitles needed to be edited manually. In other cases the style of the subtitles may seem unfamiliar: three-line subtitles, for example, were once commonplace on UK TV but have now largely fallen out of use. Also, in a few cases programmes were re-voiced, for example to replace imperial units with metric, so the subtitles in some clips may differ by just a couple of words whilst conveying the same information.
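The Levenshtein trick for coping with hand-entered metadata is easy to illustrate. The sketch below, under my own naming assumptions, computes the edit distance (insertions, deletions and substitutions) between two strings and uses it to pick the archive record closest to a slightly mistyped title:

```python
# Minimal Levenshtein distance, as might be used to match noisy,
# hand-entered programme titles against archive records (illustrative sketch).

def levenshtein(a, b):
    """Edit distance between strings a and b, via the classic row-wise DP."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,        # deletion
                            curr[j - 1] + 1,    # insertion
                            prev[j - 1] + cost))  # substitution / match
        prev = curr
    return prev[-1]

def closest_title(query, titles):
    """Return the archive title with the smallest edit distance to the query."""
    return min(titles, key=lambda t: levenshtein(query.lower(), t.lower()))
```

So a misspelt spreadsheet entry like "Planet Erth II" still resolves to the intended archive record, because one insertion is a smaller distance than any competing title.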
The challenge of recovering subtitle files for the Story of Life app has been valuable in proving the effectiveness of the technique and has given it its first public exposure. Over 80% of the clips were subtitled without human intervention, and partial files were recovered that could be used with some editing. The research has also benefited from the challenge of working with manually entered metadata, and the lessons learnt will inform future work. It has also been a pleasure to work with an enthusiastic production team which has provided support and encouragement for this work.