A research project to measure and improve subtitle quality
This is a recent body of work reviewing the problems of subtitling and how they can be overcome across all our platforms.
What we're doing
We are examining the issues which impact on the quality and availability of subtitles for our audience across all our platforms.
We first looked into ways in which we might use language models for individual programme topics to improve the performance of speech to text engines and to detect errors in existing subtitles. We have had some early success modelling weather forecast subtitles which suggests there may be some value in this approach, but it would appear that other topics will be less successful. See White Paper WHP 256: 'Candidate Techniques for Improving Live Subtitle Quality' for more details.
At the request of our Technology, Distribution and Archives, Solution Design team we carried out a ground-breaking study into the relative impact of subtitle delay and subtitle accuracy. This work required the development of new test methodologies based on industry standards for measuring audio quality. A user study was carried out in December 2012 with a broad sample of people who regularly use subtitles when watching television. The results were presented at IBC2013 in September and are available as White Paper WHP 259: 'The Development of a Methodology to Evaluate the Perceived Quality of Live TV Subtitles'.
BBC Audiences have run some surveys for us to provide background data on how widely subtitles are used, how people use them and what issues they encounter. More recently we have started to examine the iPlayer statistics on subtitle use, as they have the potential to give us insight into the use of subtitles on a programme-by-programme basis. We have also started building an automatic subtitle monitoring tool to allow us to track long-term trends in issues that we can measure, such as position and reading rate, as originally outlined in White Paper WHP 255: 'Measurement of Subtitle Quality: an R&D Perspective'.
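As an illustration of the kind of measurement such a monitoring tool could make, reading rate can be estimated from a subtitle's text and its display time. The following is a minimal sketch in Python, not the tool itself: the Cue structure, the function names and the 180 words-per-minute limit are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    """One subtitle as displayed on screen (hypothetical structure)."""
    start: float  # time the subtitle appears, in seconds
    end: float    # time the subtitle is removed, in seconds
    text: str


def words_per_minute(cue: Cue) -> float:
    """Estimate reading rate as words shown per minute of display time."""
    duration = cue.end - cue.start
    if duration <= 0:
        raise ValueError("cue must have a positive duration")
    return len(cue.text.split()) * 60.0 / duration


def flag_fast_cues(cues: List[Cue], limit_wpm: float = 180.0) -> List[Cue]:
    """Return the cues whose reading rate exceeds limit_wpm.

    The default limit is an illustrative assumption, not a published guideline.
    """
    return [cue for cue in cues if words_per_minute(cue) > limit_wpm]
```

Logging these values for every subtitle over weeks or months would give the long-term trend data described above. Counting whitespace-separated words is a simplification; a real tool might count characters or syllables instead.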
We have done some preliminary work on subtitle placement, looking at how to make the experience more immersive by placing the subtitles closer to the speaker and how to avoid placing subtitles over important parts of the scene. We published some early thoughts on this work in a recent paper at TVX2014 called 'Enhancing Subtitles'. A further paper presented at TVX2014 looked at the impact of using rapid serial visual presentation for subtitles.
We have explored ways to take the live broadcast subtitles and carry out automatic post-processing to remove the original delay and improve the formatting. Early results were promising and we have talked to the iPlayer team about the potential for this work. We have also looked at how live subtitles could be realigned and reformatted during streaming.
Over the past year we have developed a way of matching up video clips on our web pages to the same piece of video in our broadcast archive in order to locate matching subtitles for the web video. Our prototype has focused on the News web pages, where it is able to find matches for around 40% of the video clips. We have applied for a patent for our technique and have written it up as a paper for this year's NAB conference.
During December 2014 we ran a set of user research experiments in collaboration with Mike Crabb, on placement from the University of Dundee. This research looked at how subtitles could be presented with a video clip on a web page, at adjustment of subtitle size, and at follow-up work on dynamic subtitles. This work is in the process of being written up and three papers have already been submitted for a forthcoming conference. At least two more papers are planned from this work and it will form part of a paper proposed for IBC2015.
Over the coming months we are planning a further set of user research, this time looking at the issue of reading rate and the display of subtitles on devices like tablets and mobile phones.
Why it matters
At least 7 million people use subtitles regularly, most of them for reasons other than hearing difficulties, so the audience for subtitles is large. Whilst the quality of subtitling for pre-recorded programmes is very good, subtitling for live programmes faces problems of accuracy and delay.
The delay in the arrival of the subtitles is a particular problem for people in our audience with hearing difficulties, as they are watching with the sound on and using the text to supplement their understanding. These viewers will often turn the subtitles off when they arrive late, because late subtitles are too confusing.
For people watching without sound the delay is less of a problem, but because the subtitles are their only source of information, accuracy matters most.
We are aiming to contribute to improvements to subtitling quality and availability, for broadcast, on demand, streamed and web content over the coming years.
How it works
We are using speech recognition and language modelling tools to look at processes that can be used to realign and reformat subtitles for later streaming. We are also carrying out user research to measure the impact of various issues on the perceived quality of subtitles.
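To give a flavour of how realignment might work, the sketch below matches the words in delayed live subtitles against word timings from a speech recogniser and moves each subtitle to the time at which its first matched word was spoken. This is an illustrative assumption rather than a description of our actual system: the realign function, the data layout and the use of Python's difflib for the word matching are all chosen for the example, and the recogniser output is supplied as a plain list.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def realign(cues: List[Tuple[float, str]],
            asr_words: List[Tuple[float, str]]) -> List[Tuple[float, str]]:
    """Re-time subtitles using speech recogniser word timings.

    cues:      (late_start_seconds, text) for each live subtitle, in order.
    asr_words: (spoken_time_seconds, word) from a recogniser (hypothetical input).
    Returns (new_start_seconds, text); a subtitle with no matched words
    keeps its original start time.
    """
    # Flatten the subtitles into one word sequence, remembering the source cue.
    sub_tokens = []
    for index, (_, text) in enumerate(cues):
        for word in text.lower().split():
            sub_tokens.append((index, word))

    sub_seq = [word for _, word in sub_tokens]
    asr_seq = [word.lower() for _, word in asr_words]

    # Find runs of words that appear in both sequences in the same order.
    matcher = SequenceMatcher(None, sub_seq, asr_seq, autojunk=False)

    new_start = {}
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            cue_index = sub_tokens[block.a + offset][0]
            spoken_at = asr_words[block.b + offset][0]
            # Blocks arrive in order, so the first hit is the earliest word.
            new_start.setdefault(cue_index, spoken_at)

    return [(new_start.get(i, start), text)
            for i, (start, text) in enumerate(cues)]
```

Because the matching tolerates gaps, a subtitle can still be re-timed when the recogniser mishears some of its words, as long as at least one word in it is matched.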
We demonstrated some of our work at IBC2013, held from 12th to 17th September in Amsterdam.
An R&D White Paper based on the IBC paper is now online.