In R&D's Broadcast and Connected Systems team we've done a lot of research into how the BBC can best deliver audio and video content to viewers via the Internet, and in particular how we might do this using adaptive bitrate technology. But, of course, audio and video aren't the only things that we need to deliver to viewers - we also need to deliver subtitles!
Over the past few months, we've been investigating whether a new subtitling format called EBU-TT-D can be used to deliver subtitles within an MPEG DASH stream. The result of these investigations is the first working end-to-end demonstration of EBU-TT-D subtitles being delivered via MPEG DASH and displayed by a client.
Our preferred technology for delivering BBC programmes via the Internet in the future is MPEG DASH, so what we ideally need is a way to deliver subtitles for live and pre-recorded programmes in the same MPEG DASH streams that are being used to deliver their video and audio. Step forward EBU-TT-D...
EBU-TT-D is a new subtitling format defined by the European Broadcasting Union (EBU) that builds upon the W3C Timed Text Markup Language (TTML). The EBU has actually defined two new subtitle formats: EBU-TT and EBU-TT-D.
EBU-TT is a format designed primarily for exchanging subtitles between different parts of the production chain and for archiving them. It allows broadcasters to include within subtitle files extra information that would be helpful in an archival or production environment (e.g., programme and episode titles), but which wouldn't be of interest to the clients (TVs, set-top boxes, tablets, etc.) that would actually be showing the subtitles to their intended audience.
EBU-TT-D, on the other hand, is a format specially tailored for distributing subtitles to the clients that will display them. It has been designed specifically to work with IP distribution technologies like MPEG DASH, and, thanks to its basis in W3C TTML, it’s well suited for distributing subtitles via the Web in general. EBU-TT-D subtitle files, though structurally similar to EBU-TT files, include only the information that clients would need to correctly display the subtitles; they are also designed to be simpler to process than EBU-TT files. As you might expect, the EBU designed these new formats in such a way that it’s straightforward to convert an EBU-TT file created by a production system into an EBU-TT-D file ready for distribution.
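To give a flavour of the format, here is a much-simplified sketch of what an EBU-TT-D document looks like. All of the timings, styles and region values below are illustrative, and a real document carries further attributes that we've omitted for brevity:

```xml
<tt:tt xmlns:tt="http://www.w3.org/ns/ttml"
       xmlns:ttp="http://www.w3.org/ns/ttml#parameter"
       xmlns:tts="http://www.w3.org/ns/ttml#styling"
       ttp:timeBase="media" xml:lang="en">
  <tt:head>
    <tt:styling>
      <tt:style xml:id="s1" tts:color="#ffffff" tts:backgroundColor="#000000"/>
    </tt:styling>
    <tt:layout>
      <tt:region xml:id="bottom" tts:origin="10% 80%" tts:extent="80% 15%"/>
    </tt:layout>
  </tt:head>
  <tt:body>
    <tt:div>
      <tt:p xml:id="p1" region="bottom" style="s1"
            begin="00:00:05.000" end="00:00:08.000">At the left we can see...</tt:p>
    </tt:div>
  </tt:body>
</tt:tt>
```

Each `tt:p` element is one subtitle cue, with its display interval given by the `begin` and `end` attributes, its on-screen placement given by a reference to a region, and its appearance given by a reference to a style.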
Version 1.0 of the EBU-TT-D specification was released in January 2014, and the use of EBU-TT-D for delivering subtitles is included in both the DVB DASH profile and the newly released HbbTV 2.0 specification.
What we needed to demonstrate was that we can indeed use EBU-TT-D to deliver subtitles alongside audio and video in an MPEG DASH stream, and that it's possible to build a client that can correctly receive and display those subtitles. That's exactly what we've been doing over the last few months.
EBU-TT-D test stream
First, we needed to create some test content: an MPEG DASH stream containing audio, video and EBU-TT-D subtitles. For this, we decided to use the open source short film Elephants Dream, as it's rights-free and there were already subtitles available for it. These pre-existing subtitles were stored in a single file containing all of the subtitles for the entirety of the film. As this file wasn't already in EBU-TT-D format, our first task was to translate it into EBU-TT-D.
Once we had an EBU-TT-D file for the whole of Elephants Dream, we had to split it into a number of separate subtitle files, each containing the subtitles for a different section of the film. The reason for this is that in MPEG DASH the components of a programme (audio, video and subtitles) are typically delivered as a sequence of individual segments, each containing a few seconds of audio, video or subtitles. This segmented model makes it possible for DASH clients to seamlessly switch mid-stream between different quality levels depending on network conditions, and it also makes it possible to deliver live programmes, by progressively making new audio/video/subtitle segments available as they're encoded in real time. For subtitles, we're not really interested in the ability to switch between quality levels; we are, however, very interested in being able to provide subtitles for live programmes that are delivered using MPEG DASH.
For our test stream, we decided to deliver the subtitles in 10-second segments, so we wrote a tool that takes an EBU-TT-D subtitle file for a whole programme and splits it into separate EBU-TT-D subtitle files of a given duration. Once we had used this tool to chunk up the master Elephants Dream subtitle file into 10-second segments, we modified one of our existing tools to package these individual EBU-TT-D subtitle files into the ISOBMFF format used by MPEG DASH.
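The core of the splitting step is deciding which segment (or segments) each subtitle cue belongs to. The sketch below illustrates the idea in Python; it is not our actual tool, and it assumes `HH:MM:SS.mmm` media timings and skips work a real splitter must do, such as copying the styling and layout from the head into every output document:

```python
# Group the <p> subtitle cues of one EBU-TT-D document into
# fixed-duration segments, by the time windows their intervals overlap.
import xml.etree.ElementTree as ET

TT = "{http://www.w3.org/ns/ttml}"  # TTML namespace used by EBU-TT-D


def to_seconds(clock):
    """Convert an 'HH:MM:SS.mmm' timestamp to seconds."""
    h, m, s = clock.split(":")
    return int(h) * 3600 + int(m) * 60 + float(s)


def split_cues(ttml_text, segment_duration=10.0):
    """Return a dict mapping segment index -> list of <p> elements."""
    root = ET.fromstring(ttml_text)
    segments = {}
    for p in root.iter(TT + "p"):
        begin = to_seconds(p.get("begin"))
        end = to_seconds(p.get("end"))
        # A cue belongs to every segment its display interval overlaps;
        # a cue that straddles a boundary appears in both segments.
        first = int(begin // segment_duration)
        last = int(max(begin, end - 0.001) // segment_duration)
        for i in range(first, last + 1):
            segments.setdefault(i, []).append(p)
    return segments
```

Note that a cue spanning a segment boundary is written into both segments, so a client that starts playback from any segment still sees every subtitle that should currently be on screen.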
Finally, we hosted these DASH ISOBMFF subtitle segments - along with the audio and video segments that we had created for Elephants Dream - on a web server, and created a DASH MPD to describe the stream.
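In the MPD, the subtitle segments get their own adaptation set alongside those for audio and video. The fragment below is a sketch of what that can look like; the attribute values and URLs are illustrative rather than taken from our actual MPD (`stpp` is the sample entry that ISO/IEC 14496-30 defines for carrying TTML-family subtitles, including EBU-TT-D, in ISOBMFF):

```xml
<AdaptationSet contentType="text" mimeType="application/mp4"
               codecs="stpp" lang="en" segmentAlignment="true">
  <Role schemeIdUri="urn:mpeg:dash:role:2011" value="subtitle"/>
  <SegmentTemplate timescale="1000" duration="10000"
                   initialization="subs/init.mp4"
                   media="subs/segment-$Number$.m4s" startNumber="1"/>
  <Representation id="subs-en" bandwidth="2000"/>
</AdaptationSet>
```

A client that doesn't understand subtitles can simply ignore this adaptation set and play the audio and video as normal.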
The MPD and encoded segments are publicly available here, and are intended for indicative testing. Such use, without modification, does not require further permission from the BBC.
In order to play our test stream, we needed to create an MPEG DASH client that can understand EBU-TT-D subtitle files and render their subtitles over the video it's presenting. To do this, we added an EBU-TT-D subtitle parser to the GStreamer media framework (which already supports MPEG DASH) and modified GStreamer's existing subtitle rendering element (called textoverlay) so that it could overlay the subtitles output from our parser on video delivered using DASH.
Once the client could render EBU-TT-D subtitle files passed to it directly, we added the ability for it to extract subtitle files from the ISOBMFF packaging in which they're sent in MPEG DASH (this was done by extending GStreamer's qtdemux element). We then had all the components in place to test end-to-end DASH delivery of EBU-TT-D subtitles.
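For readers unfamiliar with ISOBMFF: the format is built from "boxes", each a 32-bit big-endian length, a four-character type, then a payload, and in a DASH media segment the EBU-TT-D document travels as the payload of an `mdat` box. The sketch below shows only that last step; a real media segment also carries `styp` and `moof` boxes describing the sample timing, and the initialisation segment declares the `stpp` sample entry, none of which is shown here:

```python
# Minimal illustration of ISOBMFF box structure: an EBU-TT-D document
# carried as the payload of an 'mdat' box.
import struct


def make_box(box_type, payload):
    """An ISOBMFF box: 32-bit big-endian size (including the 8-byte
    header), a four-character type code, then the payload."""
    return struct.pack(">I", 8 + len(payload)) + box_type + payload


ebuttd_xml = b'<tt xmlns="http://www.w3.org/ns/ttml">...</tt>'
mdat = make_box(b"mdat", ebuttd_xml)
```

Extracting the subtitles in the client is the reverse: the demuxer (in our case, GStreamer's extended `qtdemux`) walks the boxes, finds the subtitle samples, and hands the XML payload to the EBU-TT-D parser.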
The diagram below shows the complete end-to-end chain:
We're pleased to report that it did indeed work as we'd hoped: the client - running within a Linux virtual machine on a PC - was able to successfully download, unpack and display the subtitles from our test DASH stream (see the top of this post for a photo of the client in action). To the best of our knowledge, this is the first time that end-to-end DASH delivery of EBU-TT-D subtitles has been demonstrated.
So, having shown that we can deliver and present EBU-TT-D subtitles for on-demand (i.e., pre-recorded) content, what are our possible next steps? Two things we’d like to do are…
Demonstrate DASH distribution of EBU-TT-D subtitles for live content.
Our Elephants Dream test stream is an on-demand stream: we had a single file containing all the subtitles for the entire programme before we started creating the DASH subtitle segments, so producing them was simply a matter of splitting that file and packaging the resulting pieces as ISOBMFF DASH segments.
Providing subtitles for a live programme is, as you can imagine, somewhat more complicated. For a live programme the subtitles are generated in real time by remote subtitlers; these subtitles then need to be communicated over a network to the MPEG DASH packager; there they will need to be accumulated as they arrive and output as EBU-TT-D files of a suitable length for DASH delivery; then they will need to be packaged into ISOBMFF DASH segments and placed on a server. And, of course, for the subtitles to be synchronised with the audio and video at the viewer’s end, this chunking and packaging of the live subtitles will need to be synchronised with the encoding and DASH packaging of the live audio and video, which will be arriving via a different path.
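The accumulation step above can be sketched roughly as follows. This is an illustration of the idea only: cue arrival times stand in for a real media clock (a real packager must emit an empty segment on time even when no cues arrive, and handle late cues), and it must be kept in step with the audio/video packager:

```python
# Group live subtitle cues, arriving one at a time, into fixed-duration
# segment windows, emitting each window (possibly empty) as it closes.
def segment_live_cues(cues, segment_duration=10.0):
    """cues: iterable of (begin, end, text) tuples in arrival order.
    Yields (segment_index, [cues]) pairs as each window completes."""
    current, pending = 0, []
    for begin, end, text in cues:
        # Close every window that the stream has now moved past.
        while begin >= (current + 1) * segment_duration:
            yield current, pending
            current, pending = current + 1, []
        pending.append((begin, end, text))
    yield current, pending  # flush the final, partial window
```

Each emitted group of cues would then be serialised as a self-contained EBU-TT-D document and packaged into an ISOBMFF DASH segment, exactly as in the on-demand case.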
Add support for multiple display regions to our client.
The subtitles in our Elephants Dream test stream are fairly simple, in that they are all displayed within a single on-screen region that has a fixed position towards the bottom of the display. EBU-TT-D, however, supports more complex ways of displaying subtitles; in particular, it allows subtitles to be shown simultaneously in multiple regions of the screen - a feature that is often useful when subtitles need to be moved to avoid obscuring important on-screen action.
At the moment, our client supports rendering subtitles to only a single region of the screen; it would be good, therefore, to extend it to deal with multiple regions, and to create some example DASH content with multi-region subtitles that we, and others, can use to test clients.