Posted by Stephen Perrott
In the Broadcast and Connected Systems team here in R&D, one of the things we have been looking at for a few years is adaptive bitrate technology. Recently the main focus in this area has been MPEG DASH: which profile of it to use, and how we can create content using it. The reason for today’s post is to introduce the first piece of test material we have created.
What is DASH?
DASH is the name used to refer to the MPEG Dynamic Adaptive Streaming over HTTP specification – sometimes snappily referred to as ISO/IEC 23009-1. It is an adaptive bitrate system, which can be used for both live streaming and on-demand content.
The media is encoded a number of times at different bitrates. Each encoding is called a Representation, and each Representation is split into a number of Media Segments. The client plays a programme by requesting segments, in order, from a Representation over HTTP. Representations containing equivalent content can be grouped into Adaptation Sets. If the client wishes to change bitrate, it can pick an alternative from the current Adaptation Set and start requesting segments from that Representation. Content is encoded in such a way as to make this switching easy for the client to do. In addition to a number of Media Segments, a Representation generally also has an Initialisation Segment. This can be thought of as a header, containing information about the encoding, frame sizes, etc. A client needs to obtain this for a given Representation before consuming Media Segments from that Representation.
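As a rough illustration of the switching decision described above, here is a minimal Python sketch. The `Representation` class, the `choose_representation` function and the 0.8 safety factor are all hypothetical names and values chosen for this example. DASH itself does not define how a client should pick a bitrate, so real clients use their own, usually more sophisticated, logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    """One encoding of the content (hypothetical helper type)."""
    rep_id: str
    bandwidth: int  # bits per second, as advertised for the encoding


def choose_representation(adaptation_set, measured_bps, safety=0.8):
    """Pick the highest-bitrate Representation that fits the throughput.

    `safety` is an assumed headroom factor: only a fraction of the
    measured throughput is treated as usable. If nothing fits, fall
    back to the lowest bitrate rather than stopping playback.
    """
    budget = measured_bps * safety
    affordable = [r for r in adaptation_set if r.bandwidth <= budget]
    if affordable:
        return max(affordable, key=lambda r: r.bandwidth)
    return min(adaptation_set, key=lambda r: r.bandwidth)


# Three Representations of the same content in one Adaptation Set.
video = [
    Representation("low", 500_000),
    Representation("mid", 1_500_000),
    Representation("high", 3_000_000),
]

# The client re-evaluates before each segment request; the throughput
# figures here are made up for the example.
for segment_number, throughput in enumerate([4_000_000, 2_000_000, 400_000], start=1):
    chosen = choose_representation(video, throughput)
    print(segment_number, chosen.rep_id)
```

Because every Representation in the Adaptation Set carries equivalent content, the client can change its choice at each segment boundary and simply continue with the next segment number.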
Finally, there is also a Media Presentation Description (MPD), commonly referred to as the manifest. This documents the Adaptation Sets and Representations, together with durations and URLs.
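To make the role of the manifest more concrete, the sketch below parses a tiny, hand-written MPD using Python's standard library. The Representation ids, bitrates, frame sizes and URL patterns are invented for the example and are not taken from our test streams. It also assumes the URLs are described by a `SegmentTemplate` element placed on the Adaptation Set, which is only one of the places the specification allows it to appear.

```python
import xml.etree.ElementTree as ET

# Namespace used by MPD documents.
MPD_NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}

# A deliberately small, made-up manifest; all values are illustrative.
EXAMPLE_MPD = """
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static"
     mediaPresentationDuration="PT10M" minBufferTime="PT2S"
     profiles="urn:mpeg:dash:profile:isoff-live:2011">
  <Period>
    <AdaptationSet mimeType="video/mp4" segmentAlignment="true" startWithSAP="1">
      <SegmentTemplate timescale="1000" duration="4000" startNumber="1"
                       initialization="$RepresentationID$/init.mp4"
                       media="$RepresentationID$/$Number$.m4s"/>
      <Representation id="v1" bandwidth="800000" width="640" height="360"/>
      <Representation id="v2" bandwidth="2000000" width="1280" height="720"/>
    </AdaptationSet>
  </Period>
</MPD>
"""


def list_representations(mpd_text):
    """Return id, bandwidth and URL patterns for each Representation.

    Assumes one SegmentTemplate per AdaptationSet (a simplification).
    """
    root = ET.fromstring(mpd_text.strip())
    found = []
    for aset in root.findall("mpd:Period/mpd:AdaptationSet", MPD_NS):
        template = aset.find("mpd:SegmentTemplate", MPD_NS)
        for rep in aset.findall("mpd:Representation", MPD_NS):
            rep_id = rep.get("id")
            found.append({
                "id": rep_id,
                "bandwidth": int(rep.get("bandwidth")),
                # Fill in the Representation id; $Number$ is left for
                # the client to substitute per segment.
                "init": template.get("initialization").replace("$RepresentationID$", rep_id),
                "media": template.get("media").replace("$RepresentationID$", rep_id),
            })
    return found


for rep in list_representations(EXAMPLE_MPD):
    print(rep["id"], rep["bandwidth"], rep["init"], rep["media"])
```

A real client would also need to resolve these relative URLs against the manifest location and handle the other addressing schemes DASH permits.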
As a specification, DASH covers a wide range of features and media formats. However, these are restricted into more manageable sets by profiles. The profile that is of most interest to us is the “ISO Base media file format live profile”. Although it has “live” explicitly in the name, this profile supports on-demand as well – the on-demand profiles, on the other hand, don’t support live.
In this profile, Initialisation Segments and Media Segments are based on a fragmented ISO Base Media File Format. This is the format which underlies the MP4 file format. It’s worth noting that while fragmented files have always been supported by the BMFF specification, their use has only become common with the advent of adaptive streaming technologies.
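For readers who have not looked inside these files, the following Python sketch walks the top-level boxes of a segment. Each box starts with a 32-bit big-endian size and a four-character type. The `make_box` helper and the synthetic segments are purely illustrative: they contain empty boxes rather than real media, and the box layout shown is the typical one rather than the only valid one.

```python
import struct


def make_box(box_type, payload=b""):
    """Build a box with a 32-bit size header (illustrative helper)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def top_level_boxes(data):
    """Return (type, size) for each top-level box in `data`."""
    boxes = []
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if offset + 16 > len(data):
                raise ValueError("truncated 64-bit box header")
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the data.
            size = len(data) - offset
        if size < header:
            raise ValueError("box size smaller than its header")
        boxes.append((box_type.decode("ascii"), size))
        offset += size
    return boxes


# Synthetic stand-ins: an Initialisation Segment typically holds
# 'ftyp' and 'moov'; a Media Segment typically holds an optional
# 'styp' followed by 'moof' (fragment metadata) and 'mdat' (samples).
init_segment = make_box(b"ftyp", b"iso6" + bytes(4)) + make_box(b"moov")
media_segment = (
    make_box(b"styp", b"msdh" + bytes(4))
    + make_box(b"moof")
    + make_box(b"mdat", bytes(10))
)

print(top_level_boxes(init_segment))
print(top_level_boxes(media_segment))
```

The same function can be pointed at the bytes of a downloaded segment to get a quick view of its structure.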
The “ISO Base media file format live profile” is not precise enough in itself to allow people to build clients and content creation tools that will work together. Consequently, many industry groups have been working to agree ways of using MPEG DASH. These include the Open IPTV Forum, HbbTV, the UK Digital TV Group and the DASH Industry Forum.
These groups have faced challenges in trying to minimise the complexity of clients whilst still ensuring that it is practical to create compatible content.
One of the areas of difficulty has been in the way clients need to switch between Representations. Client implementations are made more complex if a new Initialisation Segment needs to be processed when switching between Representations. One possible way of addressing this would be to require the Initialisation Segment to be common across all Representations in an Adaptation Set. However, this creates a significant problem for content providers because the Initialisation Segment would traditionally contain parts of the AVC video stream (known as the Sequence Parameter Set and Picture Parameter Set) that will vary between Representations.
Thankfully MPEG are currently making a specification change which allows the SPS and PPS to be carried in the Media Segments. They get prepended to every random access point in the media. To indicate this change the decoder configuration in the Movie box (in the Initialisation Segment) is carried in a Sample Entry marked “avc3” instead of the traditional “avc1”.
In a client, the SPS and PPS have to be fed into the decoder at stream start up and whenever they change. So, having them included in random access points reduces the complexity of the client.
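One way to check whether a random access point really does carry its parameter sets is to look at the NAL unit types in the sample. The Python sketch below assumes each NAL unit in the sample is preceded by a 4-byte big-endian length field, which is common but is actually signalled in the decoder configuration. The function names, the `nal` helper and the sample bytes are made up for illustration. In H.264 the NAL unit type is the low five bits of the first byte: 7 is an SPS, 8 a PPS, 5 an IDR slice and 1 a non-IDR slice.

```python
SPS, PPS, IDR_SLICE, NON_IDR_SLICE = 7, 8, 5, 1


def nal(header_byte, body=b"\x00"):
    """Build one length-prefixed NAL unit (illustrative helper)."""
    data = bytes([header_byte]) + body
    return len(data).to_bytes(4, "big") + data


def nal_types(sample, length_size=4):
    """List the NAL unit types found in a length-prefixed sample."""
    types = []
    offset = 0
    while offset + length_size <= len(sample):
        nal_len = int.from_bytes(sample[offset:offset + length_size], "big")
        offset += length_size
        if nal_len == 0 or offset + nal_len > len(sample):
            raise ValueError("truncated or empty NAL unit")
        # The type is the low five bits of the NAL header byte.
        types.append(sample[offset] & 0x1F)
        offset += nal_len
    return types


def carries_parameter_sets(sample, length_size=4):
    """True if the sample contains both an SPS and a PPS."""
    types = nal_types(sample, length_size)
    return SPS in types and PPS in types


# Fake samples: header bytes 0x67, 0x68, 0x65 and 0x41 have NAL types
# 7, 8, 5 and 1 respectively; the bodies are placeholder bytes.
random_access_sample = nal(0x67) + nal(0x68) + nal(0x65)
ordinary_sample = nal(0x41)

print(nal_types(random_access_sample), carries_parameter_sets(random_access_sample))
print(nal_types(ordinary_sample), carries_parameter_sets(ordinary_sample))
```

With in-band parameter sets, a client that switches Representation at a random access point can pass the whole sample to the decoder and the new SPS and PPS arrive with it.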
For a broadcaster, it’s important to be able to deliver one media stream to a large range of devices and be confident that all will be able to present the content. MPEG DASH offers the potential to become a widely implemented interoperable standard but its complexity creates the risk that some implementations may differ from others. With that in mind we have produced some test content to allow manufacturers to check new products will be able to play our content.
As this content is intended to be delivered over the internet there is a potential for incorrect client behaviour to significantly affect the resilience of our services, both to those clients and for others, and to increase the costs we incur in serving them. Consequently it is even more important to ensure these clients are fully tested than is the case for traditional broadcast receivers.
Our first set of test streams
The first test we are making available uses the avc3 entry and in-band SPS and PPS. It uses the well-known Big Buck Bunny animation as the source material.
In addition to this we have also created a variant using the avc1 entry, with the SPS and PPS stored in the Initialisation Segment. This stream may be useful in checking whether any problem a client has presenting the avc3 stream is due to the avc3 formatting. However, please note that we do not currently expect to use avc1 MPEG DASH streams in practice due to the difficulties some implementations will have in processing the representation-specific initialisation segments.
These streams only include a subset of the video frame sizes and bitrates that we are likely to use.
We will shortly be adding further tests. The next in line is a computer-generated stream which, whilst fairly simplistic in appearance, makes it easy to check that a client is correctly displaying the media and is able to switch between Representations without introducing any glitches. This will include more frame sizes and bitrates as well as interlaced video.
In addition to media playback there are a number of other areas of client behaviour which it is desirable to test. These are particularly related to handling error conditions and, in the case of live streaming, correct start up, timekeeping and programme end behaviour. The DASH specification contains many features which are very important to broadcasters. However, if the behaviour of clients is not consistent then we may not be able to use the full capabilities of DASH, which could then adversely affect the perceived reliability of our streams and our costs in serving them.