Video compression is an asset that the broadcast industry heavily relies on to deliver content to audiences. With increasing demand for higher resolutions and more immersive content, there is a need to look for new and innovative solutions to produce further advances in compression.
Development of the Versatile Video Coding (VVC) standard began recently as a venture by the Joint Video Experts Team (JVET). JVET is a collaborative group of industry professionals working to produce video compression with capabilities beyond those of VVC's predecessor, H.265/High Efficiency Video Coding (HEVC).
Last year, the Joint Video Experts Team had already achieved 25% bit rate savings over HEVC using an experimental video coding model. Based on these results, the group decided to issue an official call for proposals for VVC, which will be its newest standard. VVC is currently in development and is due to be finalised in 2020.
The Alliance for Open Media (AOMedia) developed the royalty-free AV1 in 2018, with video streaming in mind. Our previous testing showed that AV1 performed similarly to HEVC, although with significantly higher encoding and decoding times, which indicates a technology that is much more computationally complex. Since that release, AOMedia have continued to develop AV1, so part of the current experiment was to see the impact a year of work has had on AV1.
For this test, we used the HEVC test model (HM) as the reference, and the two newer technologies were measured against it. AV1 supports many different tools and techniques, so a configuration had to be chosen that would allow the fairest comparison with HEVC and VVC. For this reason a single-pass configuration (see below) was chosen. A variety of HD and UHD sequences were chosen to reflect the trend of video content towards higher resolutions.
We used the Bjøntegaard Delta bit rate (BD-Rate) metric to compare the models. This method takes the peak signal-to-noise ratio (PSNR) and bit rate of the compressed video at four different quality levels to produce a rate-distortion curve of the codec's performance, and then reports the average difference in bit rate between two such curves at equal PSNR. A higher PSNR means the decoded video is more similar to the original video.
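To make the metric more concrete, here is a minimal Python sketch of one common formulation of BD-Rate: a cubic curve is fitted through the four (PSNR, log bit rate) points of each codec, and the two curves are averaged over the PSNR range they share. This is an illustration under stated assumptions rather than the exact tool used for these tests; the function names and the sample numbers are hypothetical, and other BD-Rate variants (for example, piecewise cubic interpolation) give slightly different values.

```python
import math


def lagrange_eval(xs, ys, x):
    """Evaluate the polynomial passing through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def mean_log_rate(psnrs, rates, lo, hi):
    """Average of the fitted log10(bit rate) curve over the PSNR range [lo, hi]."""
    if len(psnrs) != 4 or len(rates) != 4:
        raise ValueError("this sketch expects exactly four rate points per codec")
    logs = [math.log10(r) for r in rates]

    def curve(q):
        return lagrange_eval(psnrs, logs, q)

    # Four points define a cubic, and Simpson's rule integrates a cubic exactly.
    return (curve(lo) + 4.0 * curve((lo + hi) / 2.0) + curve(hi)) / 6.0


def bd_rate(ref_psnrs, ref_rates, test_psnrs, test_rates):
    """Average bit rate difference (in percent) of the test codec against the
    reference at equal PSNR. Negative values mean the test codec saves bit rate."""
    lo = max(min(ref_psnrs), min(test_psnrs))
    hi = min(max(ref_psnrs), max(test_psnrs))
    if hi <= lo:
        raise ValueError("the two curves do not overlap in PSNR")
    diff = (mean_log_rate(test_psnrs, test_rates, lo, hi)
            - mean_log_rate(ref_psnrs, ref_rates, lo, hi))
    return (10.0 ** diff - 1.0) * 100.0


# Made-up sample data (not measurements from this test): the "test" codec
# reaches the same PSNR values as the reference using 30% fewer bits.
ref_psnrs = [32.0, 35.0, 38.0, 41.0]
ref_rates = [1000.0, 2000.0, 4000.0, 8000.0]   # kbit/s
test_rates = [0.7 * r for r in ref_rates]

print(round(bd_rate(ref_psnrs, ref_rates, ref_psnrs, test_rates), 2))  # about -30
```

The fit is done on the logarithm of the bit rate so that a constant percentage saving shows up as a constant offset between the two curves, whatever the absolute bit rate.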
We found that the VVC test model (VTM) performed 27% better than the HEVC model for HD sequences and 35% better for UHD sequences. AV1, on the other hand, performed very similarly to HEVC, with an average loss of 2.5% over the HD sequences and a gain of 1.3% over the UHD sequences. The graph below shows the average bit rates a compressed video of equal quality would have if encoded with AV1 or VVC relative to HEVC.
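As a small worked example, the percentages above can be turned into relative bit rates with HEVC set to 100. The sketch below assumes that "27% better" means a BD-Rate of -27% (a bit rate saving at equal PSNR) and that a "loss" is a positive BD-Rate; the function name is hypothetical.

```python
def relative_bit_rate(bd_rate_percent):
    """Bit rate at equal quality relative to the reference (reference = 100).

    Assumes negative BD-Rate values are savings and positive values are losses.
    """
    return 100.0 + bd_rate_percent


# Figures quoted in the text, written as signed BD-Rate values against HM.
results = {
    "VTM": {"HD": -27.0, "UHD": -35.0},
    "AV1": {"HD": 2.5, "UHD": -1.3},
}

for codec, by_resolution in results.items():
    for resolution, value in by_resolution.items():
        print(f"{codec} {resolution}: {relative_bit_rate(value):.1f}% of the HM bit rate")
```

Under these assumptions, a UHD stream that needs a given bit rate with HM would need roughly 65% of that bit rate with VTM for the same PSNR.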
The graph below shows the results for one of the UHD sequences. It shows that the VVC test model produces consistently higher PSNR values than HM for an equal bit rate. This means that the decoded VTM video is of better objective quality than that of the HEVC test model. The graph also shows the curve for AV1, which performs very similarly to HM. What is interesting here is that AV1 produces higher quality video than HM at lower bit rates, but is overtaken at higher bit rates. This suggests that AV1 could produce higher quality decoded video than HEVC in low bit rate scenarios, which is highly desirable in the video coding field.
The time taken to process video through a codec is also an important measurement. Increased processing time indicates increased complexity, and therefore more computational power is required. Encoding normally takes place at a broadcaster or streaming service with the resources to provide that computational power. However, it can also be done by people at home who want to upload their videos to social media or send them to friends. Reducing the complexity means companies and individuals can save time and money by encoding more quickly, on cheaper, simpler devices.
Decoding happens when the video is received. This could be on a mobile phone, a set-top box in the home, or almost any device that receives and plays video. Decoding should be low in complexity and lightweight so that user devices can remain simple and cheap. This keeps decoding time low and helps to avoid juddering video.
We found that the compression gains from VTM come at the cost of processing time. Encoding takes around 6.5x as long as with HM, and decoding takes 1.5x longer. AV1, on the other hand, takes about 4x as long as HM to encode, but is 8% quicker than HM to decode. Compared with the test we performed last year, AV1's processing times have improved hugely. This again highlights AOMedia's focus on producing a codec optimised for streaming.
It is worth mentioning that the test models used are intended only to provide an insight into the quality a codec can achieve, and they are not optimised for speed. Encoding and decoding will typically happen with optimised software or on hardware, where processing times will be far shorter. This data does still give a general idea of the quality and complexity of these codecs relative to one another.
As mentioned above, AV1 can be operated in different ways. The most significant choice is the number of passes used for processing. A single-pass configuration was chosen for our testing above because we wanted to match live broadcasting and streaming scenarios. This also gave us a fairer comparison with the HEVC and VVC standards.
AV1 can also be operated in two-pass mode. In the first pass, the encoder analyses the video to optimise the algorithms used; in the second pass, the actual compression occurs. AOMedia have more heavily optimised AV1 for two-pass mode. AV1 can also be configured to reduce encoding time by enabling ‘speed up’ algorithms.
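As an illustration of how these modes might be selected in practice, the Python sketch below builds (but does not run) a command line for `aomenc`, the libaom command-line encoder. It assumes that the number of passes maps to the `--passes` option and that the speed preset maps to `--cpu-used`; the post itself does not state which options were used, so treat this as a hypothetical example and not the test configuration. The helper name and file names are made up.

```python
def aomenc_command(source, output, passes=1, cpu_used=None):
    """Build an aomenc argument list for a one- or two-pass encode.

    Hypothetical helper: only the pass count and speed preset are set here;
    a real test would also need rate control and other parameters.
    """
    if passes not in (1, 2):
        raise ValueError("passes must be 1 or 2")
    cmd = ["aomenc", f"--passes={passes}"]
    if cpu_used is not None:
        # Assumption: higher --cpu-used values trade compression for speed.
        cmd.append(f"--cpu-used={cpu_used}")
    cmd += ["-o", output, source]
    return cmd


# Single pass at the encoder's default speed, then two passes at a faster preset.
print(" ".join(aomenc_command("input.y4m", "single_pass.ivf", passes=1)))
print(" ".join(aomenc_command("input.y4m", "two_pass_fast.ivf", passes=2, cpu_used=4)))
```

Building the argument list separately from running it makes it easy to log the exact configuration alongside the measured bit rate, PSNR and processing time.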
We wanted to get an overview of how the number of passes and the speed-up algorithms affect the performance of AV1. For this, we used the same HD sequences as above, with AV1 in single-pass mode at the default speed preset as the reference. We found that two-pass mode performed slightly better than single-pass mode, with gains of 0.37% on average over the HD sequences. The processing time is also reduced in two-pass mode, which highlights how well optimised this configuration is. Setting a fast preset reduces the effectiveness of the codec, although it offers vastly reduced encode times, as shown in the table below (updated October '19).
This testing provides a snapshot of the current state of VVC development, which shows that the new standard is expected to deliver compression gains over HEVC by the time it is finalised in 2020. AV1, on the other hand, has been shown to offer similar coding efficiency to the current HEVC standard, and its processing time appears to be dropping significantly. It is clear, though, that the video coding community is still producing new technologies to further improve video delivery, allowing higher-quality content to reach audiences.
October 2019 Update:
Additional information regarding the execution of these tests, including the codec configurations, is provided in this document. It should be noted that this test is not exhaustive, i.e. there are a number of different configurations that can be used in HM, VTM and AV1. We attempted to conduct a fair comparison between the tested codecs, using settings suitable for live broadcasting and streaming scenarios. Accordingly, we mainly focused on a single-pass configuration with appropriate parameters for the three codecs tested.
Further research into ongoing developments of the tested codecs revealed several significant improvements, in particular in the processing speed of the AV1 codec. These include accelerations based on machine learning by Kim et al. and by Su et al.
Other performance evaluations, for example by Moscow State University and Bitmovin, which use different coding configurations, also report that AV1 in particular produces significantly better compression efficiency in two-pass mode with the lookahead feature.