Visual Seeking for iPlayer
Daniel Pope, Gokhan Urgun and Andrew Wheat
Media Services, BBC Design + Engineering
In our commitment to providing quality of service and ease of use, we launched the 'Visual Seeking' feature across the BBC websites. This post outlines some of the design decisions that were made to address technical and UX concerns.
A while back we added our visual seeking feature for videos that play across the BBC websites, such as News, Sport and iPlayer. This required close collaboration between Media Services (who develop Video Factory, encoding media content for the Web), BBC Research and Development (R&D), the BBC Online Technology Group (OTG) and Media Playout (who develop SMP - the Standard Media Player used across the BBC website).
Visual seeking (aka Thumbnail Scrubbing) is where a small preview picture of the video appears when you move your mouse over the scrub bar. For example, let’s say you wanted to skip to a particular celebrity in Strictly: you can now quickly find the right part by hovering over the scrub bar.
We turned on the back end services a month before the feature became visible to the public, so most programmes had thumbnails available at launch. We later ran a backfill job to generate thumbnails for extended availability content.
We initially launched the feature on desktop and tablets that use a responsive web player. We later added the feature to Sports Apps for mobile devices and, more recently, to IPTVs. Other Apps for mobile devices will follow.
Delivering the visual seeking feature raised several challenges: achieving high perceived quality and low latency, making efficient use of broadband, and supporting a wide range of playback devices with different requirements.
The rest of the blog post will give you details of the solutions and the trade-offs that were considered.
One way to enable thumbnail scrubbing could be to download the entire video in low quality, then generate the correct frame by seeking the low quality video to the correct place. Decoding video in this way is a particularly CPU intensive process, so instead we opted to generate thumbnails as a collection of JPEGs ahead of time for the media player to consume.
Let’s say a video contains 500 thumbnails. To download these images, the player could make 500 separate requests for one thumbnail at a time. The problem with this is that each request has an overhead, a delay before data starts arriving. Poor network conditions make this delay significant, and browsers will limit the number of simultaneous requests that can be made, resulting in poor performance.
We can improve the efficiency of acquiring the images by packing the individual thumbnails into storyboards. A storyboard is a 5x5 arrangement of thumbnails packed into a single image file. This reduces the number of requests that the player needs to make by a factor of 25: rather than making 500 separate requests for images, the player only has to make 20. The player then treats this storyboard as a sprite sheet, displaying only one thumbnail at a time.
An example storyboard of 25 thumbnails
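The sprite sheet arithmetic described above can be sketched in a few lines. This is only an illustration, not the player's actual code: it assumes thumbnails are packed in row-major order (left to right, then top to bottom), and the function names and the 160x90 thumbnail size used in the usage note are hypothetical.

```python
from dataclasses import dataclass

COLS = 5  # thumbnails per row in a storyboard (5x5 layout)
ROWS = 5  # rows per storyboard
PER_STORYBOARD = COLS * ROWS


@dataclass(frozen=True)
class SpriteLocation:
    storyboard: int  # index of the storyboard image to request
    x: int           # left edge of the thumbnail within the storyboard, in pixels
    y: int           # top edge of the thumbnail within the storyboard, in pixels


def storyboard_count(total_thumbnails: int) -> int:
    """Number of storyboard images needed to hold all thumbnails (ceiling division)."""
    return -(-total_thumbnails // PER_STORYBOARD)


def locate_thumbnail(index: int, thumb_width: int, thumb_height: int) -> SpriteLocation:
    """Map a zero-based thumbnail index to its storyboard and pixel offset.

    Assumes row-major packing, which the post does not confirm.
    """
    storyboard, cell = divmod(index, PER_STORYBOARD)
    row, col = divmod(cell, COLS)
    return SpriteLocation(storyboard, col * thumb_width, row * thumb_height)
```

For a video with 500 thumbnails, `storyboard_count(500)` gives 20, matching the figure above. With a hypothetical 160x90 thumbnail, `locate_thumbnail(31, 160, 90)` points at the second storyboard, offset 160 pixels across and 90 pixels down.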
The player downloads storyboards ‘on demand’ to make efficient use of the broadband connection. When the user hovers over the scrub bar, only its related storyboard and two adjacent neighbours get downloaded. Because the storyboards are downloaded in this way, we can afford to produce a greater number of thumbnails for a video to enable a very fine level of scrubbing. Move your mouse one pixel to the right and you’ll get a new thumbnail. This helps to make seeking more precise, which is important for longer programmes.
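As a rough sketch of the 'on demand' behaviour, the function below works out which storyboards to request for a given hover position. It assumes that "two adjacent neighbours" means one storyboard on each side of the hovered one, and that the hover position is expressed as a fraction of the scrub bar's width. Both are assumptions, and the function name is hypothetical.

```python
def storyboards_to_fetch(hover_fraction: float, total_thumbnails: int,
                         per_storyboard: int = 25) -> list[int]:
    """Return the storyboard indices to download for a hover position.

    hover_fraction is the pointer position along the scrub bar, from 0.0 to 1.0.
    """
    if total_thumbnails <= 0:
        return []
    # Clamp the pointer position so that stray values stay on the scrub bar.
    fraction = min(max(hover_fraction, 0.0), 1.0)
    # Map the position to a thumbnail, keeping the far right edge in range.
    thumbnail = min(int(fraction * total_thumbnails), total_thumbnails - 1)
    current = thumbnail // per_storyboard
    last = (total_thumbnails - 1) // per_storyboard
    # The hovered storyboard plus one neighbour on each side, where they exist.
    return [i for i in (current - 1, current, current + 1) if 0 <= i <= last]
```

Hovering halfway along a video with 500 thumbnails selects storyboards 9, 10 and 11 out of 20, so the player never needs to fetch more than three images for a single hover.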
Packing thumbnails into storyboards has some pitfalls of its own. The thumbnails themselves compress much better in the JPEG format than in PNG or GIF, so we chose JPEG: we get the same visual quality in a smaller file. The JPEG format breaks the image into 8x8 pixel blocks, each of which is compressed separately. To get good quality at the compression level used, we needed to pick thumbnail sizes where the edges of the thumbnails fall on the boundaries of these blocks, to limit 'bleeding' between adjacent thumbnails. BBC R&D recommended three different thumbnail sizes to cater for different screen sizes. The sizes were picked to reduce this bleeding as much as possible: joins between columns line up perfectly, but the joins between rows don't always align with the blocks. Where there is misalignment, the frames have a fuzzy edge, as seen in the example below.
Compare the fuzzy boundary between thumbnails 2 and 4 with the sharp boundary between thumbnails 3 and 4. The content of adjacent thumbnails affects the image where the 8x8 blocks do not align with the thumbnails.
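The alignment rule can be checked with simple arithmetic: a join between two thumbnails is clean only if its pixel offset is a multiple of 8. The sketch below illustrates this with a hypothetical 160x90 thumbnail, which is not one of the sizes BBC R&D recommended (those are not listed in this post). The function name is also an assumption.

```python
JPEG_BLOCK = 8  # JPEG compresses the image in 8x8 pixel blocks


def misaligned_joins(thumb_size: int, cells: int) -> list[int]:
    """Return the pixel offsets of joins that do not fall on a block boundary.

    thumb_size is the thumbnail width (for column joins) or height (for row
    joins); cells is the number of thumbnails along that axis.
    """
    joins = [i * thumb_size for i in range(1, cells)]
    return [offset for offset in joins if offset % JPEG_BLOCK != 0]
```

With these illustrative numbers, `misaligned_joins(160, 5)` returns an empty list because every column join sits on a block boundary, while `misaligned_joins(90, 5)` returns `[90, 180, 270]`: three of the four row joins cut through a block, which is where bleeding between neighbouring thumbnails would show up.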
Another big challenge was the lack of a standard in the online video industry for describing a video's thumbnails as media. BBC R&D provided us with a library they had developed, based on streaming clips. This approach is strategically better, but it is not supported on the TV version of iPlayer, so we couldn’t use it. Instead, BBC R&D and Media Services collaborated on a form of media presentation description for the thumbnails of a video, based on the MPEG-DASH industry standard. Being able to describe the thumbnails as media is particularly beneficial when combined with storyboards, as it gives structure to the description of the thousands of thumbnails for a video.
Thumbnail scrubbing media presentation description data structure
The extension to the standard also allows us to describe and structure varying numbers, sizes and layouts of thumbnails for the same video. To do so, we chose to use a separate adaptation set for each storyboard with different characteristics. This reduces the restrictions of the media description and gives us a future-proof foundation.
Thumbnail scrubbing media presentation description file
We shared the thumbnail extension to the media presentation description with the MPEG-DASH community. This led to thumbnail support being recognised in the media presentation description, and it became part of the MPEG-DASH industry standard with DASHIF V4.1.
Another Brick in the Wall (of microservices)
Video Factory is the product that deals with the conversion of audio and video for publishing to the BBC sites including iPlayer, Sport and News. It is built using microservices - small pieces of software each with a single purpose, one step in the much larger workflow.
Using this architecture pattern, when we want to add a new feature we just need to create a new microservice and add it to the workflow. This is exactly what we've done for Visual Seeking.
We encode very high quality source video and produce a variety of different bitrates to support playback and streaming on over 1000 different devices, including PCs, phones, tablets, games consoles and smart TVs. These individual bitrate videos are then distributed and published. We added the thumbnail feature by taking the encoder output for a specific bitrate and creating thumbnail storyboards and a media presentation description manifest file, which are then distributed and published.
The thumbnail scrubbing feature in Video Factory
By adding the new feature in this way, we were able to make the change without affecting any of the original services. Thumbnail generation and the publication of videos work independently of each other, meaning that there is no extra delay in the videos being published; and if the thumbnail service were to fail, the videos would still appear on iPlayer.
The BBC thumbnails are originated by Radix and distributed by CDNs
The thumbnails are distributed through a high performance and resilient content delivery architectural solution that also provides authentication and content routing. Radix is a web cache that originates the BBC thumbnails as well as the BBC on-demand content that is distributed by CDNs.
It’s all about compromise
We had a number of design decisions to make, and ultimately each came down to a compromise between quality and efficiency. After taking some measurements, we weighed the trade-offs of each option against the following facts:
- Increasing the quality of the image increases the storyboard file size.
- The bigger the file, the longer it takes to download.
- Beyond a certain quality level, the perceived quality of a small image doesn’t change.
- Increasing the number of thumbnails in a storyboard also increases the file size, but reduces the number of requests that the player needs to make to get the thumbnails for a video.
We set a target of approximately 100KB per storyboard. Storyboards of this size can be downloaded very quickly on most connections, and they are of reasonable quality. Devices on slower connections will benefit from progressive JPEG loading, meaning that the full image may appear in low resolution before it has finished downloading.
Even though the frame on the left has only loaded 25% of the image data, the entire image is visible and the context is still discernible.
The file size requirement means that there is a noticeable loss in image quality, but the effects have been reduced by picking thumbnail sizes that play nicely with the JPEG format.
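To put the 100KB target in context, the back-of-the-envelope calculation below estimates the transfer time for one storyboard. It is a simplification: it ignores request latency and protocol overhead, it treats 1KB as 1000 bytes, and the connection speeds in the usage note are illustrative assumptions rather than figures we measured.

```python
def download_seconds(size_kb: float, megabits_per_second: float) -> float:
    """Estimate the transfer time for a file, ignoring latency and overhead."""
    bits = size_kb * 1000 * 8  # kilobytes to bits, taking 1KB as 1000 bytes
    return bits / (megabits_per_second * 1_000_000)
```

On a hypothetical 2 Mbps connection, `download_seconds(100, 2)` comes to roughly 0.4 seconds for one storyboard, and on a 10 Mbps connection it is under a tenth of a second. Fetching three storyboards for a single hover multiplies these figures by three, which is one reason to keep each file small.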
Our microservice creates storyboards from an input video. We could have used the highest quality output from the encoder to produce storyboards, but such a high quality input would require a powerful server and a lot of time. Instead, we can produce essentially the same output from one of the lower quality videos from the encoder: this vastly reduces the resources required (and therefore the operating costs) for an almost imperceptible difference in quality. With the bitrate we’ve chosen, it takes under 2 minutes to produce thumbnails for an hour-long video.
The introduction of visual seeking could mean that people might accidentally see a key moment of an episode’s plot as they move their mouse towards the full screen button. This is why we have introduced a short delay between the user hovering over the scrub bar and the thumbnails appearing on our media player.
We hope you find Visual Seeking useful. We welcome feedback and we’ll try to answer any questions you may have.
We are hiring!
Video Factory is one of the products that the Media Services team are working on at the BBC. There are new features and media technologies we use to improve our services. Visual Seeking is merely one of them. If you like the idea of working with video/audio encoding, publication and live and on-demand streaming services and want to play an important part in the future of online media, you can apply for the job openings here when available.