Frame accurate video in HTML5
Hello, I am Dirk-Willem van Gulik, Chief Technical Architect here at the BBC. An important part of my job is to help the BBC use the right internet and web technologies - and help the industry and open standards bodies create the internet and web technologies which are right for the BBC.
Now the BBC is a very special place to work. And one of the main things which makes it so special is "Quality". At the BBC it is a currency, it is a goal, it is a culture - and as an engineer, it is something you are tasked to deliver.
One of our roles in FM&T is to provide our creative colleagues with tools. The tools they need for broadcast and to create high quality video. This includes tools for "non linear editing" - taking short clips, cutting them to the right length, stringing them together, adding some voice overs and graphics - and then endlessly tuning the resulting video so that it tells a story perfectly.
Usually we shoot hundreds of hours of video, import it onto an editing server, painstakingly tagging or "logging" the content on the way, and then edit each clip into something that makes sense. Because the original video files are so huge (especially in HD), we actually edit low resolution "proxy" versions of each file, and we store edit decisions using timecodes rather than actually mashing up the real video all the time. Then everything can be synced up and "conformed" using the original high-quality versions later on.
Throughout all of this, timecodes play a major role. They are the key 'link' to get right. They ensure that recipes done on the proxy give identical (albeit at a higher resolution) results when repeated on the raw high resolution footage at the end. They ensure that the audio tracks are perfectly synchronized with the clips, that transitions start and end at exactly the right time (and there is not some extra black frame due to a rounding error). They are also important in the creative process - as they let us communicate. We can ask each other to look at a specific frame - or discuss whether we move a cut by a few frames to achieve a particular effect.
If this sounds a bit overly perfectionistic and artistic - then consider this - a cut every 3 seconds or so is quite normal. So if you are off by 1 frame either way - then we're already talking errors of over 2%! Even a very pragmatic engineer would have to agree that that matters!
So timecodes using exact frame references are important. Really important. And the dirty little secret is that the internet has none. NONE! None of today's open standard technologies, or even the dominant proprietary ones, do timecodes right. They are off by one; they round to the nearest half second, they jump to the nearest previous I-frame. Whatever. (In all fairness - there are highly specialist products one can buy and install, usually with special browser plugins, which are accurate, often provided they are used with specially prepared material and within a single LAN. But none of those are conductive to the 'internet' network effect by facilitating collaboration between creative people across organisational barriers.)
Because the first thing a professional needs is a rock solid way to reference each and every individual frame accurately. So they can talk about it. For us - 'video on the web' is a bit as in James May's programme - today the internet feels like that plastic 1:1 model of a spitfire. It looks like one - but it sure does not fly.
Now over the past two months that landscape has started to radically change. A few of us have been working with the various open standard and open source HTML5 communities. And as of this week, after 120 emails, the bleeding edge development versions of several HTML5 implementations (as used in Safari, Chrome, Mozilla and many others) are now fully frame accurate.
Really. Frame Accurate. Actually even more accurate than just a frame (which is important for audio). You can jump to any point in the video (i.e. 1 hour, 3 minutes, 6 seconds and 5 frames, or to frame 178127) - and it will be exactly at that frame. Not at the nearest i-frame, rounded down to the nearest second, or off by one. No it will be exactly at that very frame.
So today, the HTML5 community has opened a door for us. Which will allow creative people to collaborate and edit professional video on the web.
Do know though that, while key, this is just a first step. There is a lot to still build, so we'll need many hyper creative companies and internet engineers working together to make this work. We need to create a new breed of web based production tools which can interact at the quality levels professionals and the BBC expect. And we still have issues around UMIDs (unique global references for video) to crack. And even some very basic things (like did you know that a pixel in the video world is actually rectangular, rather than square?!) will need to universally understood between the broadcasting and internet engineers. But boy, getting timecodes, that is a big step!
Again - a big thank you to the open source folks of WebKit and Mozilla. IE9 is not quite there - (progress is tracked at https://connect.microsoft.com/IE/feedback/details/636755) but Microsoft has let us know that we "can expect the video-frame-accurate seeking be available when IE9 is final"!
 To give credit where credit is due: within the BBC, Raymond Le Gué (programme director at BBC) insisted on having frame accurate playback in the browser. Rob Coenen went on and beyond his call of duty to make this happen, filing bugs - patiently working with the wider developer community, explaining what SMPTE frame counts are, why film and television production cannot live without it, proving that it was not working in browsers and helping the developers to fix it. He got help from Bas Schouten (at Mozilla), Andy Armstrong and Dirk-Willem van Gulik (both at the BBC).
But most credit should go to the open standards and open source communities around Webkit, Chrome and Mozilla which made it happen: Andrew Scherkus and the Chromium team get credit for being the first to understand the significance. The actual fixes where ultimately created by Jer Noble, Eric Carlson (both at Apple) and Chrome developer Andrew Scherkus; while Matthew Gregan and Anthony Hughes did the job for Mozilla.
Dirk-Willem van Gulik is Chief Architect, BBC Future Media & Tecnology
SMPTE timecode based and frame accurate metadata logging is now possible over the web with HTML5. The image above is a made up screen shot of what a prototype tool to do this might look like.