BBC R&D

Posted by Chris Pike

Lately there has been a bit of a buzz about binaural sound. Early on Saturday morning, Rob da Bank broadcast a 3D Headphone Special on BBC Radio 1 (see the trailer below). In this show he used binaural microphones to capture and share his journey from the Isle of Wight to Broadcasting House. There was also a Maida Vale session with Lucy Rose, recorded using a dummy head microphone (known as Fritz to his friends). I thought I'd take this opportunity to share what we know about this subject and update you on some of our own work in this area.

Here's a link to the trailer for Rob da Bank's 3D Headphone Special.

So what is binaural sound? Those of you who read this blog regularly may already be familiar with the concept, as we've discussed some binaural experiments we've done at the BBC in the last couple of years. Briefly, it is a sound production technique that mimics the natural hearing cues created by our head and ears to give the impression of 3D sound when listening on headphones. By contrast, the stereo sound that we currently listen to on headphones gives the impression that sounds are all inside your head. This FAQ page from one of our previous experiments gives a bit more detail about binaural.

Binaural sound itself is not a new idea. Public performances were transmitted in some form of binaural stereo as far back as 1881. In the early 1970s dummy head microphones, like Fritz and the one pictured below, became commercially available. These are basically human manikins with microphones placed where the eardrums should be. The BBC made several pioneering radio dramas using these microphones at the time, most notably The Revenge, made in 1978. It was a 20 minute play written and performed by Andrew Sachs without a single word of dialogue. This programme is often played on BBC Radio 4 Extra, so look out for upcoming repeats. Since then there have been occasional experiments with these techniques in BBC programmes. In 2008 the Radio 4 documentary Bravo November, about a Chinook helicopter in the Falklands War, used dummy head recordings from inside the helicopter. The brilliantly imaginative interactive drama The Dark House, broadcast in 2002, used miniature microphones placed in the ears of the actors, so you could listen from each of their perspectives.

 

R&D's head and torso simulator microphone

So why aren't all BBC programmes available in binaural sound? These recording techniques can create great immersive effects when the right demonstrations are used. Rob da Bank showed some of the classic examples, such as the virtual haircut, which can be quite spooky to listen to. But often the quality of binaural recordings is not yet good enough. It is very difficult to create the impression of sounds coming from in front of your head, partly because of conflicting visual information and partly because of a mismatch between the shape of the dummy head and the shape of your own, which means that the auditory cues are not perfect. This mismatch of cues also affects the tonal quality of binaural recordings.

There are also practical issues with dummy head recording. Dummy heads are not the most portable or inconspicuous of microphones. Another issue is that binaural sound cannot be reproduced over loudspeakers without special processing techniques, which are still developing. In 2011 there was a feature on the Today programme about a scientist at Princeton University who is one of the many people working in this area. (That piece inspired Today's April Fools' joke that year, where Evan Davis tried to get the nation to put their hands in front of their faces to block out a noise signal, much to our amusement.) Thanks to Internet distribution of programmes, it is now more feasible to deliver different versions of a programme, depending on whether headphones or loudspeakers are being used. It makes sense to develop production techniques that are independent of the reproduction method.

Our recent experiments with binaural sound have used a different technique, which I'll call headphone surround. This takes surround sound material produced for loudspeaker playback and, using binaural techniques, creates virtual loudspeakers around the listener. Instead of playing each programme over loudspeakers and recording this using a dummy head, a set of measurements (known as head-related transfer functions, or HRTFs) can be made once and then used in software to create headphone surround from any 5.1 audio signal.
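To make the virtual loudspeaker idea a little more concrete, here is a minimal sketch in Python (with NumPy) of how headphone surround could be rendered: each loudspeaker feed is convolved with a pair of impulse responses, one per ear, and the results are summed. This is only an illustration, not the software we used. The loudspeaker angles, the sample rate and the `toy_hrir` function are all assumptions made for the example; `toy_hrir` is a hypothetical stand-in that models nothing more than a delay and a level difference between the ears, where a real system would load measured HRTF data.

```python
import numpy as np

FS = 48000  # sample rate in Hz (assumed)

# Nominal 5.1 loudspeaker azimuths in degrees, positive to the listener's left.
# These angles are an assumption; the LFE channel is left out for simplicity.
SPEAKER_AZIMUTHS = {"L": 30, "R": -30, "C": 0, "Ls": 110, "Rs": -110}


def toy_hrir(azimuth_deg, fs=FS, length=64):
    """Hypothetical stand-in for a measured impulse response pair.

    Models only a time delay and a level difference between the ears,
    so it cannot tell front from back. Real HRTF measurements go here.
    """
    s = np.sin(np.radians(azimuth_deg))
    delay = int(round(abs(s) * 0.00066 * fs))  # up to about 0.66 ms
    left = np.zeros(length)
    right = np.zeros(length)
    # The ear facing away from the source hears it later and quieter.
    left[delay if s < 0 else 0] = 1.0 + 0.4 * s
    right[delay if s > 0 else 0] = 1.0 - 0.4 * s
    return left, right


def render_headphone_surround(channels, hrirs):
    """Convolve each loudspeaker feed with its HRIR pair and sum per ear."""
    n = max(len(x) for x in channels.values())
    m = max(len(h[0]) for h in hrirs.values())
    out = np.zeros((n + m - 1, 2))  # columns: left ear, right ear
    for name, signal in channels.items():
        for ear, h in enumerate(hrirs[name]):
            y = np.convolve(signal, h)
            out[: len(y), ear] += y
    return out


hrirs = {name: toy_hrir(az) for name, az in SPEAKER_AZIMUTHS.items()}

# Example: a click in the front-left loudspeaker feed only.
channels = {name: np.zeros(480) for name in SPEAKER_AZIMUTHS}
channels["L"][0] = 1.0
binaural = render_headphone_surround(channels, hrirs)
```

With the click in the front-left feed, the left-ear output arrives first and louder than the right-ear output, which is the kind of cue the measured HRTFs provide in much finer detail.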

Headphone surround has the advantage that existing programmes with 5.1 audio could be listened to with headphones without losing the surround sound impression. We have previously shared a radio drama and a carol concert that were made using this technique. Radio France have just launched a new website called NouvOson, which hosts a collection of programmes that contain innovative sound. Much of the content is offered in a binaural format, also made in this way. But why stop there? The use of HRTFs allows you to create a sound source in any direction, so it is feasible to go beyond established loudspeaker formats and create a rich 3D scene. We have previously discussed the concept of object-based audio, which potentially allows any reproduction method including 3D binaural sound.

There are still many open problems with binaural sound though. I've touched on the issue of incorrect cues caused by differences in head and ear shape between the dummy head used for recording and the listener. If the HRTFs for the listener are known, then binaural sound can be created specifically for them. This is possible in a research lab now and one day may be available for everyone through projects such as LocaPhoto. Work is underway to develop a standard file format for this kind of data.

Another issue is head movement. For a convincing impression of sounds coming from outside the head, they should not move with your head but should stay fixed in place. This is only possible if a system can monitor your head movement and compensate for it dynamically. Again, this is easily done in a lab these days, but we do not yet have widely available technology to do this for BBC audiences.
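The compensation step can be sketched in a few lines of Python. The idea is that the direction used to pick an HRTF is the source direction relative to the head, so when the head turns one way the source direction is turned the other way by the same amount. The function names, the sign convention (positive angles to the left) and the list of measured directions below are assumptions for illustration; a real system would also handle elevation and would smooth the changes between HRTFs.

```python
def relative_azimuth(source_azimuth_deg, head_yaw_deg):
    """Source direction relative to the head, wrapped to [-180, 180) degrees."""
    return (source_azimuth_deg - head_yaw_deg + 180.0) % 360.0 - 180.0


def nearest_measurement(azimuth_deg, measured_azimuths):
    """Pick the measured HRTF direction closest to the requested one."""
    return min(measured_azimuths,
               key=lambda a: abs(relative_azimuth(a, azimuth_deg)))


# Hypothetical set of directions for which HRTFs have been measured.
measured = [0, 30, -30, 110, -110]

# A source straight ahead, with the head turned 90 degrees to the left,
# should now be heard from the listener's right.
rel = relative_azimuth(0, 90)
chosen = nearest_measurement(rel, measured)
```

Here the head tracker supplies `head_yaw_deg` many times a second, and the renderer swaps to the HRTF for the new relative direction each time it changes.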

There have already been a few mobile apps that have used binaural sound. Recently I worked with Neil Cullen, an MSc student at the University of Salford, to build a prototype that creates dynamic headphone surround on a mobile. It used a relatively cheap Bluetooth head tracking device that attached to the headphones. The quality and affordability of mobile technology for audio processing and head tracking will improve over the next few years, so this may be feasible in the future.

Dynamic headphone surround sound on an Android tablet

When head tracking is used, a dummy head recording is no longer adequate. In Beck's recent reinterpretation of David Bowie's Sound and Vision, which was released online last week with interactive 360° video and binaural audio, a nightmarish head with eight ears was created to give four different perspectives depending on the chosen orientation of the video. Other approaches exist using microphone arrays and applying HRTFs to the captured signals, allowing free head rotation and individual HRTFs to be used. This is a hot area of research at the moment.

Last summer we evaluated a range of headphone surround systems in a listening test. Participants listened to a range of BBC programme clips on headphones, comparing these systems to a stereo down-mix of the 5.1 soundtrack, which is what we currently hear, and graded the sound quality of each. The test included a system that used individual HRTF measurements of the listener and dynamic head tracking. The results, which will be presented at the next AES Convention, showed that there are more improvements to be made before high quality binaural broadcasting is possible.
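For reference, a stereo down-mix of a 5.1 soundtrack can be sketched as below. The coefficients are an assumption: they follow the commonly used convention of mixing the centre and surround channels into left and right at about -3 dB (a gain of roughly 0.707) and leaving out the LFE channel, and are not necessarily the ones used in our test.

```python
import numpy as np

G = np.sqrt(0.5)  # about 0.707, i.e. -3 dB (assumed down-mix gain)


def stereo_downmix(L, R, C, Ls, Rs):
    """Fold five full-range channels down to two; the LFE channel is omitted."""
    left = L + G * C + G * Ls
    right = R + G * C + G * Rs
    return np.stack([left, right], axis=-1)


# Example: a signal present only in the centre channel ends up
# equally in both stereo channels.
silence = np.zeros(4)
centre = np.ones(4)
stereo = stereo_downmix(silence, silence, centre, silence, silence)
```

In practice the result may also need scaling to avoid clipping, since the sum can exceed the level of any single input channel.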

With colleagues in the EBU we are organising a workshop on binaural audio in May, where we hope to discuss the opportunities and challenges with the broadcast community. Meanwhile we will continue to explore the potential of this technology to create immersive headphone sound for BBC programmes.
