Controlling 3D Sound Using Natural Gesture
In BBC R&D’s audio team we are investigating immersive audio formats for the future of broadcasting. A key aim for this work is to create a more realistic sense of space in audio content, allowing listeners to perceive sounds in three dimensions by adding height and depth information.
This work recently featured on the Wired website. That article introduced our recent experiments with gesture-based control to allow sound engineers to position sounds in 3D space. Using the Microsoft Kinect, we have developed a tool that allows the user to move sounds with natural arm movements.
We’ve made a short video about it.
The main focus of our team’s research is on the processing and perception of sound. However, we also need to understand how production staff work so that we can design tools that let them use new technology intuitively. To make any new broadcast technology viable for the BBC, it has to be easy to use for both production teams and audiences.
Added creative freedom often brings new challenges to production. Making programmes with 3D audio content requires different techniques and tools. You may have seen some of our previous posts about trials we’ve done capturing and reproducing 3D audio. In this post we're talking about another part of the challenge: panning.
Panning is the technique used to place a virtual sound source within a scene. Sound engineers use a pan control to set the direction that the sound comes from. Traditional mixing desks have rotary controls for panning, which you turn to rotate the sounds left and right around the listener.
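To make the idea of a pan control concrete, here is a minimal sketch of a constant-power stereo pan law, one common way a left/right control can be turned into loudspeaker gains. It is an illustration only, not a description of any particular mixing desk; the function name and the -1 to +1 control range are assumptions made for this example.

```python
import math

def constant_power_pan(pan):
    """Return (left_gain, right_gain) for a pan position.

    pan is assumed to run from -1.0 (hard left) to +1.0 (hard right).
    The gains follow a sine/cosine law so that left**2 + right**2 == 1,
    which keeps the perceived loudness roughly constant as the sound moves.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1.0 and 1.0")
    # Map the control range [-1, 1] onto an angle in [0, pi/2].
    angle = (pan + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)

# A centred sound is sent equally to both loudspeakers.
left, right = constant_power_pan(0.0)
print(round(left, 4), round(right, 4))
```

Each audio sample would be multiplied by these two gains to produce the left and right loudspeaker signals.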
Using natural gesture to interact with sounds is another approach. This is much easier than it used to be thanks to affordable consumer devices, such as the Microsoft Kinect.
You may have heard about the Philharmonic Maestro project, where we used the Kinect to create an interactive musical experience for children. We used the Kinect again in this project to allow sound engineers to place sounds in 3D space simply by pointing their arm in the desired direction. We hoped that this would be more intuitive than twiddling knobs on a desk.
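As a rough illustration of how a pointing gesture could be turned into a pan direction, the sketch below converts two tracked joint positions (shoulder and hand) into an azimuth and elevation angle. This is not the code from our prototype. The function name and the coordinate frame are assumptions made for the example: x to the user's right, y up, z straight ahead, all in metres, with azimuth measured anticlockwise from straight ahead so that positive angles are to the left.

```python
import math

def pointing_direction(shoulder, hand):
    """Return (azimuth_deg, elevation_deg) of the arm, shoulder to hand.

    shoulder and hand are (x, y, z) tuples in an assumed frame with
    x to the user's right, y up and z straight ahead.
    Azimuth is 0 straight ahead and positive to the left;
    elevation is 0 at shoulder height and positive upwards.
    """
    dx = hand[0] - shoulder[0]
    dy = hand[1] - shoulder[1]
    dz = hand[2] - shoulder[2]
    horizontal = math.hypot(dx, dz)
    if horizontal == 0.0 and dy == 0.0:
        raise ValueError("shoulder and hand are at the same position")
    azimuth = math.degrees(math.atan2(-dx, dz))
    elevation = math.degrees(math.atan2(dy, horizontal))
    return azimuth, elevation

# Arm pointing to the left and raised by the same amount.
print(pointing_direction((0.0, 1.4, 0.0), (-0.5, 1.9, 0.0)))
```

In practice the joint positions from a depth camera are noisy, so a real tool would probably smooth the angles over time before using them to move a sound.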
The software prototype was designed to give the user an understanding of the 3D sound scene that they are creating through the graphical interface (shown below). The interface is independent of the spatial audio technology used for sound reproduction. In our lab we can dynamically switch between stereophonic panning, Ambisonics and binaural rendering.
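To show why a direction can be kept separate from the rendering technology, here is a minimal sketch of one possible renderer input: encoding a mono sample into traditional first-order Ambisonic B-format (W, X, Y, Z) from an azimuth and elevation. The choice of first order, the function name and the angle convention (azimuth anticlockwise from straight ahead, elevation upwards) are assumptions for this example, not details of our lab system.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    Uses the traditional B-format weighting, where the omnidirectional
    W channel is scaled by 1/sqrt(2). X points forward, Y to the left
    and Z upwards.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)
    x = sample * math.cos(az) * math.cos(el)
    y = sample * math.sin(az) * math.cos(el)
    z = sample * math.sin(el)
    return w, x, y, z

# A sound directly to the left at ear height appears only in W and Y.
print([round(v, 4) for v in encode_first_order(1.0, 90.0, 0.0)])
```

The same azimuth and elevation could equally be handed to a stereo panner or a binaural renderer, which is what lets the interface stay the same while the reproduction method changes.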
Feedback from the sound engineers who tested the prototype has helped us to understand where gesture-based control can be useful, and what we can do to make our tools more intuitive and effective. The concept of 3D sound production is new to our engineers. By working with them as we develop these systems, we can ensure that they know what’s on the horizon and that we know what is needed in the production world.
You can read in more detail about the prototype design and testing in our recent Audio Engineering Society paper. We would like to thank Mike Smith, Paul Cargill, Steve Brooke and Dave Lee for their invaluable feedback.
P.S. This Christmas BBC R&D are working with Radio 3 on a surround sound experiment and we'd like you to participate. There are two ways that you can join in: listening over headphones or listening over speakers. To take part in the experiment click here.
You can find out more about binaural audio and why we are experimenting with it here.