Research & Development


Object-based audio promises format-agnostic reproduction and extensive personalization of spatial audio content. However, in practical listening scenarios, such as in consumer audio, ideal reproduction is typically not possible. To maximize the quality of listening experience, a different approach is required, for example modifications of metadata to adjust for the reproduction layout or personalization choices. This paper proposes a novel system architecture for semantically informed rendering (SIR), that combines object audio rendering with high-level processing of object metadata. In many cases, this processing uses novel, advanced metadata describing the objects to optimally adjust the audio scene to the reproduction system or listener preferences. The proposed system is evaluated with several adaptation strategies, including semantically motivated downmix to layouts with few loudspeakers, manipulation of perceptual attributes, perceptual reverberation compensation, and orchestration of mobile devices for immersive reproduction. These examples demonstrate how SIR can significantly improve the media experience and provide advanced personalization controls, for example by maintaining smooth object trajectories on systems with few loudspeakers, or providing personalized envelopment levels. An example implementation of the proposed system architecture is described and provided as an open, extensible software framework that combines object-based audio rendering and high-level processing of advanced object metadata.

This paper was originally published in the Journal of the Audio Engineering Society (vol. 67, no. 7/8), August 2019.

The paper was written in collaboration with the following authors: Andreas Franck, Dylan Menzies, and Filippo M. Fazi (University of Southampton); James Woodcock, Richard J. Hughes, and Trevor J. Cox (University of Salford); and Philip Coleman and Philip J.B. Jackson (University of Surrey). This work was supported by the EPSRC Programme Grant S3A: Future Spatial Audio for an Immersive Listener Experience at Home (EP/L000539/1).