The BWAV file format is being extended to add metadata that accurately describes complex multichannel formats.
Project from - present
What we're doing
This is an EBU collaborative project involving members from several European broadcasters and audio-related companies. The work on extending the BWAV file format will be incorporated in EBU recommendations (EBU Tech 3285 for BWAV and 3306 for RF64). One development was to add loudness (EBU R128) metadata to the BWAV format. The major current project area is generating a model that allows an accurate and complete description of the format of complex multichannel audio configurations, including channel-, scene- and object-based audio representations.
Why it matters
Currently, the exchange of audio files is largely limited to simple channel-based arrangements. With audio moving towards immersive, interactive and flexible models, a more sophisticated method of specifying the format of the audio in the files is required. In the past, even with a relatively simple 5.1 surround audio file, there were often discrepancies in the order of the channels. The centre channel could end up in the left surround speaker, for example, which would sound very strange.
To overcome these problems, and to allow much greater flexibility for non-channel-based audio configurations, a model is being developed that will give each track in the audio file a description of what it is. For example, we will be able to state that track 2 contains the "front right" channel, and that this corresponds to a speaker position 30 degrees to the right of centre. If we want to carry audio objects, we will be able to give them useful names and metadata describing their position in space and how they move.
With the audio file containing an accurate and rich description of the format of the channels, it then becomes possible for software to render the audio automatically and correctly. Not only will channels be correctly allocated to the speakers, but rendering for binaural, wavefield synthesis, Ambisonic decoding and other presentation formats will also be possible.
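As a rough illustration of the channel-order problem, the sketch below routes tracks to speakers by their described label rather than by their position in the file. The label strings, the two example file layouts, the speaker order and the function name are all assumptions made for this example and are not taken from any EBU document.

```python
# Hypothetical per-track labels, as they might be read from two 5.1 files
# that store the same channels in different orders.
file_a_labels = ["Left", "Right", "Centre", "LFE", "Left Surround", "Right Surround"]
file_b_labels = ["Left", "Centre", "Right", "Left Surround", "Right Surround", "LFE"]

# Order in which this (assumed) playback system expects its speaker feeds.
SPEAKER_ORDER = ["Left", "Right", "Centre", "LFE", "Left Surround", "Right Surround"]


def route_to_speakers(track_labels, speaker_order=SPEAKER_ORDER):
    """Return, for each speaker feed, the index of the file track that carries it."""
    position = {label: i for i, label in enumerate(track_labels)}
    missing = [s for s in speaker_order if s not in position]
    if missing:
        raise ValueError(f"no track described as: {missing}")
    return [position[s] for s in speaker_order]


# File A already matches the speaker order; file B needs reordering.
routing_a = route_to_speakers(file_a_labels)
routing_b = route_to_speakers(file_b_labels)
```

Because the routing is derived from the labels, both files end up with the centre channel on the centre speaker, whichever track it was stored in.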
How it works
The BWAV file format can contain several tracks of audio samples that are interleaved. A simple stereo file will contain two tracks, where track 1 is usually the left channel and track 2 the right channel. Instead of assuming a particular order of tracks and their definitions, the new model will give each track an explicit, user-defined description. This description will be written in XML that conforms to a schema which contains the audio description model. For example, the parameters for an audio channel could be "name", "azimuth", "elevation" and "distance", and these parameters will be assigned suitable values for each channel in the file (such as "Front Left", "-30", "0" and "3.0"). This XML metadata describing all the tracks in the audio file can either be carried as a chunk within the file or, for commonly used configurations, be held in a separate file. The software reading the audio file will then read the XML metadata to determine the description of each track and process the audio appropriately.
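The reading process described above can be sketched as follows. This is only a minimal illustration: the XML element and attribute names ("audioDescription", "track", "index") are invented for the example because the schema is still being designed, and the function names are hypothetical. The parameter names and the "Front Left" values come from the example in the text; the "Front Right" values are assumed by symmetry.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata; the element names are illustrative only and are
# not taken from any published schema.
EXAMPLE_XML = """
<audioDescription>
  <track index="1">
    <name>Front Left</name>
    <azimuth>-30</azimuth>
    <elevation>0</elevation>
    <distance>3.0</distance>
  </track>
  <track index="2">
    <name>Front Right</name>
    <azimuth>30</azimuth>
    <elevation>0</elevation>
    <distance>3.0</distance>
  </track>
</audioDescription>
"""


def parse_track_descriptions(xml_text):
    """Map each 1-based track index to its description parameters."""
    root = ET.fromstring(xml_text)
    tracks = {}
    for track in root.findall("track"):
        index = int(track.get("index"))
        tracks[index] = {
            "name": track.findtext("name"),
            "azimuth": float(track.findtext("azimuth")),
            "elevation": float(track.findtext("elevation")),
            "distance": float(track.findtext("distance")),
        }
    return tracks


def deinterleave(samples, num_tracks):
    """Split interleaved samples [t1, t2, t1, t2, ...] into per-track lists."""
    return {i + 1: samples[i::num_tracks] for i in range(num_tracks)}


def label_tracks(samples, xml_text):
    """Return the audio of each track keyed by its described name."""
    descriptions = parse_track_descriptions(xml_text)
    per_track = deinterleave(samples, len(descriptions))
    return {descriptions[i]["name"]: per_track[i] for i in sorted(descriptions)}


# Six interleaved samples from a two-track file.
labelled = label_tracks([1, 2, 3, 4, 5, 6], EXAMPLE_XML)
```

Here the software never assumes that track 1 is the left channel; it looks the name and position up in the metadata and only then decides how to process the samples.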
Currently, the EBU project team (FAR-BWF) are working on the audio description model, including working out which parameters are needed to describe audio tracks accurately.
When the project team complete the design of the model, the XML schema and the extension to the BWAV file format, an EBU recommendation will be generated. Audio software and hardware designers will then be encouraged to adopt the recommendation to ensure good interoperability between products. The EBU will also work closely with the AES and ITU to ensure that it becomes part of an internationally recognised standard.
This project is part of the Immersive and Interactive Content section
This project is part of the Audio Research work stream