Audio on the Web - Knobs and Waves
This is part two of a series of blog posts on how a small BBC R&D team has been rediscovering the era of Radiophonics at the BBC, its sounds and technology, through the contemporary filter of our engagement in the W3C Audio Working Group.
After a couple of weeks of exploring articles and books on the matter, we were feeling ready to marry the sounds of the 1960s with our 2012 technology. Alongside the gunshot effects generator, about which I'd written in the first post of the series, we decided we would recreate a few pieces of the Radiophonic Workshop's equipment with the new web audio standards:
- A set of tape decks - the kind you could use to put together a tape loop from a set of stems;
- A "wobbulator" -- a frequency-modulated oscillator, which today would probably be called more prosaically a "sweep generator";
- ... and a ring modulator, the kind of equipment used to create the voice effects for the Daleks and the Cybermen in the Doctor Who series.
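At its core, ring modulation simply multiplies two signals together: a carrier multiplied by an input tone produces the sum and difference of their frequencies, which is what gives the effect its metallic quality. The following is a minimal digital sketch of that idea in plain JavaScript (not the Workshop's original analogue circuit, and all names here are illustrative):

```javascript
// Ring modulation: multiply the input signal by a carrier, sample by
// sample. A tone at fm multiplied by a carrier at fc yields components
// at (fc + fm) and (fc - fm) -- the source of the "Dalek" timbre.
function ringModulate(input, carrierFreq, sampleRate) {
  return input.map((sample, n) =>
    sample * Math.sin(2 * Math.PI * carrierFreq * n / sampleRate)
  );
}

// Example: modulate a 440 Hz tone with a 30 Hz carrier.
const sampleRate = 44100;
const tone = Array.from({ length: sampleRate }, (_, n) =>
  Math.sin(2 * Math.PI * 440 * n / sampleRate)
);
const output = ringModulate(tone, 30, sampleRate);
```

A real implementation would also need to send `output` to the sound card, of course, but the entire effect really is just that one multiplication.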
Annotated interface sketches.
And then there was a meaty challenge: recreating the sound of the radiophonic equipment using the emerging APIs being developed in the W3C Audio WG.
What could possibly go wrong? We were only trying to recreate sounds from scattered fragments of knowledge of the actual hardware, building our code on experimental implementations of specifications which we knew would change on an almost daily basis. And indeed, we were almost immediately successful -- if generating a convincingly random imitation of bodily noises can be qualified as success.
BBC R&D Gunshot effects generator (detail)
To understand why our output did not quite sound as it should, we had to resort to a divide-and-conquer strategy. With so many potential points of failure, we would have struggled to figure out whether the problem was with our audio synthesis algorithm, or with our implementation of it, or with our understanding of the draft specification... not to mention that we could simply be hitting a bug in the browser implementation of the specification.
Fortunately for us, we had one well-known, solid foundation to build upon. When we started work on this project, there were two competing approaches for web audio processing: the MediaStream Processing API and the Web Audio API. The latter was particularly interesting, not only because it was more mature and full-featured than the other, but because it used the audio routing graph paradigm. This is a model where a number of blocks (nodes) are connected together to define the processing algorithm.
The audio graph was a natural fit for our work, not only because it mirrored the way most of the radiophonic-era hardware was built (blocks and wires), but also because a lot of the audio processing software since the 1980s had been working on such a model, too.
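The "blocks and wires" idea can be illustrated in a few lines of plain JavaScript. This is a toy model of the routing-graph paradigm, not the Web Audio API itself: each node pulls samples from the nodes wired into it, processes them, and passes the result on.

```javascript
// Toy audio graph: nodes connected by "wires", pulled one sample at a time.
class Node {
  constructor() { this.inputs = []; }
  connect(dest) { dest.inputs.push(this); return dest; }
  // Default behaviour: mix (sum) whatever is wired into this node.
  process() { return this.inputs.reduce((sum, node) => sum + node.process(), 0); }
}

class Oscillator extends Node {
  constructor(freq, sampleRate) {
    super();
    this.freq = freq;
    this.sampleRate = sampleRate;
    this.n = 0;  // current sample index
  }
  process() {
    return Math.sin(2 * Math.PI * this.freq * this.n++ / this.sampleRate);
  }
}

class Gain extends Node {
  constructor(gain) { super(); this.gain = gain; }
  process() { return this.gain * super.process(); }
}

// Two oscillators mixed through one gain stage -- blocks and wires.
const out = new Gain(0.5);
new Oscillator(220, 8000).connect(out);
new Oscillator(330, 8000).connect(out);
const samples = Array.from({ length: 16 }, () => out.process());
```

The Web Audio API (and Pure Data, for that matter) works on exactly this shape of model, just with far richer nodes and buffer-at-a-time processing instead of this naive sample-by-sample pull.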
And so Matt decided to use Pure Data, an open source graphical audio programming tool he knew well. His ability to use Pd to quickly prototype our audio synthesis graphs was a boon throughout the project, allowing us to refine and validate the audio processing graphs which we expected to produce the sounds of our instruments, and do so in an environment we could trust.
Making a bang with Pure Data
That was not always a walk in the park, either. We had wanted to find a project that would stretch the capabilities of the API, and stretch it did.
Matt, Chris and Pete working on the demo interfaces
Our work was about audio synthesis, as opposed to the mere audio processing that most of the API demos so far had been showcasing.
The Web Audio API already included a number of native nodes and built-in methods for much of the typical processing one would want to implement: mixing, common filters, spatialisation, and so on.
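To give a flavour of what such a filter node computes, here is a plain-JavaScript sketch of a simple one-pole lowpass. (The API's native filter node offers richer two-pole filters; this simpler cousin is shown only to illustrate the sample-by-sample arithmetic hiding behind a "filter" block.)

```javascript
// One-pole lowpass sketch: each output sample moves a fraction of the
// way from the previous output toward the current input. The fraction
// is derived from the cutoff frequency and the sample rate.
function lowpass(input, cutoffHz, sampleRate) {
  const a = 1 - Math.exp(-2 * Math.PI * cutoffHz / sampleRate);
  const out = [];
  let y = 0;
  for (const x of input) {
    y += a * (x - y);  // smooth toward the input
    out.push(y);
  }
  return out;
}

// A lowpass should pass a constant (0 Hz) signal through unchanged:
// feeding in a steady 1.0 should settle the output at 1.0.
const dc = lowpass(new Array(2000).fill(1), 100, 44100);
```

With native nodes like this provided by the browser, a web page only has to wire blocks together rather than write (and optimise) the arithmetic itself.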
In spite of (or maybe thanks to) all the frustration we had to endure, we were able to provide a lot of feedback to the Audio Working Group throughout the project, much of which quickly made its way into the specification as bug fixes or new features.
Our prototyping gave the working group feedback material in several other instances, too. In addition to the oscillator interface added for audio synthesis, we uncovered issues with the handling of multi-channel audio and problems with processing delays in some specific cases... and as I write this, the question of whether we need to standardise basic operator nodes such as add, subtract, and multiply is approaching a resolution.
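An oscillator interface is exactly what an instrument like the wobbulator needs. As a plain-JavaScript sketch of what such a sweep generator computes (all names and parameters here are illustrative, not taken from the specification), frequency modulation amounts to accumulating a phase whose per-sample increment is itself modulated:

```javascript
// Sketch of a "wobbulator": an oscillator whose frequency is swept by a
// slower modulating oscillator. Accumulating phase (rather than
// recomputing sin(2*pi*f*t) with a changing f) lets the instantaneous
// frequency change every sample without clicks or discontinuities.
function wobbulator(seconds, centreHz, depthHz, rateHz, sampleRate) {
  const numSamples = Math.round(seconds * sampleRate);
  const out = [];
  let phase = 0;
  for (let n = 0; n < numSamples; n++) {
    // Instantaneous frequency sweeps between centreHz +/- depthHz.
    const freq =
      centreHz + depthHz * Math.sin(2 * Math.PI * rateHz * n / sampleRate);
    phase += 2 * Math.PI * freq / sampleRate;
    out.push(Math.sin(phase));
  }
  return out;
}

// A tenth of a second of a 440 Hz tone wobbling +/- 100 Hz, 5 times a second.
const sweep = wobbulator(0.1, 440, 100, 5, 44100);
```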
Next step: release the demos to the world. But we want to do that right: not only give our work a "cool" factor, but also turn it into proper learning material.
In the next instalment of this series, we will look at how the demos were put together: writing less code and more documentation, and building demos others can learn from - or just play with!