The WebAudio API lets us construct audio scenes that run in the browser, including sound file playback, oscillators, filters, effects, and audio spatialization.
The basic concept is creating an "audio graph", which operates similarly to a suite of audio plugins or a Max patch, in that a number of audio "Nodes" are connected together, with one Node possibly processing the audio output of another Node, etc.
Generally it involves a few steps. First we need an AudioContext as the "scene" container:
const AudioContext = window.AudioContext || window.webkitAudioContext;
const audioContext = new AudioContext();
For security reasons, a WebAudio context cannot start without at least one user interaction on the page, such as a click. Here's one way to do this:
// WebAudio requires a click to start audio:
document.body.onclick = () => {
  audioContext.resume();
};
Or, for a more involved setup, we might want a "start button" in the HTML page, and respond to it like so:
const startButton = document.getElementById('startButton');
startButton.addEventListener( 'click', function init() {
  audioContext.resume();
});
Now we can create AudioNodes of many different kinds, and connect them together.
// Create a "Gain" node, so that you can set the level of whatever audio passes through it
const outputGainNode = audioContext.createGain();
outputGainNode.gain.value = 0.5;
// connect this to the output destination of the audio context:
outputGainNode.connect(audioContext.destination);
// create a simple oscillator:
let myOscillator = audioContext.createOscillator();
myOscillator.frequency.value = 440; // Hz
// connect oscillator to our outputGainNode
myOscillator.connect(outputGainNode);
// start it playing
myOscillator.start();
// can also stop():
myOscillator.stop(audioContext.currentTime + 2); // Stop after two seconds
Note that the src.connect(dst) call can also specify which output of the source and which input of the destination to use, like this: src.connect(dst, outputIndex, inputIndex).
An output can be connected to more than one input, and vice versa: several outputs can connect to one input, in which case the signals are summed at that input. You can even patch feedback loops, including patching an output of a node back into one of its own inputs!
We can also disconnect nodes: src.disconnect() removes all of a node's outgoing connections, src.disconnect(dst) removes the connections to a specific destination, and src.disconnect(dst, outputIndex, inputIndex) removes one specific connection.
When we set the gainNode's gain or the oscillator's frequency, we are actually interacting with AudioParams. We can either set them directly (as we did above with e.g. myOscillator.frequency.value = 440), or we can schedule changes at specific times relative to audioContext.currentTime. We can also use .connect() to route one AudioNode's output into an AudioParam of another AudioNode. For example, here is a simple frequency modulation:
let carOsc = audioContext.createOscillator(); // carrier oscillator
carOsc.frequency.value = 440;
// some nodes require a .start()
carOsc.start();
// there is also a .stop(), but after that the Node cannot be used anymore.
let modOsc = audioContext.createOscillator(); // modulator oscillator
modOsc.frequency.value = 440;
modOsc.start();
let modLevel = audioContext.createGain();
modLevel.gain.value = 3000; // modulation depth, in Hz
// wire them up:
modOsc.connect(modLevel);
modLevel.connect(carOsc.frequency); // controlling an AudioParam
carOsc.connect(audioContext.destination);
AudioParams are either "a-rate", meaning they are sampled as fast as the audio signal itself (typically 44100 times per second), or "k-rate", meaning they update only once per render block of 128 samples (about every 3 ms).
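As mentioned above, rather than setting .value directly we can also schedule AudioParam changes at times relative to audioContext.currentTime. Here is a minimal sketch, assuming the outputGainNode from earlier, which fades it out over two seconds:
const now = audioContext.currentTime;
// pin the current value so the ramp has a defined starting point:
outputGainNode.gain.setValueAtTime(outputGainNode.gain.value, now);
// then ramp smoothly down to silence over the next two seconds:
outputGainNode.gain.linearRampToValueAtTime(0, now + 2);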
Here are some minimal examples: https://github.com/zenon/MinimalWebAudio
There are also AudioNodes for playing back sound files, and for applying filters, delays, waveshaping, dynamics processing, and more. We can create new synthesis routines via AudioWorkletNodes, and there is support for offline rendering, streaming audio, and deriving waveform and spectrum data from the audio for analysis.
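For example, here is a minimal sketch of loading and playing a sound file, assuming the audioContext and outputGainNode from above and a file served next to the page (the name "sound.wav" is just a placeholder):
async function playFile(url) {
  // download the file and decode it into an AudioBuffer:
  const response = await fetch(url);
  const arrayBuffer = await response.arrayBuffer();
  const buffer = await audioContext.decodeAudioData(arrayBuffer);
  // an AudioBufferSourceNode plays a buffer once; create a new one for each playback:
  const player = audioContext.createBufferSource();
  player.buffer = buffer;
  player.connect(outputGainNode);
  player.start();
}
playFile("sound.wav");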
WebAudio also supports spatializing sounds in 3D space; see https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API/Web_audio_spatialization_basics. There are many options here for refining how the distance and directivity of a sound are modeled.
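For example, here is a minimal sketch of spatializing audio with a raw WebAudio PannerNode, assuming the audioContext from above and some source node to spatialize (sourceNode here is just a placeholder name):
const panner = audioContext.createPanner();
panner.panningModel = "HRTF"; // use head-related transfer functions for binaural rendering
panner.distanceModel = "inverse"; // how level falls off with distance
panner.refDistance = 1; // distance (in meters) at which there is no attenuation
panner.positionX.value = 2; // place the source 2 meters to the listener's right
sourceNode.connect(panner);
panner.connect(audioContext.destination);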
Three.js provides built-in support to simplify the process:
const startButton = document.getElementById('startButton');
startButton.addEventListener( 'click', function init() {
  // normal Three.js setup, renderer, camera, scene, etc.
  // create an AudioListener and add it to the camera
  // (this embeds the WebAudio spatialization feature of audioContext.listener)
  const listener = new THREE.AudioListener();
  camera.add( listener );
  // create a spatialized sound node:
  const sound = new THREE.PositionalAudio(listener);
  // attach it to a mesh (or Group, or any other Object3D)
  mesh.add(sound);
  // create an oscillator for this sound:
  const oscillator = listener.context.createOscillator();
  oscillator.frequency.value = 330;
  oscillator.start();
  // attach to the sound:
  sound.setNodeSource(oscillator);
  // we can also set a volume level here:
  sound.setVolume( 0.5 );
  // we can also configure spatialization
  // e.g. here we set the "reference distance" for attenuation to be 20 meters:
  sound.setRefDistance( 20 );
});
The Three.js AudioListener represents our ears in the virtual world. Its .context property gives you the underlying WebAudio AudioContext, and its .gain is a WebAudio GainNode acting as a global level control for the scene. The PositionalAudio represents a sound localized in space. We can feed it audio via .setNodeSource(), .setMediaElementSource(), or .setBuffer(), and its .panner property gives you the WebAudio PannerNode that makes the spatialization work. Setting .autoplay = true makes it start playing as soon as a buffer is set.
See also https://threejs.org/examples/#webaudio_sandbox, https://threejs.org/examples/?q=webaudio#webaudio_orientation, https://threejs.org/examples/?q=webaudio#webaudio_timing, https://threejs.org/examples/?q=webaudio#webaudio_visualizer
Here's an example in Stackblitz: https://stackblitz.com/edit/stackblitz-starters-wk77ay?file=module.js
The built-in oscillator/effect etc. types included in WebAudio are somewhat basic. To get more interesting and complex sounds, and generative audio, we typically have to write our own DSP code for AudioWorkletNodes (in JavaScript, or compiled to WebAssembly), which is not simple!
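To give a flavor, here is a minimal sketch of an AudioWorkletNode that generates white noise; the processor lives in its own file, and the filename "noise-processor.js" and registered name "noise-processor" are just placeholders:
// noise-processor.js -- runs on the audio rendering thread:
class NoiseProcessor extends AudioWorkletProcessor {
  process(inputs, outputs) {
    for (const channel of outputs[0]) {
      for (let i = 0; i < channel.length; i++) {
        channel[i] = Math.random() * 2 - 1; // white noise
      }
    }
    return true; // keep the processor alive
  }
}
registerProcessor('noise-processor', NoiseProcessor);

// in our main script (inside an async function, or a module script):
await audioContext.audioWorklet.addModule('noise-processor.js');
const noise = new AudioWorkletNode(audioContext, 'noise-processor');
noise.connect(audioContext.destination);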
However there is a very powerful environment for creating audio synthesis routines called RNBO, which is part of Max/MSP, and it can export patches as new AudioNode "devices".
For some examples of this, see https://learningsynths.ableton.com/
First, we design the patch within a RNBO object in Max/MSP. RNBO patchers look very similar to Max patches and have most of the usual objects available -- including gen~ for low-level signal processing!
Select the RNBO object and, in the sidebar, select Export, choose the JS target, set the export path (where it will export on disk), and press Export (it will take a few seconds to compile the patch). The result is a JSON file that includes all of our synth's processing, encoded as WebAssembly.
We can now integrate this into a webpage following the template given at https://github.com/Cycling74/rnbo.example.webpage
In the HTML page, import the RNBO library:
<script src="https://cdn.cycling74.com/rnbo/1.2.6/rnbo.min.js"></script>
Make sure that our exported synth JSON files are in the same folder as our webpage. Now add this to our JavaScript code:
// this has to happen in an async function because some of these processes take time to complete:
async function setup() {
  // preload the audio JSON file:
  let patchExportURL = "patch.export.json"; // or whatever we named it
  let response = await fetch(patchExportURL);
  let patcher = await response.json();
  // create a device out of our patcher:
  let device = await RNBO.createDevice({ context: audioContext, patcher });
  // we can get the AudioNode as `device.node`
  // we can now attach this device to an audioContext as before:
  device.node.connect(audioContext.destination);
  // or for Three.js:
  sound.setNodeSource(device.node);
}
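Since setup() is async and the context must be resumed by a user gesture, one way to wire it up is to call it from the same click handler used earlier; a minimal usage sketch:
document.body.onclick = () => {
  audioContext.resume();
  setup();
};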
A minimal example using a couple of RNBO devices in a webpage: https://stackblitz.com/edit/vitejs-vite-it8piu?file=README.md
Another example: https://stackblitz.com/edit/vitejs-vite-x5zs89?file=app.js