Crafting musical instruments with the Web Audio API

It’s really amazing how much functionality is built into browser platforms these days. WebGL, Web Push Notifications, WebRTC, View Transitions, and everything else enabling Progressive Web Apps come to mind. The Lynx browser could never.

But the topic of this post is the Web Audio API, which provides primitives for building digital audio applications entirely within the browser! Some folks, such as fellow Brooklynite Yotam Mann (see Tone.js) and Tero Parviainen, have gone much, much deeper on the subject. The browser APIs tend to be intentionally low-level and verbose so that folks like us can build useful higher-level abstractions over them for various use cases.

For example, this React hook I wrote below is quite nice for getting a feel for the API in a modern JS framework setting. The interplay between the native code and the JS code takes some getting used to as there is some amount of state being maintained in both places. The JS layer is responsible for the configuration of the audio, but the actual adjustments to the audio happen on the native threads. This means that adjustments to the configuration will not cause a re-render of the component using the hook! It’s sort of like working with uncontrolled form inputs.

Each usage of useTone() creates an oscillator node and a gain node, which together can be turned on and off via .play() and .stop(). The oscillator doesn’t actually disconnect or unmount when stop is called. Instead, the volume (“gain”) is just set to 0 while the oscillator continues to run in the background, but this is an implementation detail that is abstracted away from the caller of the hook.

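import { useContext, useEffect, useRef } from "react";

// AppAudioContext is a React context defined elsewhere in the app;
// it is expected to provide { ctx: AudioContext }.

interface Tone {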
  freq: number;
  volume: number;

  // play basically just turns up the volume
  play: () => void;
  // stop sets the volume to zero
  stop: () => void;

  // advanced users can dig into the audio primitives
  oscillatorNode: OscillatorNode;
  gainNode: GainNode;
}

export interface UseToneOptions {
  // frequency in hertz between 1 and 24000
  freq: number;
  // gain value between 0 and 1
  volume: number;
}

const defaultOptions: UseToneOptions = {
  freq: 440,
  volume: 1,
};

function useTone(options: Partial<UseToneOptions>): Tone {
  const { ctx } = useContext(AppAudioContext);
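  // create the nodes once; a bare useRef(ctx.createOscillator()) would
  // build (and discard) a new node on every render
  const oscRef = useRef<OscillatorNode | null>(null);
  if (oscRef.current === null) oscRef.current = ctx.createOscillator();
  const osc = oscRef.current;
  const gainRef = useRef<GainNode | null>(null);
  if (gainRef.current === null) gainRef.current = ctx.createGain();
  const gain = gainRef.current;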

  // derive final configuration
  const opts: UseToneOptions = {
    ...defaultOptions,
    ...options,
  };

  useEffect(() => {
    osc.connect(gain);
    gain.connect(ctx.destination);

    return () => {
      try {
        // undo exactly the connections made above: osc -> gain -> destination
        osc.disconnect(gain);
        gain.disconnect(ctx.destination);
      } catch (e) {}
    };
  }, []);

  useEffect(() => {
    osc.frequency.setValueAtTime(opts.freq, ctx.currentTime);
  }, [opts.freq]);

  useEffect(() => {
    gain.gain.setValueAtTime(opts.volume, ctx.currentTime);
  }, [opts.volume]);

  const play = () => {
    try {
      osc.start();
    } catch (e) {}
    gain.gain.setValueAtTime(opts.volume, ctx.currentTime);
  };

  const stop = () => {
    gain.gain.setValueAtTime(0, ctx.currentTime);
  };

  return {
    freq: osc.frequency.value,
    volume: gain.gain.value,
    play,
    stop,

    oscillatorNode: osc,
    gainNode: gain,
  };
}

export { useTone };

Next are a few examples of how I used this hook to build a small ensemble of instruments! Unfortunately, they do not work well on mobile (yet).

First, you need to click this big button to allow me to play audio because apparently that is a thing.
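For the curious, the reason is that browsers generally keep an AudioContext suspended until the user interacts with the page. Below is a minimal sketch of what a button like that might do. The `unlockAudio` helper and the `ResumableContext` interface are names made up for this illustration, not part of the Web Audio API; only `state` and `resume()` are real AudioContext members.

```typescript
// Minimal slice of AudioContext that this sketch relies on (hypothetical type).
interface ResumableContext {
  state: string; // e.g. "suspended", "running", or "closed" on a real AudioContext
  resume(): Promise<void>;
}

// Call this from a click or tap handler.
// Resolves to true once the context reports that it is running.
async function unlockAudio(ctx: ResumableContext): Promise<boolean> {
  if (ctx.state === "suspended") {
    await ctx.resume();
  }
  return ctx.state === "running";
}
```

In a page this would probably be wired up with something like `button.addEventListener("click", () => unlockAudio(ctx))`, where `ctx` is the shared AudioContext.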

Keyboard

This is a simple keyboard with a couple of options. Its only dimension is pitch, but you can change the range of keys as well as the first note. It uses a typical equal temperament. Double click on a key to pin its tone.
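Here is a rough sketch of how the key frequencies could be derived. It assumes twelve-tone equal temperament tuned to A4 = 440 Hz and uses MIDI note numbers to identify keys; the function names are made up for this example and the actual keyboard may compute things differently.

```typescript
// Assumption: twelve-tone equal temperament with A4 (MIDI note 69) at 440 Hz.
const A4_HZ = 440;
const A4_MIDI = 69;

// Frequency in hertz of a MIDI note number (60 = middle C, 69 = A4).
// Each semitone multiplies the frequency by the twelfth root of two.
function midiToFreq(midi: number): number {
  return A4_HZ * Math.pow(2, (midi - A4_MIDI) / 12);
}

// Frequencies for `count` consecutive keys starting at `firstMidi`.
function keyboardFreqs(firstMidi: number, count: number): number[] {
  return Array.from({ length: count }, (_, i) => midiToFreq(firstMidi + i));
}
```

Each value from `keyboardFreqs` could then be passed as the `freq` option of one `useTone()` call per key.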

Trombone

Like the keyboard, this “trombone” only has a pitch dimension. However, it provides an infinite number of steps between each scalar tone. Because of this feature, the trombone is regarded as one of the closest instruments to the human voice. Each bar represents a partial range based on the harmonic series. When a trombone player tightens their embouchure, they jump up a partial. And of course you can move down the X-axis of each partial as if you were extending the slide of a trombone. Double click on a bar to pin its tone. You can actually pin multiple tones to play chords which I realize is not how a trombone works, but it’s more fun this way.
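The partial-plus-slide idea can be sketched as a single function. This is an approximation under stated assumptions: a B-flat tenor trombone with a fundamental near 58.27 Hz, ideal integer harmonics, and a slide that lowers the pitch by up to six semitones when fully extended. The names and the normalized `slide` parameter are invented for this example.

```typescript
// Assumption: B-flat tenor trombone; first partial with the slide fully in
// is B-flat 1, roughly 58.27 Hz.
const FUNDAMENTAL_HZ = 58.27;
// Assumption: full slide extension (7th position) lowers pitch by 6 semitones.
const MAX_SLIDE_SEMITONES = 6;

// partial: 1, 2, 3, ... (position in the harmonic series)
// slide:   0 (fully in) to 1 (fully extended), continuous
function tromboneFreq(partial: number, slide: number): number {
  const clamped = Math.min(1, Math.max(0, slide));
  // Pitch of this partial with the slide closed: an integer multiple
  // of the fundamental.
  const open = FUNDAMENTAL_HZ * partial;
  // Extending the slide bends the pitch down continuously.
  return open * Math.pow(2, -(clamped * MAX_SLIDE_SEMITONES) / 12);
}
```

Because `slide` is continuous, there is no fixed step size between tones, which is what gives the glissando effect.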

Theremin

Like the trombone, this “theremin” allows for an infinite number of steps between tones. However, it also allows for adjustments in volume via the Y-axis. A player would use two hands to control these two dimensions: one for pitch and one for volume. The example below simulates this in a 2D space. Double click anywhere to pin the tone.
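One way to map a pointer position to a tone is sketched below. The frequency range, the choice of an exponential pitch scale, and the convention that the top of the pad is loudest are all assumptions made for this example.

```typescript
interface ThereminSettings {
  freq: number; // hertz
  volume: number; // gain between 0 and 1
}

// Assumed playable range (four octaves).
const MIN_HZ = 110;
const MAX_HZ = 1760;

function clamp01(v: number): number {
  return Math.min(1, Math.max(0, v));
}

// x and y are pointer coordinates normalized to [0, 1];
// y = 0 is the top of the pad.
function thereminAt(x: number, y: number): ThereminSettings {
  // Interpolate exponentially so that equal distances along the X-axis
  // correspond to equal musical intervals.
  const freq = MIN_HZ * Math.pow(MAX_HZ / MIN_HZ, clamp01(x));
  // Top of the pad is full volume, bottom is silent.
  const volume = 1 - clamp01(y);
  return { freq, volume };
}
```

The resulting `freq` and `volume` line up with the two options that `useTone()` accepts.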

Conclusion

That’s all I’ve got! I hope you had a fun jam session and didn’t torture your ears too much.

Plenty of next-up opportunities here including:

Some kind of percussion
Ability to record and playback
Ability to feed programmed songs into the instruments
Visualizers of the frequency and/or gain
Re-implement the visuals in WebGL or SVG instead of divs (lol)

Unfortunately, listening to raw oscillator tones like these is quite harsh and totally unnatural for our ears. I actually had to train my ears early on to get used to the overtones and timbre, or lack thereof. At first, I even struggled to reliably pick out octaves, which was driving me mad as I implemented the keyboard. My music theorist roommate tells me this is an active area of research, so who knows? Maybe this API will prove to be useful after all!