Audio Synthesis in JavaScript

By Keith Peters

Yes, you can create sound in JavaScript, at least in some browsers. Supposedly it works in Chrome 14, Firefox 23, Opera 15 and Safari 6, but not IE. (Note: for this post, I’ve only tested in Chrome.) So for now, creating sound in JavaScript should be considered experimental. It is definitely fun to play with, just not for your super, awesome, must-work-in-every-browser game.

I found several sites that had built up complex libraries based on the Web Audio API. These weren’t the greatest things to learn the basics from, but I eventually was able to pare some of the code down to the bare minimum needed to create sounds in the browser. There’s also the MDN documentation, which is a great reference, but not a step-by-step tutorial for creating sounds. There are a few tutorials linked to from there as well, but none really covered what I was interested in, which is why I created this tutorial.

So, to get down to it, let’s create some noise.

AudioContext

First we need to create an AudioContext object. This is similar in concept to HTML5’s 2D graphics context for canvas. The context is the overarching object that we’ll use to create all the pieces that make up the sound. Current browsers expose it as window.AudioContext, while older WebKit-based browsers used the prefixed window.webkitAudioContext, so we take whichever one exists:

var context = new (window.AudioContext || window.webkitAudioContext)();

The AudioContext has a few properties, the most important one being destination. The destination is basically the output of the context – where the sound goes. You can think of it as your speakers.

[Figure: AudioSynthesis1]

Creating Audio Nodes

The next thing we need to know about the Web Audio API is that it is a node-based system. By this I mean that we use the AudioContext to create various nodes that are used to create and shape sounds. Nodes have inputs and outputs that we can use to hook various nodes together into different configurations.

The most direct way to create a sound is to create an oscillator node. An oscillator node has zero inputs and one output. We can hook that output to the destination of the context. We’ll also need to specify a frequency for the oscillator. For example, 440 Hz will create the musical note A. Here’s the code:

var context = new (window.AudioContext || window.webkitAudioContext)();

var osc = context.createOscillator();
osc.frequency.value = 440;
osc.connect(context.destination);
osc.start(0);

And here’s how this looks from a node view:

[Figure: AudioSynthesis2]

(Yes, the image says “frequency: 500”. My goof.)

We have an oscillator node with a frequency of 440 connected to the destination of the AudioContext. If we call start(0), we should get an annoying sound coming out of our speaker.

The oscillator node has a couple of other properties. One is type. This is the type of wave it uses to generate the sound. It defaults to “sine”. But, you can try “square”, “sawtooth” or “triangle” and see how they sound by doing this:

osc.type = "sawtooth";

There’s also a “custom” type, but that involves creating and setting a custom wave table. If you’re into that, go for it.
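To get a feel for what those four type names mean, here’s a rough sketch that computes one sample of each shape by hand. The function name waveSample is made up for this illustration and is not part of the Web Audio API, and the shapes are idealised: the browser’s real oscillators are band-limited, so their output won’t match these values exactly.

```javascript
// Idealised shapes of the four built-in oscillator types (illustration only).
// `phase` is the position within one cycle: 0 is the start, 1 is the end.
// The returned sample is always between -1 and 1.
function waveSample(type, phase) {
    var p = phase - Math.floor(phase); // wrap the phase into the range [0, 1)
    switch (type) {
        case "sine":
            return Math.sin(2 * Math.PI * p);
        case "square":
            // high for the first half of the cycle, low for the second
            return p < 0.5 ? 1 : -1;
        case "sawtooth":
            // rises steadily, then drops from +1 to -1 at mid-cycle
            return p < 0.5 ? 2 * p : 2 * (p - 1);
        case "triangle":
            // up to +1, straight down to -1, then back up to 0
            if (p < 0.25) { return 4 * p; }
            if (p < 0.75) { return 2 - 4 * p; }
            return 4 * p - 4;
        default:
            throw new Error("Unknown wave type: " + type);
    }
}
```

The sharp corners and jumps in the square and sawtooth shapes are why they sound so much buzzier than the smooth sine.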

Creating an Oscillator

Anyway, wasn’t that easy? Let’s expand on it and create another oscillator that messes with the first one.

To do this, you’ll create two new nodes: a gain node and a second, slower oscillator node. A gain node is usually used to change the volume of a sound, but we’ll be using it here to alter the frequency of the original oscillator node. A gain node has a single input and a single output, and the new oscillator’s output will be connected to its input. The oscillator’s output swings between +1 and -1, and the gain node multiplies whatever passes through it by the value of its gain property. So if the new oscillator runs at a frequency of 1 Hz (once per second) and you set the gain to 100, the gain node’s output will cycle from +100 to -100 once per second.

Now we need to hook this +/- 100 gain node’s output to the original oscillator. Remember that oscillator nodes don’t have any inputs, so we can’t connect it to the oscillator node itself. Instead, we connect it directly to the frequency property of the node. Now the gain output will shift the original frequency of 440 Hz up and down by as much as 100 Hz as it cycles. An illustration should make this clearer.

[Figure: AudioSynthesis3]

One oscillator is connected to the gain node, which is directly connected to the other oscillator’s frequency. That oscillator is connected to the destination. Here’s the code:

var context = new (window.AudioContext || window.webkitAudioContext)();

var osc = context.createOscillator();
osc.frequency.value = 440;
osc.connect(context.destination);
osc.start(0);

var gain = context.createGain();
gain.gain.value = 100;
gain.connect(osc.frequency);

var osc2 = context.createOscillator();
osc2.frequency.value = 1;
osc2.connect(gain);
osc2.start(0);

Run that and you should hear a siren-like sound.

We can also change the type of the second oscillator. Try making that a square wave:

osc2.type = "square";

[Figure: AudioSynthesis4]

Now, instead of cycling smoothly from +100 to -100, the gain’s output will be exactly +100 for half the cycle, then jump to exactly -100 for the second half. The result is more like a siren you’d hear in Europe or the UK. It also drives my dog crazy.
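If you want to see the numbers behind the two sirens, here’s a small sketch of the arithmetic. It doesn’t touch the Web Audio API at all; modulatedFrequency is a hypothetical helper that just works out what frequency the audible oscillator is being pushed to at a given moment, assuming the 440 Hz base, the gain of 100 and the 1 Hz second oscillator used above.

```javascript
// Frequency (in Hz) of the audible oscillator at time `t` (in seconds),
// for the setup above: 440 Hz base, gain of 100, 1 Hz modulating oscillator.
// `lfoType` is "sine" or "square". Illustrative maths only.
function modulatedFrequency(t, lfoType) {
    var base = 440,   // osc.frequency.value
        depth = 100,  // gain.gain.value
        lfoHz = 1;    // osc2.frequency.value
    var cycles = t * lfoHz,
        phase = cycles - Math.floor(cycles); // position within the cycle, 0..1
    var lfo = lfoType === "square"
        ? (phase < 0.5 ? 1 : -1)           // jumps between +1 and -1
        : Math.sin(2 * Math.PI * phase);   // glides between +1 and -1
    return base + depth * lfo;
}

// Sample the first second in quarter-second steps for both wave types.
[0, 0.25, 0.5, 0.75].forEach(function (t) {
    console.log(t, modulatedFrequency(t, "sine"), modulatedFrequency(t, "square"));
});
```

With the sine wave the frequency glides through every value between 340 and 540, while the square wave only ever produces those two extremes.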

Creating Interactivity

One of the simplest and most dynamic ways to capture interactivity on a computer is by simply reading the mouse position. The strategy for this next experiment is to use the mouse’s y-position to control the frequency of a single oscillator. The code for this is super simple:

window.onload = function() {

    var context = new (window.AudioContext || window.webkitAudioContext)(),
        osc = context.createOscillator(),
        h = window.innerHeight;

    osc.connect(context.destination);
    osc.start(0);

    document.addEventListener("mousemove", function(e) {
        osc.frequency.value = e.clientY / h * 1000 + 300;
    });
};

Create an AudioContext and an oscillator, get the window dimensions, connect and start the oscillator and listen for mousemove events. In the mousemove handler, e.clientY / h will be a number from 0 to 1. Multiply this by 1000 and add 300 and we’ll have a frequency from 300 to 1300. This gets assigned to the oscillator’s frequency value. Move your mouse around the screen and get different pitches. Simple.
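The mapping in that handler is easy to pull out into its own function if you want to experiment with other ranges. This is just a sketch: positionToFrequency is a name I’ve made up, the defaults are the 300 and 1000 from the handler, and the clamping is an extra that the handler above doesn’t do.

```javascript
// Maps a pointer coordinate to a frequency in Hz.
// `position` is e.clientY (or e.clientX) and `size` is the matching window
// dimension. `min` and `range` default to the 300 and 1000 used above.
function positionToFrequency(position, size, min, range) {
    if (min === undefined) { min = 300; }
    if (range === undefined) { range = 1000; }
    // Clamp to 0..1 so a stray coordinate can't leave the intended range.
    var fraction = Math.min(Math.max(position / size, 0), 1);
    return fraction * range + min;
}

// Inside the mousemove handler this would be used as:
// osc.frequency.value = positionToFrequency(e.clientY, h);
```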

Remember, the above has only been tested in the latest version of Chrome at the time of this writing. Other configurations may work, but some may require changes.

Now what about the x-axis? In our earlier example, we had two oscillators going. Let’s try to hook the mouse’s x-position to that second oscillator.

window.onload = function() {

    var context = new (window.AudioContext || window.webkitAudioContext)(),
        osc = context.createOscillator(),
        osc2 = context.createOscillator(),
        gain = context.createGain(),
        w = window.innerWidth,
        h = window.innerHeight;

    osc.frequency.value = 400;

    osc.connect(context.destination);
    osc.start(0);

    gain.gain.value = 100;
    gain.connect(osc.frequency);

    osc2.frequency.value = 5;
    osc2.connect(gain);
    osc2.start(0);

    document.addEventListener("mousemove", function(e) {
        osc.frequency.value = e.clientY / h * 1000 + 200;
        osc2.frequency.value = e.clientX / w * 30 + 5;
    });
};

This is much the same as the final code from earlier, but now you’re using the mouse x and y positions to control the frequencies of both oscillators. Move your mouse all over the screen now and you’ll get all kinds of “science-fictiony” sounds. I picture a 1950s flying saucer taking off, or maybe an alien ray gun. Mess with the frequency ranges of both oscillators and try changing the oscillator types for both – mix and match square, sawtooth and triangle waves for all kinds of interesting results.

Creating Music

First of all, I am not a musician by any stretch of the imagination – that fact will become obvious all too soon – but if you’re going to make sound with code, you wind up either making sound effects or music. So let’s start with music. My goal here was to create a simple “tracker” application. You program in notes, it plays those notes back in progression to form a song – of sorts.

I’m going to keep the song very simple: “Mary Had a Little Lamb.” Every note in it can be played with one beat, so we don’t have to mess around with different note lengths. Here’s a simple version of the song transcribed into notes and rests:

g f e f
g g g -
f f f -
g b b -
g f e f
g g g g
f f g f
e - - -

A quick search on the Internet gives you the frequency values for the notes b, e, f and g, which are the only ones we’ll need for this song. Let’s code that into an object:

var scale = {
    g: 392,
    f: 349.23,
    e: 329.63,
    b: 493.88
};
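If you’d rather not look the numbers up, they can also be calculated. In the standard twelve-tone equal temperament tuning, each semitone step multiplies the frequency by the twelfth root of 2, starting from the A at 440 Hz. Here’s a sketch of that; noteFrequency, round2 and computedScale are names made up for this example.

```javascript
// Frequency in Hz of a note in twelve-tone equal temperament, given as a
// number of semitones above (positive) or below (negative) the A at 440 Hz.
function noteFrequency(semitonesFromA) {
    return 440 * Math.pow(2, semitonesFromA / 12);
}

// Round to two decimal places, like the values in the table above.
function round2(x) {
    return Math.round(x * 100) / 100;
}

// The four notes used in the song, counted in semitones from A.
var computedScale = {
    e: round2(noteFrequency(-5)),
    f: round2(noteFrequency(-4)),
    g: round2(noteFrequency(-2)),
    b: round2(noteFrequency(2))
};

console.log(computedScale);
```

Adding more notes for your own songs is then just a matter of counting semitones up or down from A.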

And then we can just code the song as a string.

var song = "gfefggg-fff-gbb-gfefggggffgfe---";

Now we can create an AudioContext and an oscillator and set an interval that runs at a certain speed. In the interval callback, we get the next note, find its frequency and set the oscillator’s frequency to that value. Like this:

window.onload = function() {

    var audio = new (window.AudioContext || window.webkitAudioContext)(),
        osc = audio.createOscillator(),
        position = 0,
        scale = {
            g: 392,
            f: 349.23,
            e: 329.63,
            b: 493.88
        },
        song = "gfefggg-fff-gbb-gfefggggffgfe---";

    osc.connect(audio.destination);
    osc.start(0);

    setInterval(play, 1000 / 4);

    function play() {
        var note = song.charAt(position),
            freq = scale[note];
        position += 1;
        if(position >= song.length) {
            position = 0;
        }
        if(freq) {
            osc.frequency.value = freq;
        }
    }
};

Now this actually works and you should be able to recognize the melody somewhat. But it leaves a lot to be desired. The biggest thing is that there is no separation between notes. You have a single oscillator running and you’re just changing its frequency. This ends up creating a sort of slide between notes rather than distinct notes. And when there’s a rest, well, there is no rest. It just keeps playing the last note.

Articulation Gaps and Rests

There are various ways to try to handle this. One would be to call stop() on the oscillator, then change its frequency, then call start() again. But, when you read the documentation, it turns out that these are one-time operations on an oscillator. Once we call stop(), it’s done. That particular oscillator cannot be restarted. So what to do?

The suggested answer is actually to create a new oscillator for each note. Initially, this sounds like a horrible idea. Create and destroy a new object for every single note in the song??? Well, it turns out that it’s not so bad. There are some frameworks that create a sort of object pool of notes in the background and reuse them. But the downside to that is that every note you create and start continues playing even if you can’t hear it. It’s your choice, and I suppose you could do all sorts of profiling to see which is more performant. But for “Mary Had a Little Lamb,” I think we’re safe to create a new oscillator each time.

To do this, make a new function called createOscillator. This will create an oscillator, specify its frequency and start it. After a given time, it will stop and disconnect that oscillator. We can then get rid of the main osc variable in the code and call the createOscillator function when we want to play a note.

window.onload = function() {

    var audio = new (window.AudioContext || window.webkitAudioContext)(),
        position = 0,
        scale = {
            g: 392,
            f: 349.23,
            e: 329.63,
            b: 493.88
        },
        song = "gfefggg-fff-gbb-gfefggggffgfe---";

    setInterval(play, 1000 / 4);

    function createOscillator(freq) {
        var osc = audio.createOscillator();

        osc.frequency.value = freq;
        osc.type = "square";
        osc.connect(audio.destination);
        osc.start(0);

        setTimeout(function() {
            osc.stop(0);
            osc.disconnect(audio.destination);
        }, 1000 / 4);
    }

    function play() {
        var note = song.charAt(position),
            freq = scale[note];
        position += 1;
        if(position >= song.length) {
            position = 0;
        }
        if(freq) {
            createOscillator(freq);
        }
    }
};

This sounds better already. Each note is distinct, and when there is a rest, no note plays. But we can do even better.
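One thing worth knowing is that setInterval and setTimeout are not precise timers, and browsers throttle them in background tabs. Both start() and stop() accept a time on the AudioContext’s own clock, so an alternative is to schedule every note up front. The following is only a sketch of that idea, not what the code above does: scheduleSong is a hypothetical function, it plays the song once rather than looping, and the 0.9 that leaves a small gap between notes is an arbitrary choice.

```javascript
// Sketch: schedule every note of `song` ahead of time on the audio clock.
// `audio` is an AudioContext, `scale` maps note letters to frequencies and
// `beat` is the length of one step in seconds.
// Returns the number of notes (not rests) that were scheduled.
function scheduleSong(audio, song, scale, beat) {
    var startTime = audio.currentTime,
        count = 0;

    for (var i = 0; i < song.length; i++) {
        var freq = scale[song.charAt(i)];
        if (!freq) {
            continue; // "-" (or any letter missing from scale) is a rest
        }
        var osc = audio.createOscillator();
        osc.type = "square";
        osc.frequency.value = freq;
        osc.connect(audio.destination);
        osc.start(startTime + i * beat);
        // Stop slightly before the next step so the notes stay distinct.
        osc.stop(startTime + (i + 0.9) * beat);
        count += 1;
    }
    return count;
}

// In the browser, with the audio, song and scale variables from above:
// scheduleSong(audio, song, scale, 0.25);
```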

Sound Envelopes

The notes are just flat tones at this point. You can improve this by giving them a quick and dirty sound envelope. In real life, a sound doesn’t usually just start at full volume and cut out to silence when its time is up. The volume usually ramps up a bit first (the attack) and fades out quickly or slowly (the decay).

Actually, sound envelopes can be much more complex than this, but a simple attack and decay to 0 will do fine for now and give the notes a much better sound.

As a sound envelope controls the volume of a sound, we’ll need to create a gain node. Do this inside the createOscillator function:

function createOscillator(freq) {
    var attack = 10,
        decay = 250,
        gain = audio.createGain(),
        osc = audio.createOscillator();
...

The attack and decay values there are in milliseconds. This means that the sound will ramp up from 0 to full volume in 0.01 seconds and then fall back to 0 by 0.25 seconds after the note starts.

Next, the gain node will need to go between the oscillator’s output and the destination in order to control the volume of that oscillator.

gain.connect(audio.destination);
gain.gain.setValueAtTime(0, audio.currentTime);
gain.gain.linearRampToValueAtTime(1, audio.currentTime + attack / 1000);
gain.gain.linearRampToValueAtTime(0, audio.currentTime + decay / 1000);
...

First we set the gain to 0, which in essence sets the volume to 0. We do this with the setValueAtTime method, passing in the current time from the AudioContext object. In other words, set the volume to 0, now.

Then we use linearRampToValueAtTime to set the attack and decay. This says go from whatever value you are currently at and interpolate to this new value so that you arrive there at the specified time. Note that the values you pass in here are in seconds, so we’ll need to divide by 1000 when using millisecond values.
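To make the shape of that envelope concrete, here’s a sketch that works out the gain by hand at any point in the note. It’s plain arithmetic rather than Web Audio code, and envelopeAt is a made-up name. It assumes the two ramps described above: up to 1 by the end of the attack, then down to 0 by the decay time, both measured from the start of the note.

```javascript
// Gain value `t` milliseconds after the note starts, for a linear envelope
// that rises from 0 to 1 over `attack` ms and then falls back to 0,
// reaching it `decay` ms after the start of the note.
function envelopeAt(t, attack, decay) {
    if (t <= 0 || t >= decay) {
        return 0; // before the note starts or after it has faded out
    }
    if (t < attack) {
        return t / attack; // attack: ramping up towards full volume
    }
    return 1 - (t - attack) / (decay - attack); // decay: ramping back down
}

// Sample the envelope with the values used above (attack 10, decay 250).
[0, 5, 10, 130, 250].forEach(function (t) {
    console.log(t + " ms: " + envelopeAt(t, 10, 250));
});
```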

Finally, we connect the oscillator to the gain, set the frequency and start it. And of course, when we clean up, we’ll need to disconnect everything as well.

    osc.frequency.value = freq;
    osc.type = "square";
    osc.connect(gain);
    osc.start(0);

    setTimeout(function() {
        osc.stop(0);
        osc.disconnect(gain);
        gain.disconnect(audio.destination);
    }, decay);
}

The final code is below:

window.onload = function() {

    var audio = new (window.AudioContext || window.webkitAudioContext)(),
        position = 0,
        scale = {
            g: 392,
            f: 349.23,
            e: 329.63,
            b: 493.88
        },
        song = "gfefggg-fff-gbb-gfefggggffgfe---";

    setInterval(play, 1000 / 4);

    function createOscillator(freq) {
        var attack = 10,
            decay = 250,
            gain = audio.createGain(),
            osc = audio.createOscillator();

        gain.connect(audio.destination);
        gain.gain.setValueAtTime(0, audio.currentTime);
        gain.gain.linearRampToValueAtTime(1, audio.currentTime + attack / 1000);
        gain.gain.linearRampToValueAtTime(0, audio.currentTime + decay / 1000);

        osc.frequency.value = freq;
        osc.type = "square";
        osc.connect(gain);
        osc.start(0);

        setTimeout(function() {
            osc.stop(0);
            osc.disconnect(gain);
            gain.disconnect(audio.destination);
        }, decay);
    }

    function play() {
        var note = song.charAt(position),
            freq = scale[note];
        position += 1;
        if(position >= song.length) {
            position = 0;
        }
        if(freq) {
            createOscillator(freq);
        }
    }
};

Now the notes have a more bell-like sound. It’s not awesome, but it’s a whole lot better. Mess around with trying to create different envelopes to see how that changes the sound. Code in your own song strings too. Don’t forget to add any additional note frequencies that you might use. You might even want to do something with allowing more than one-beat notes, though that starts to get a bit more complex.
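As a starting point for longer notes, here’s one possible approach, sketched under my own assumptions. The notation is invented for this example and isn’t used anywhere in the code above: a note letter (or a “-” for a rest) can be followed by a number giving its length in beats, so “g2” is a two-beat g. The hypothetical parseSong function only turns such a string into a list of steps. Playing them back is left to you.

```javascript
// Hypothetical notation (not used in the code above): a note letter a-g,
// or "-" for a rest, optionally followed by digits giving its length in
// beats. "g2f-e3" means: g for 2 beats, f for 1, a 1-beat rest, e for 3.
// Returns an array of { note, beats } objects; other characters are skipped.
function parseSong(song) {
    var steps = [],
        pattern = /([a-g-])(\d*)/g,
        match;

    while ((match = pattern.exec(song)) !== null) {
        steps.push({
            note: match[1],
            // No digits after the letter means a single beat.
            beats: match[2] ? parseInt(match[2], 10) : 1
        });
    }
    return steps;
}

console.log(parseSong("g2f-e3"));
```

Each step’s beats value could then be multiplied into the decay time and into the delay before the next note.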

Where To Go From Here

The examples here have been pretty rough and dirty code. A whole lot could be improved upon, but it’s good for demonstration purposes and hopefully gives you a better understanding of the underlying fundamentals of creating sound with JavaScript than a more complex structure would. Have fun with it.

This article was originally published as a series at https://www.bit-101.com/blog/?p=3875, https://www.bit-101.com/blog/?p=3893 and https://www.bit-101.com/blog/?p=3896


13 thoughts on “Audio Synthesis in JavaScript”

  1. If anyone is interested in a more robust example of playing music in the browser, you can checkout Band.js – https://github.com/meenie/band.js

    Brian did an article about it on flippinawesome.org a little while ago, which can be found here: flippinawesome.org/2013/09/09/retro-game-music-using-web-audio-and-band-js

  2. Funny detail, if you run the final code in Chrome and switch to another tab, the song will continue to be played in slow motion

  3. Awesome write-up, thanks! Also learned a lot about synthesizers. I noticed that you kept this to Chrome, but actually the only change you need to make is to get rid of the word webkit from window.webkitAudioContext(), so just call it as window.AudioContext() and it works with Firefox at least, and probably some other browsers too.

  4. Thanks for this!
    I tested in firefox 37, and will not work until you change the line
    new window.webkitAudioContext();
    into:
    new window.AudioContext();
    Which works in Chrome as well.
    pd.Im afraid of putting my email. find me in cl.linkedin.com/in/joaldunate

  5. This is so awesome. Thank you for this. All I want to do is play the first 8 bars or so of some classic songs as a theme motif, and this is perfect. As a side note, you say this article is over two years old (now four), but I think it’s still very relevant, and I have only two updates to make it fully 2017 compliant. Like Joaquin said below:

    new window.webkitAudioContext();
    becomes
    new window.AudioContext();

    And
    song = "gfefgg-fff-gbb-gfefggggffgfe---";
    becomes
    song = "gfefggg-fff-gbb-gfefggggffgfe---";

    Other than that I have tried it all in Chrome with perfect success. There was just a missing ‘g’ (the first “lamb” in the song). Again, thanks, this is awesome, love your voice in the writing and how easy this is to follow.
