Saturday, 11 May 2013

æ - ʊ - ʌ - i - a - ɜ

Formant Synthesis

I'm currently working on quite a large project which brings together the ideas from the two previous posts, so I thought this would serve as an interesting sound design interlude until I get that finished.
Formant: (Acoustic Phonetics) One of the regions of concentration of energy, prominent on a sound spectrogram, that collectively constitute the frequency spectrum of a speech sound. The relative positioning of the first and second formants, whether periodic or aperiodic, as of the o of hope at approximately 500 and 900 cycles per second, is usually sufficient to distinguish a sound from all others.

That definition of formant points to some interesting information about how we listen and communicate, and it is potentially very useful for the purposes of sound design: different vocal sounds are constructed from combinations of formants at different frequencies. This is one of the key factors which allows us to distinguish between different vocal sounds and words, an ability which has developed organically alongside our capacity to communicate. Whilst speech synthesis is the most obvious application, a potentially interesting task for a sound designer is creating vocalisations for fictional creatures (for example in the fantasy or science fiction genres), and it can be assumed that if a creature is organic and has developed as we have, then similar rules will apply.

Here is some interesting reading on constructed language (conlang) and creature sound design courtesy of Darren Blondin.

Formant Synthesiser

So here is a device for formant synthesis built in Max/MSP. This is heavily based on a patch from Andy Farnell's excellent book Designing Sound, so all credit for the basic design goes to him. If you're interested in real-time synthesis of non-musical sounds there is (to my knowledge) no better book. There is an introductory chapter available as a free PDF, which also makes a great introduction to Pd. Even if you intend to do all your patching in Max, the ideas and patches from the book are easily transferable.

Farnell's example patch "Schwa box" is built in Pd (as are all those in the book), so I've adapted the patch to work in Max and added a few extra features, such as adjustable pitch and vibrato. It currently uses a basic synth patch as its sound source, but could easily be adapted to use audio recordings. This would then make it possible to add human vowel-like resonances to other sounds.

Download the patch here: 

As ever, you will need either a full version of Max/MSP 6 or the Max/MSP 6 runtime. Both are available from Cycling '74 here. The runtime version of Max allows you to run patches but not edit them.

The speech formants are modelled with [reson~], the resonant bandpass filter which is one of the standard MSP objects. As soon as you load the patch it should start making sound, and you can see the frequencies of each formant as it cycles through the vowel sounds.
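For readers without Max to hand, the idea behind a bank of resonant bandpass filters can be sketched outside the patch. The Python below is my own simplified stand-in, not how [reson~] is implemented internally: it runs a standard two-pole bandpass biquad over a buzzy pulse source, using the 500 and 900 Hz "o of hope" formants quoted in the definition above (the Q value of 10 is an arbitrary illustrative choice).

```python
import math

def formant_filter(samples, f0, q, fs=44100):
    """Two-pole resonant bandpass biquad centred on f0 - analogous in
    spirit to [reson~], though not its internal implementation."""
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b0, b2 = alpha, -alpha                       # bandpass numerator (b1 = 0)
    a0, a1, a2 = 1 + alpha, -2 * math.cos(w0), 1 - alpha
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

# A buzzy pulse train as a rough glottal source (~220 Hz at 44.1 kHz)
fs = 44100
source = [1.0 if n % 200 == 0 else 0.0 for n in range(fs // 10)]

# One resonator per formant, both fed from the same source
f1 = formant_filter(source, 500, 10, fs)
f2 = formant_filter(source, 900, 10, fs)
vowel = [a + b for a, b in zip(f1, f2)]
```

Summing the two filter outputs in parallel mirrors the structure of the patch: one resonator per formant, all driven by the same excitation signal.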

Below is a chart detailing the individual frequencies. Note how they are not at harmonic intervals and do not have any regular spacing.

(Compiled by Tim Carmell, spectral database at the Center for Spoken Language Understanding, Oregon Graduate Institute)

These frequencies are defined by our anatomy, specifically the size and shape of the human supralaryngeal vocal tract (SVT). As we speak, the SVT continually changes shape to create the different formants needed for speech, producing a frequency pattern which changes over time. In the patch, we are using bandpass filters to model the resonant characteristics of this space.

Interestingly, the specific anatomical traits necessary for human speech did not develop until the Paleolithic period (around 50,000 years ago), so both Neanderthals and earlier humans were physically incapable of what we consider human speech. We do not develop the ideal SVT dimensions until around 6-8 years of age, as the mouth shortens, the tongue changes shape and the neck lengthens over this time.

If we need to be scientifically accurate with our approach to designing creature sounds, we first need to ask some questions about the creature: 

  • Is it intelligent enough to speak?
  • Does it live in social groups, and therefore have a need for speech?
  • How will the anatomy of the creature affect the sounds it creates?
  • What is its native habitat, how will this affect its vocalisations?

So the formant synthesiser in this post addresses point three on that list, and covers part of a setup which could be used for creating creature vocalisations. There is also room for expanding the system; by changing the list of resonant frequencies this could model larger or smaller creatures (lower frequencies for larger creatures). 
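As a rough sketch of that expansion: the resonances of an acoustic tube vary inversely with its length, so rescaling a formant list for creature size is a one-line operation. The Python below is illustrative only - the function name and the size_factor parameter are my own, not part of the patch.

```python
def scale_formants(formants_hz, size_factor):
    """Scale a formant list for creature size. Tube resonances vary
    inversely with tube length, so a creature at twice our scale
    (size_factor = 2.0) gets formants at half the frequency."""
    return [f / size_factor for f in formants_hz]

human_o = [500, 900]                      # the 'o of hope' formants quoted earlier
giant_o = scale_formants(human_o, 2.0)    # a creature twice our scale
tiny_o = scale_formants(human_o, 0.5)     # a creature half our scale
```

Feeding the scaled list into the filter bank in place of the human values is all the patch would need.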

On a side note, some VST users amongst you may have already encountered formant synthesis in what must be the most conceptually important plug-in ever created, the Delay Lama:

It doesn't get any better than that.


Friday, 12 April 2013

Tapehead - Sample Playback for Sound Designers

For this first post, we're going to look at a simple device which utilises the sample playback capabilities of Max/MSP - essentially we are making a playback device. It's fairly basic, but can be expanded in the future into a more complex system (more on that later). I'm not going to cover how to re-create this device step by step, so this post assumes that you have a basic competence with both Max and MSP. If you've gone through some of the tutorials or spent a bit of time noodling around with Max, you should feel at home here.

In part, this was inspired by a story which stuck in my head about Frank Warner, sound editor on Raging Bull, Close Encounters and a whole host of other great films. Here is a section from it, part of an interview with Walter Murch in the book Soundscape: The School of Sound Lectures:

'He [Frank Warner] would take one of the reel-to-reel tapes from his huge library and put it on almost at random and move it with his fingers. He'd just move it manually, at various speeds, backwards and forwards across the playback head. Meanwhile, he'd have the recorder turned on, capturing all these random noises.... But he was in the darkness, letting the breeze blow through his brain, waiting for the kernel of an idea that would emerge, fragments of unique sounds on which he could build everything else'

(Murch, 1998)

Being able to play sounds back at different rates is one of the oldest, but still one of the most useful techniques for creative sound design. This device is designed to facilitate simple pitch manipulation in a way that is playful and experimental, embracing a bit of randomness and the unexpected along the way. The idea is to load up a recording, and experiment with just playing back the sample data at different rates and in different directions. There is no musical tuning, no measurements in semitones and cents, just the waveform, playback time and the playback pattern over that time. 
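The underlying idea - that for a tape head, playback rate and pitch are one and the same - can be sketched in a few lines of Python. This is a simplified stand-in of my own, not the Max patch itself: it reads through a buffer at an arbitrary rate using linear interpolation.

```python
def play_at_rate(buffer, rate):
    """Read a sample buffer at an arbitrary non-zero rate with linear
    interpolation: 2.0 doubles speed (an octave up), 0.5 halves it
    (an octave down), and a negative rate plays the buffer backwards."""
    out = []
    pos = 0.0 if rate >= 0 else len(buffer) - 1.0
    while 0.0 <= pos <= len(buffer) - 1:
        i = int(pos)
        frac = pos - i
        nxt = buffer[min(i + 1, len(buffer) - 1)]
        out.append(buffer[i] * (1 - frac) + nxt * frac)
        pos += rate
    return out

# Half speed: the output is twice as long, and the content an octave down
slow = play_at_rate([0.0, 1.0, 0.0, -1.0], 0.5)
```

Everything Tapehead does is a generalisation of this read position moving through the buffer - the breakpoint editor just lets you draw how it moves.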

Here is the link to download the patch:

Tapehead 1.0 

You will need either a full version of Max/msp 6 or the Max/msp 6 runtime. Both are available from Cycling '74 here. The runtime version of Max allows you to run patches but not edit them.

(This patch is tested up to Max version 6.08; the current version 6.12 has issues with the replace message to [buffer~], so the patch will not work there. If you do have problems, try an earlier version.)

The best thing to do is load a sound and flick through the presets, trying different areas of the file and different playback times and shapes. In the breakpoint editor, shift-click removes points, and alt-click+drag adjusts the curve when in curve mode. You can also drag the scrub bar at the top and scan over the file manually.

So this doesn't exactly break new ground, as it's all possible in most DAWs, but it does provide a convenient tool for experimentation. Within your DAW this is usually achieved through editing and offline effects such as the pitch-bender, whereas this player can change playback direction and position very quickly and precisely, under the control of complex playback curves. The other key factor is that because the process is live, there's no waiting for offline processing.

I'm not going to explain how every single part of this patch works, but we are going to look at the main playback mechanism at its heart. Max has a range of different objects which can be used for sample playback, all of which have slightly different attributes and capabilities. When I first started using Max I remember finding this quite confusing and overly complex, as sample playback is considered a really basic capability of any audio system. However, I soon learnt that with this complexity comes versatility, and that through it Max is capable of creating a whole range of sample-driven instruments and playback systems.

These are the objects associated with sample playback:


The first on the list, [sfplay~], is the odd one out here, as it plays back from disk. The others all play from an object called [buffer~], so the audio they use is stored in memory, like a sampler.

With Max I often find that making a connection between two different objects is the inspiration for a device, and that's what happened here. I was tinkering with an object called [function], which is usually used for creating envelopes of various kinds, and thought of a slightly unorthodox use for it: driving a [play~] object to play back samples in interesting ways.

Here is a simple patch below which demonstrates the core of this mechanism:

Here's a link to the patch itself:

Tapehead Basic

And here's a step by step rundown of what happens inside:

1. You load a sample into the [buffer~] called soundA.

2. This triggers [info~] to output some information about the sample we have stored in our [buffer~]. In this case we are interested in the total length of the sample at the sample rate at which it was recorded.

3. Moving over to the [function] object (the XY graph), we first set the duration it will cover using the setdomain message. The message box here prepends the text setdomain to any message which passes through its left inlet.

4. Trigger playback using the button at the top of the patch. This causes [function] to pass the breakpoints you've created on to [line~].

5. [line~] generates a signal matching the shape and time you set in the function. So a straight line from 0 to 1, left to right, is linear forward playback; the opposite, a straight line from 1 to 0, is linear backward playback. In between, you can set a playback shape which scans the wave in any way you see fit, backwards or forwards.

6. As the output from [line~] is between 0 and 1, we use [scale~] to scale the signal up to the length of our sample.

7. The signal then drives the [play~] object, playing back sample data from the [buffer~] and outputting the sound you have created through the [ezdac~].
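The whole chain above can be sketched outside Max in a few lines of Python. This is my own simplified stand-in: the function names are illustrative, breakpoints are (time in ms, value 0-1) pairs as drawn in [function], and where [play~] interpolates between samples this sketch simply reads the nearest one.

```python
def envelope_signal(breakpoints, fs=44100):
    """[function] + [line~]: render piecewise-linear ramps between
    (time_ms, value) breakpoints at audio rate."""
    out = []
    for (t0, v0), (t1, v1) in zip(breakpoints, breakpoints[1:]):
        n = int((t1 - t0) * fs / 1000)
        for k in range(n):
            out.append(v0 + (v1 - v0) * k / n)
    return out

def tapehead(buffer, breakpoints, fs=44100):
    """[scale~] + [play~]: map the 0-1 envelope onto buffer positions
    and read the sample data back out."""
    top = len(buffer) - 1
    return [buffer[int(v * top)] for v in envelope_signal(breakpoints, fs)]

# A straight line from 0 to 1 over 10 ms: linear forward playback
forward = tapehead(list(range(100)), [(0, 0.0), (10, 1.0)])
# The opposite line, 1 to 0: the same span played backwards
backward = tapehead(list(range(100)), [(0, 1.0), (10, 0.0)])
```

Any curve you can draw between those breakpoints becomes a playback trajectory through the buffer, which is exactly what makes the full patch fun to explore.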

I've expanded on the device further by adding the manual scrub option, as that can often be a good way of discovering new sounds, and it adds more of a physical dimension to the process. I expect everyone who uses Pro Tools has accidentally discovered a sound which is more interesting backwards than forwards in this way! The rest of the completed application is composed of UI objects (menus, sliders etc.) and other control objects like [preset]. The beauty of this patch is its potential for expansion: now we have the main control mechanism in place, we can duplicate it to add other parameters. A multimode filter with envelope control over cutoff and resonance? An envelope-driven adjustable delay line? An amplitude envelope? Envelope-controlled pitch shift? LFO-controlled vibrato? Envelope-controlled LFO speed? A bank of presets for each effect? Randomised preset recall? It's all possible.

Please feel free to comment. I'll also be expanding the system in a future post, so keep an eye out for that. 


This is a blog aimed at exploring Max as a tool for sound design - or, more specifically, at building tools for the purpose of sound design with Max.

As sound designers we share many tools with musicians and composers, and rightly so... it's all sound, after all. But there is also scope for custom tools aimed at specific sound design tasks - tools that are not viable for mainstream developers to build. Perhaps environments like Max can bridge that gap?

I've been using Max in this capacity for several years now and have built up a set of useful patches over that time, which I've decided to properly document and make public. If you feel like contributing, or you're another sound designer using Max who wants to discuss or share ideas, please do get in touch.