Csound

INTENSITIES

Real World Intensities and Amplitudes

There are many ways to describe a sound physically. One of the most common is the sound intensity, the quantity behind the Sound Intensity Level (SIL). Sound intensity describes the amount of power on a certain surface, so its unit is watt per square meter ($W/m^2$). The range of human hearing is about $10^{-12}\,W/m^2$ at the threshold of hearing to $10^{0}\,W/m^2$ at the threshold of pain. To order this immense range, and to facilitate the measurement of one sound intensity based upon its ratio with another, a logarithmic scale is used. The unit Bel describes the relation of one intensity $I$ to a reference intensity $I_0$ as follows:

$\log_{10}\frac{I}{I_0}$   Sound Intensity Level in Bel

If, for instance, the ratio $\frac{I}{I_0}$ is 10, this is 1 Bel. If the ratio is 100, this is 2 Bel.

For real world sounds, it makes sense to set the reference value $I_0$ to the threshold of hearing, which has been fixed as $10^{-12}\,W/m^2$ at 1000 Hertz. So the range of hearing covers about 12 Bel. Usually 1 Bel is divided into 10 decibel (dB), so the common formula for measuring a sound intensity is:

$10\cdot\log_{10}\frac{I}{I_0}$   Sound Intensity Level (SIL) in Decibel (dB) with $I_0=10^{-12}\,W/m^2$
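
As a quick check of this formula, the two limits of hearing given above can be inserted. No values other than those from the text are assumed:

$10\cdot\log_{10}\frac{10^{-12}}{10^{-12}}=10\cdot\log_{10}1=0\,dB$   SIL at the threshold of hearing

$10\cdot\log_{10}\frac{10^{0}}{10^{-12}}=10\cdot\log_{10}10^{12}=120\,dB$   SIL at the threshold of pain

This is the range of about 12 Bel mentioned above, expressed in decibel.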

While the sound intensity level is useful to describe the way in which human hearing works, the measurement of sound is more closely related to the sound pressure deviations. Sound waves compress and expand the air particles and thereby increase and decrease the localized air pressure. These deviations are measured and transformed by a microphone. So the question arises: what is the relationship between the sound pressure deviations and the sound intensity? The answer is: sound intensity changes $I$ are proportional to the square of the sound pressure changes $P$. As a formula:

$I\propto P^2$   Relation between Sound Intensity and Sound Pressure

Let us take an example to see what this means. The sound pressure at the threshold of hearing can be fixed at $2\cdot10^{-5}\,Pa$. This value is the reference value of the Sound Pressure Level (SPL). If we now have a value of $2\cdot10^{-4}\,Pa$, the corresponding sound intensity relation can be calculated as:

$\left(\frac{2\cdot10^{-4}}{2\cdot10^{-5}}\right)^2=10^2=100$

So, a factor of 10 in the pressure relation yields a factor of 100 in the intensity relation. In general, the dB scale for the pressure $P$ related to the pressure $P_0$ is:

$10\cdot\log_{10}\left(\frac{P}{P_0}\right)^2=2\cdot10\cdot\log_{10}\frac{P}{P_0}=20\cdot\log_{10}\frac{P}{P_0}$

Sound Pressure Level (SPL) in Decibel (dB) with $P_0=2\cdot10^{-5}\,Pa$
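
Applied to the example above, the pressure formula and the intensity formula should give the same level. This is only a consistency check using the values already introduced:

$20\cdot\log_{10}\frac{2\cdot10^{-4}}{2\cdot10^{-5}}=20\cdot\log_{10}10=20\,dB$   via the pressure ratio of 10

$10\cdot\log_{10}100=20\,dB$   via the intensity ratio of 100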

Working with digital audio basically means working with amplitudes. What we get from microphones are amplitudes. Any audio file is a sequence of amplitudes. What you generate in Csound and write either to the DAC in realtime or to a sound file is again nothing but a sequence of amplitudes. As amplitudes are directly related to the sound pressure deviations, all the relations between sound intensity and sound pressure can be transferred to relations between sound intensity and amplitudes:

$I\propto A^2$   Relation between Intensity and Amplitudes

$20\cdot\log_{10}\frac{A}{A_0}$   Decibel (dB) Scale of Amplitudes with any amplitude $A$ related to another amplitude $A_0$

If you drive an oscillator with the amplitude 1, and another oscillator with the amplitude 0.5, and you want to know the difference in dB, you calculate:

$20\cdot\log_{10}\frac{1}{0.5}=20\cdot\log_{10}2=20\cdot0.30103=6.0206\,dB$

So, the most useful thing to keep in mind is: when you double the amplitude, you get about +6 dB; when you halve the amplitude, you get about -6 dB.
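
The calculation can also be left to Csound. The following is a minimal sketch, not one of the original examples: the instrument number and the use of p4 and p5 for the two amplitudes are chosen here for illustration. It compares the formula with the built-in dbamp converter, which returns 20 times the base-10 logarithm of an amplitude.

```csound
<CsoundSynthesizer>
<CsOptions>
-n
</CsOptions>
<CsInstruments>
sr = 44100
ksmps = 32
nchnls = 2
0dbfs = 1

instr 1
iA        =         p4 ;amplitude A (assumed to be given as p4)
iA0       =         p5 ;reference amplitude A0 (assumed to be given as p5)
iDb       =         20 * log10(iA / iA0) ;dB difference by the formula
iDb2      =         dbamp(iA) - dbamp(iA0) ;the same via the dbamp converter
          print     iDb, iDb2
endin
</CsInstruments>
<CsScore>
i 1 0 0 1 0.5
i 1 0 0 0.5 1
</CsScore>
</CsoundSynthesizer>
```

The first score line should print about 6.02 for both values, the second about -6.02. The -n flag suppresses audio output, and a duration of 0 in the score means that only the init pass is performed.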

What is 0 dB?

As described in the last section, any dB scale - for intensities, pressures or amplitudes - is just a way to describe a relationship. To have any sort of quantitative measurement you will need to know the reference value referred to as "0 dB". For real world sounds, it makes sense to set this level to the threshold of hearing. This is done, as we saw, by setting the reference value of the SIL to $10^{-12}\,W/m^2$ and that of the SPL to $2\cdot10^{-5}\,Pa$.

But for working with digital sound in the computer, this does not make any sense. What you hear from the sound you produce in the computer depends only on the amplification, the speakers, and so on. It has nothing, per se, to do with the level in your audio editor or in Csound. Nevertheless, there is a rational reference level for the amplitudes. In a digital system, there is a strict limit for the maximum number you can store as amplitude. This maximum possible level is called 0 dB.

Each program connects this maximum possible amplitude with a number. Usually it is '1', which is a good choice, because you know that everything above 1 is clipping, and you have a handy relation for lower values. But actually this value is nothing but a setting, and in Csound you are free to set it to any value you like via the 0dbfs statement. Usually you should use this statement in the orchestra header:

0dbfs = 1

This means: "Set the level for zero dB as full scale to 1 as reference value." Note that for historical reasons the default value in Csound is not 1 but 32768. So you must have this 0dbfs=1 statement in your header if you want to set Csound to the value that probably all other audio applications use.
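
To see what the setting does, the ampdbfs converter can be used, which returns the amplitude for a dB value relative to full scale. A minimal sketch, with the instrument number and the use of p4 for the dB value chosen for illustration:

```csound
<CsoundSynthesizer>
<CsOptions>
-n
</CsOptions>
<CsInstruments>
sr = 44100
ksmps = 32
nchnls = 2
0dbfs = 1

instr 1
iDb       =         p4 ;dB value relative to full scale (assumed as p4)
iAmp      =         ampdbfs(iDb) ;amplitude in relation to 0dbfs
          print     iDb, iAmp
endin
</CsInstruments>
<CsScore>
i 1 0 0 0
i 1 0 0 -6
i 1 0 0 -12
</CsScore>
</CsoundSynthesizer>
```

With 0dbfs = 1 this should print amplitudes of 1, about 0.5 and about 0.25. Without the 0dbfs statement the same dB values would be related to 32768 instead.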

dB Scale Versus Linear Amplitude

Let's now see some practical consequences of what we have discussed so far. One major point is: to get smooth transitions between intensity levels you must not use a simple linear transition of the amplitudes, but a linear transition of the dB equivalent. The following example shows a linear rise of the amplitudes from 0 to 1, and then a linear rise of the dB values from -80 to 0 dB, both over 10 seconds.

EXAMPLE 01C01_db_vs_linear.csd

<CsoundSynthesizer>
<CsOptions>
-odac
</CsOptions>
<CsInstruments>
;example by joachim heintz
sr = 44100
ksmps = 32
nchnls = 2
0dbfs = 1

instr 1 ;linear amplitude rise
kamp      line    0, p3, 1 ;amp rise 0->1
asig      oscils  1, 1000, 0 ;1000 Hz sine
aout      =       asig * kamp
outs    aout, aout
endin

instr 2 ;linear rise of dB
kdb       line    -80, p3, 0 ;dB rise -80 -> 0
asig      oscils  1, 1000, 0 ;1000 Hz sine
kamp      =       ampdb(kdb) ;transformation db -> amp
aout      =       asig * kamp
outs    aout, aout
endin

</CsInstruments>
<CsScore>
i 1 0 10
i 2 11 10
</CsScore>
</CsoundSynthesizer>

You will hear how fast the sound intensity increases at the first note with direct amplitude rise, and then stays nearly constant. At the second note you should hear a very smooth and constant increment of intensity.
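
The difference can also be seen in numbers. The ampdb converter turns a dB value into the amplitude $10^{dB/20}$, so halfway through each note the two instruments are at very different levels:

$20\cdot\log_{10}0.5\approx-6\,dB$   level of instr 1 after 5 seconds (amplitude 0.5)

$10^{-40/20}=10^{-2}=0.01$   amplitude of instr 2 after 5 seconds (-40 dB)

The linear ramp has already reached -6 dB at its midpoint and has only 6 dB left to gain, while the dB ramp spreads its 80 dB evenly across the 10 seconds.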

RMS Measurement

Sound intensity depends on many factors. One of the most important is the effective mean of the amplitudes in a certain time span. This is called the Root Mean Square (RMS) value. To calculate it, you have to (1) square the amplitudes of N samples and sum the results. Then you (2) divide the sum by N to get the mean. Finally you (3) take the square root.

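Let's see a simple example, and then have a look at how getting the rms value works in Csound. Assuming we have a sine wave which consists of 16 samples, we get these amplitudes (rounded to three decimals):

0, 0.383, 0.707, 0.924, 1, 0.924, 0.707, 0.383, 0, -0.383, -0.707, -0.924, -1, -0.924, -0.707, -0.383

These are the squared amplitudes:

0, 0.146, 0.5, 0.854, 1, 0.854, 0.5, 0.146, 0, 0.146, 0.5, 0.854, 1, 0.854, 0.5, 0.146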

The mean of these values is:

$(0+0.146+0.5+0.854+1+0.854+0.5+0.146+0+0.146+0.5+0.854+1+0.854+0.5+0.146)/16=8/16=0.5$

And the resulting RMS value is $\sqrt{0.5}=0.707$
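
The three steps can be reproduced in Csound at i-time. This is a sketch under the assumption that the 16 samples cover exactly one period of a sine with amplitude 1; the variable names and the label sumloop are chosen freely.

```csound
<CsoundSynthesizer>
<CsOptions>
-n
</CsOptions>
<CsInstruments>
sr = 44100
ksmps = 32
nchnls = 2
0dbfs = 1

instr 1
iN        =         16 ;number of samples in one period (assumption)
iSum      =         0
iIndx     =         0
sumloop:
iSmp      =         sin(2 * $M_PI * iIndx / iN) ;sample of the sine
iSum      =         iSum + iSmp * iSmp ;(1) square and sum
          loop_lt   iIndx, 1, iN, sumloop ;next sample until iIndx = iN
iMean     =         iSum / iN ;(2) mean of the squares
iRms      =         sqrt(iMean) ;(3) square root
          print     iMean, iRms
endin
</CsInstruments>
<CsScore>
i 1 0 0
</CsScore>
</CsoundSynthesizer>
```

The printed values should be close to 0.5 for the mean and 0.707 for the RMS, as in the calculation by hand.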

The rms opcode in Csound calculates the RMS power in a certain time span, and smoothes the values in time according to the ihp parameter: the higher this value (the default is 10 Hz), the snappier the measurement, and vice versa. This opcode can be used to implement a self-regulating system, in which the rms opcode prevents the system from exploding. Each time the rms value exceeds a certain value, the amount of feedback is reduced. This is an example [1]:

EXAMPLE 01C02_rms_feedback_system.csd

<CsoundSynthesizer>
<CsOptions>
-odac
</CsOptions>
<CsInstruments>
;example by Martin Neukom, adapted by Joachim Heintz
sr = 44100
ksmps = 32
nchnls = 2
0dbfs = 1

giSine    ftgen     0, 0, 2^10, 10, 1 ;table with a sine wave

instr 1
a3        init      0
kamp      linseg    0, 1.5, 0.2, 1.5, 0 ;envelope for initial input
asnd      poscil    kamp, 440, giSine ;initial input
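;NOTE: the adel1/adel2 and deltapi lines are a reconstruction; their values are assumptions
if p4 == 1 then ;choose between two sines ...
adel1     poscil    0.0523, 0.023, giSine
adel2     poscil    0.073, 0.023, giSine, .5
else ;or a random movement for the delay lines
adel1     randi     0.05, 0.1, 2
adel2     randi     0.08, 0.2, 2
endif
a0        delayr    1 ;delay line of 1 second
a1        deltapi   adel1 + 0.1 ;first reading
a2        deltapi   adel2 + 0.1 ;second reading
krms      rms       a3 ;rms measurement
delayw    asnd + exp(-krms) * a3 ;feedback depending on rms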
a3        reson     -(a1+a2), 3000, 7000, 2 ;calculate a3
aout      linen     a1/3, 1, p3, 1 ;apply fade in and fade out
outs      aout, aout
endin
</CsInstruments>
<CsScore>
i 1 0 60 1 ;two sine movements of delay with feedback
i 1 61 . 2 ;two random movements of delay with feedback
</CsScore>
</CsoundSynthesizer>
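
The rms opcode on its own can be checked against the calculation by hand. The following is a minimal sketch, assuming a sine with amplitude 1 as input and the default ihp value; the print interval of half a second is an arbitrary choice.

```csound
<CsoundSynthesizer>
<CsOptions>
-n
</CsOptions>
<CsInstruments>
sr = 44100
ksmps = 32
nchnls = 2
0dbfs = 1

instr 1
asig      oscils    1, 1000, 0 ;1000 Hz sine with amplitude 1
krms      rms       asig ;rms measurement with default ihp
          printk    0.5, krms ;print the value twice a second
endin
</CsInstruments>
<CsScore>
i 1 0 3
</CsScore>
</CsoundSynthesizer>
```

After a short settling time caused by the smoothing, the printed value should stay near 0.707.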

Fletcher-Munson Curves

Human hearing is roughly in a range between 20 and 20000 Hz. But inside this range, the hearing is not equally sensitive. The most sensitive region is around 3000 Hz. If you come to the upper or lower border of the range, you need more intensity to perceive a sound as "equally loud".

These curves of equal loudness are mostly called "Fletcher-Munson Curves" because of the paper of H. Fletcher and W. A. Munson in 1933.

[Figure: Fletcher-Munson curves of equal loudness]

Try the following test. In the first 5 seconds you will hear a tone of 3000 Hz. Adjust the level of your amplifier to the lowest possible point at which you can still hear the tone. Then you will hear a tone whose frequency starts at 20 Hertz and ends at 20000 Hertz, over 20 seconds. Try to move the fader or knob of your amplification in such a way that you can still just hear the tone, but as softly as possible. The movement of your fader should be roughly similar to the lowest Fletcher-Munson curve: starting relatively high, going down and down until 3000 Hertz, and then up again. (As always, this test depends on your speaker hardware. If your speakers do not reproduce low frequencies properly, you will not hear anything in the bass region.)

EXAMPLE 01C03_FletcherMunson.csd

<CsoundSynthesizer>
<CsOptions>
-odac
</CsOptions>
<CsInstruments>
sr = 44100
ksmps = 32
nchnls = 2
0dbfs = 1

giSine    ftgen     0, 0, 2^10, 10, 1 ;table with a sine wave

instr 1
kfreq     expseg    p4, p3, p5
printk    1, kfreq ;prints the frequencies once a second
asin      poscil    .2, kfreq, giSine
aout      linen     asin, .01, p3, .01
outs      aout, aout
endin
</CsInstruments>
<CsScore>
i 1 0 5 3000 3000
i 1 6 20 20  20000
</CsScore>
</CsoundSynthesizer>

It is very important to bear in mind that the perceived loudness depends strongly on the frequency. Putting out a sine of 30 Hz with a certain amplitude is totally different from a sine of 3000 Hz with the same amplitude: the latter will sound much louder.

1. cf. Martin Neukom, Signale Systeme Klangsynthese, Zürich 2003, p. 383