Real World Intensities and Amplitudes
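There are many ways to describe a sound physically. One of the most common is the sound intensity, on which the Sound Intensity Level (SIL) is based. It describes the amount of power passing through a certain surface, so its unit is watt per square meter ($W/m^2$). The range of human hearing is about $10^{-12}\,W/m^2$ at the threshold of hearing to about $10^0\,W/m^2$ at the threshold of pain. To handle this immense range, a logarithmic scale is used. The unit Bel describes the relation of an intensity $I$ to a reference intensity $I_0$ as follows:
$\log_{10} \frac{I}{I_0}$ Sound Intensity Level in Bel
If, for instance, the ratio $\frac{I}{I_0}$ is 10, this is 1 Bel. If the ratio is 100, this is 2 Bel.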
For real world sounds, it makes sense to set the reference value $I_0$ to the threshold of hearing, which has been fixed as $10^{-12}\,W/m^2$ at 1000 Hertz. So the range of hearing covers about 12 Bel. Usually 1 Bel is divided into 10 decibel, so the common formula for measuring a sound intensity is:
$10 \cdot \log_{10} \frac{I}{I_0}$ Sound Intensity Level (SIL) in Decibel (dB) with $I_0 = 10^{-12}\,W/m^2$
While the sound intensity level is useful to describe the way in which the human hearing works, the measurement of sound is more closely related to the sound pressure deviations. Sound waves compress and expand the air particles and by this they increase and decrease the localized air pressure. These deviations are measured and transformed by a microphone. So the question arises: what is the relationship between the sound pressure deviations and the sound intensity? The answer is: sound intensity changes $\Delta I$ are proportional to the square of the sound pressure changes $\Delta P$. As a formula:
$\Delta I \sim \Delta P^2$
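Let us take an example to see what this means. The sound pressure at the threshold of hearing can be fixed at $2 \cdot 10^{-5}\,Pa$. This value is the reference value $P_0$ of the Sound Pressure Level (SPL). If we now have a sound pressure $P$, the corresponding sound intensity relation can be calculated as:
$\frac{I}{I_0} = \left(\frac{P}{P_0}\right)^2$
So a pressure ten times the reference value, for instance, corresponds to an intensity ratio of $10^2 = 100$. Expressed in decibel, the level of a pressure $P$ related to the pressure $P_0$ is therefore:
$10 \cdot \log_{10} \left(\frac{P}{P_0}\right)^2 = 20 \cdot \log_{10} \frac{P}{P_0}$ Sound Pressure Level (SPL) in Decibel (dB) with $P_0 = 2 \cdot 10^{-5}\,Pa$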
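The two level formulas can be checked numerically. The following is a minimal sketch in Python, used here only as a neutral calculator; the function names `sil_db` and `spl_db` are assumptions made for illustration and are not part of Csound. It assumes the usual reference values of $10^{-12}\,W/m^2$ and $2 \cdot 10^{-5}\,Pa$.

```python
import math

I0 = 1e-12  # reference intensity in W/m^2 (threshold of hearing)
P0 = 2e-5   # reference sound pressure in Pa (threshold of hearing)

def sil_db(intensity, reference=I0):
    """Sound Intensity Level in dB: 10 * log10(I / I0)."""
    return 10 * math.log10(intensity / reference)

def spl_db(pressure, reference=P0):
    """Sound Pressure Level in dB: 20 * log10(P / P0)."""
    return 20 * math.log10(pressure / reference)

# A pressure ten times the reference value ...
pressure = 10 * P0
# ... corresponds to an intensity ratio of (P / P0)^2 = 100.
intensity = I0 * (pressure / P0) ** 2

print(sil_db(I0))    # level at the threshold of hearing
print(sil_db(1.0))   # level at 1 W/m^2, the upper end of the range
print(spl_db(pressure), sil_db(intensity))  # same level, computed both ways
```

The last line should print the same level twice (up to floating-point rounding), which is the point of the factor 20 in the pressure formula: squaring the pressure ratio inside the logarithm doubles the factor 10.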
Working with digital audio basically means working with amplitudes. What we record with microphones are amplitudes. Any audio file is a sequence of amplitudes. What you generate in Csound and write either to the DAC in realtime or to a sound file is again nothing but a sequence of amplitudes. As amplitudes are directly related to the sound pressure deviations, all the relations between sound intensity and sound pressure can be transferred to relations between sound intensity and amplitudes:
$\Delta I \sim A^2$ Relation between Intensity and Amplitudes
$20 \cdot \log_{10} \frac{A}{A_0}$ Decibel (dB) Scale of Amplitudes with any amplitude $A$ related to another amplitude $A_0$
If you drive an oscillator with the amplitude 1, and another oscillator with the amplitude 0.5, and you want to know the difference in dB, you calculate:
$20 \cdot \log_{10} \frac{1}{0.5} = 20 \cdot \log_{10} 2 \approx 20 \cdot 0.30103 = 6.0206\,dB$
So, the most useful thing to keep in mind is: when you double the amplitude, you get +6 dB; when you halve the amplitude, you get -6 dB.
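The rule of thumb can be verified with a small conversion sketch. This is Python rather than Csound, and the helper names `amp_to_db` and `db_to_amp` are hypothetical; in Csound itself the opcodes dbamp and ampdb serve the same purpose.

```python
import math

def amp_to_db(amp, ref=1.0):
    """dB value of amplitude `amp` related to the amplitude `ref`."""
    return 20 * math.log10(amp / ref)

def db_to_amp(db, ref=1.0):
    """Amplitude that lies `db` decibel above (or below) `ref`."""
    return ref * 10 ** (db / 20)

print(round(amp_to_db(0.5), 2))  # -6.02  (half the amplitude)
print(round(amp_to_db(2.0), 2))  # 6.02   (double the amplitude)
print(round(db_to_amp(-6), 3))   # 0.501  (-6 dB is almost exactly one half)
```

Note that 6 dB is a rounded figure: an exact doubling is about 6.02 dB, so -6 dB corresponds to an amplitude factor of about 0.501 rather than exactly 0.5.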
What is 0 dB?
As described in the last section, any dB scale - for intensities, pressures or amplitudes - is just a way to describe a relationship. To have any sort of quantitative measurement you will need to know the reference value referred to as "0 dB". For real world sounds, it makes sense to set this level to the threshold of hearing. This is done, as we saw, by setting the reference value of the SIL to $I_0 = 10^{-12}\,W/m^2$ and that of the SPL to $P_0 = 2 \cdot 10^{-5}\,Pa$.
But for working with digital sound in the computer, this does not make any sense. What you will hear from the sound you produce in the computer, just depends on the amplification, the speakers, and so on. It has nothing, per se, to do with the level in your audio editor or in Csound. Nevertheless, there is a rational reference level for the amplitudes. In a digital system, there is a strict limit for the maximum number you can store as amplitude. This maximum possible level is called 0 dB.
Each program connects this maximum possible amplitude with a number. Usually it is '1' which is a good choice, because you know that everything above 1 is clipping, and you have a handy relation for lower values. But actually this value is nothing but a setting, and in Csound you are free to set it to any value you like via the 0dbfs opcode. Usually you should use this statement in the orchestra header:
0dbfs = 1
This means: "Set the level for zero dB as full scale to 1 as reference value." Note that because of historical reasons the default value in Csound is not 1 but 32768. So you must have this 0dbfs = 1 statement in your header if you want to set Csound to the value probably all other audio applications have.
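To see that the full-scale value is only a matter of scaling, here is a minimal Python sketch. The function `dbfs` is a hypothetical helper written for this illustration, not a Csound opcode. It expresses an amplitude in dB relative to whatever number has been chosen as full scale.

```python
import math

def dbfs(amp, full_scale):
    """Level of `amp` in dB relative to the chosen full-scale value."""
    return 20 * math.log10(abs(amp) / full_scale)

print(round(dbfs(0.5, 1), 2))        # -6.02 with 0dbfs = 1
print(round(dbfs(16384, 32768), 2))  # -6.02 with the default 0dbfs = 32768
print(dbfs(1, 1))                    # 0.0, the maximum possible level
```

Half of full scale is the same level in both conventions. Only the numbers you write in your code differ, which is why mixing the two conventions in one orchestra leads to signals that are far too loud or far too soft.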
dB Scale Versus Linear Amplitude
Let's now see some practical consequences of what we have discussed so far. One major point is: to get smooth transitions between intensity levels you must not use a simple linear transition of the amplitudes, but a linear transition of the dB equivalent. The following example shows a linear rise of the amplitudes from 0 to 1, and then a linear rise of the dB values from -80 to 0 dB, both over 10 seconds.
<CsoundSynthesizer>
<CsOptions>
-odac
</CsOptions>
<CsInstruments>
;example by joachim heintz
sr = 44100
ksmps = 32
nchnls = 2
0dbfs = 1

instr 1 ;linear amplitude rise
kamp      line      0, p3, 1       ;amp rise 0->1
asig      oscils    1, 1000, 0     ;1000 Hz sine
aout      =         asig * kamp
          outs      aout, aout
endin

instr 2 ;linear rise of dB
kdb       line      -80, p3, 0     ;dB rise -80 -> 0
asig      oscils    1, 1000, 0     ;1000 Hz sine
kamp      =         ampdb(kdb)     ;transformation db -> amp
aout      =         asig * kamp
          outs      aout, aout
endin
</CsInstruments>
<CsScore>
i 1 0 10
i 2 11 10
</CsScore>
</CsoundSynthesizer>
You will hear how fast the sound intensity increases at the first note with direct amplitude rise, and then stays nearly constant. At the second note you should hear a very smooth and constant increment of intensity.
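The difference between the two notes can also be seen in numbers. The following Python sketch (an illustration only, with hypothetical helper names) tabulates both ramps every two seconds, using the same values as the Csound example: amplitude 0 to 1 for the first note, -80 to 0 dB for the second.

```python
import math

def db_to_amp(db):
    """Amplitude factor for a dB value (0 dB = amplitude 1)."""
    return 10 ** (db / 20)

def amp_to_db(amp):
    """dB value of an amplitude related to amplitude 1."""
    return 20 * math.log10(amp)

duration = 10  # seconds, as in the score of the example
for t in range(0, duration + 1, 2):
    frac = t / duration
    lin_amp = frac              # first note: linear rise of the amplitude
    db_ramp = -80 + 80 * frac   # second note: linear rise of the dB value
    # log10(0) is undefined, so amplitude 0 is shown as minus infinity dB
    lin_db = amp_to_db(lin_amp) if lin_amp > 0 else float("-inf")
    print(f"t={t:2d}s  linear amp={lin_amp:.2f} ({lin_db:7.2f} dB)   "
          f"dB ramp={db_ramp:6.1f} dB (amp={db_to_amp(db_ramp):.4f})")
```

With the linear amplitude rise, the level is already within about 14 dB of the maximum after 2 seconds and within about 6 dB after 5 seconds, so most of the audible change happens at the very beginning. With the dB ramp, each two-second step adds the same 16 dB.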
Sound intensity depends on many factors. One of the most important is the effective mean of the amplitudes in a certain time span. This is called the Root Mean Square (RMS) value. To calculate it, you (1) square the amplitudes of N samples and add them up. Then you (2) divide the result by N to calculate the mean. Finally you (3) take the square root.
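Let's see a simple example, and then have a look at how getting the RMS value works in Csound. Assuming we have one period of a sine wave with amplitude 1 which consists of 16 samples, we get these amplitudes:
0, 0.383, 0.707, 0.924, 1, 0.924, 0.707, 0.383, 0, -0.383, -0.707, -0.924, -1, -0.924, -0.707, -0.383
These are the squared amplitudes:
0, 0.146, 0.5, 0.854, 1, 0.854, 0.5, 0.146, 0, 0.146, 0.5, 0.854, 1, 0.854, 0.5, 0.146
The mean of these values is:
$\frac{0 + 0.146 + 0.5 + 0.854 + 1 + 0.854 + 0.5 + 0.146 + 0 + 0.146 + 0.5 + 0.854 + 1 + 0.854 + 0.5 + 0.146}{16} = \frac{8}{16} = 0.5$
And the resulting RMS value is $\sqrt{0.5} \approx 0.707$.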
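The three steps can be written down directly. The following Python sketch is a plain block calculation for illustration, not a model of Csound's rms opcode. It assumes that the 16 samples cover exactly one period of a sine with amplitude 1, starting at phase 0.

```python
import math

N = 16
# one period of a sine wave with amplitude 1, sampled at 16 points
samples = [math.sin(2 * math.pi * n / N) for n in range(N)]

def rms(values):
    """Root Mean Square of a sequence of sample values."""
    squares = [v * v for v in values]  # (1) square each amplitude
    mean = sum(squares) / len(values)  # (2) mean of the squares
    return math.sqrt(mean)             # (3) square root of the mean

print(round(rms(samples), 3))  # 0.707
```

For a sine wave, the RMS value is always the peak amplitude divided by $\sqrt{2}$, as long as the measurement covers whole periods.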
The rms opcode in Csound calculates the RMS power in a certain time span, and smoothes the values in time according to the ihp parameter: the higher this value (the default is 10 Hz), the snappier the measurement, and vice versa. This opcode can be used to implement a self-regulating system, in which the rms opcode prevents the system from exploding. Each time the rms value exceeds a certain value, the amount of feedback is reduced. This is an example [1]:
<CsoundSynthesizer>
<CsOptions>
-odac
</CsOptions>
<CsInstruments>
;example by Martin Neukom, adapted by Joachim Heintz
sr = 44100
ksmps = 32
nchnls = 2
0dbfs = 1

giSine    ftgen     0, 0, 2^10, 10, 1 ;table with a sine wave

instr 1
a3        init      0
kamp      linseg    0, 1.5, 0.2, 1.5, 0        ;envelope for initial input
asnd      poscil    kamp, 440, giSine          ;initial input
 if p4 == 1 then                               ;choose between two sines ...
adel1     poscil    0.0523, 0.023, giSine
adel2     poscil    0.073, 0.023, giSine,.5
 else                                          ;or a random movement for the delay lines
adel1     randi     0.05, 0.1, 2
adel2     randi     0.08, 0.2, 2
 endif
a0        delayr    1                          ;delay line of 1 second
a1        deltapi   adel1 + 0.1                ;first reading
a2        deltapi   adel2 + 0.1                ;second reading
krms      rms       a3                         ;rms measurement
          delayw    asnd + exp(-krms) * a3     ;feedback depending on rms
a3        reson     -(a1+a2), 3000, 7000, 2    ;calculate a3
aout      linen     a1/3, 1, p3, 1             ;apply fade in and fade out
          outs      aout, aout
endin
</CsInstruments>
<CsScore>
i 1 0 60 1 ;two sine movements of delay with feedback
i 1 61 . 2 ;two random movements of delay with feedback
</CsScore>
</CsoundSynthesizer>
Human hearing is roughly in a range between 20 and 20000 Hz. But inside this range, the hearing is not equally sensitive. The most sensitive region is around 3000 Hz. If you come to the upper or lower border of the range, you need more intensity to perceive a sound as "equally loud".
These curves of equal loudness are mostly called "Fletcher-Munson Curves" because of the paper of H. Fletcher and W. A. Munson in 1933. They look like this:
[Figure: Fletcher-Munson curves of equal loudness]
Try the following test. In the first 5 seconds you will hear a tone of 3000 Hz. Adjust the level of your amplifier to the lowest possible point at which you can still hear the tone. Then you will hear a tone whose frequency starts at 20 Hertz and ends at 20000 Hertz, over 20 seconds. Try to move the fader or knob of your amplifier in such a way that you can still just hear the tone, but as softly as possible. The movement of your fader should roughly be similar to the lowest Fletcher-Munson curve: starting relatively high, going down and down until 3000 Hertz, and then up again. (As always, this test depends on your speaker hardware. If your speakers do not reproduce low frequencies properly, you will not hear anything in the bass region.)
<CsoundSynthesizer>
<CsOptions>
-odac
</CsOptions>
<CsInstruments>
sr = 44100
ksmps = 32
nchnls = 2
0dbfs = 1

giSine    ftgen     0, 0, 2^10, 10, 1 ;table with a sine wave

instr 1
kfreq     expseg    p4, p3, p5            ;exponential glide from p4 to p5 Hz
          printk    1, kfreq              ;prints the frequencies once a second
asig      poscil    .2, kfreq, giSine
aout      linen     asig, .01, p3, .01    ;short fade in and fade out
          outs      aout, aout
endin
</CsInstruments>
<CsScore>
i 1 0 5 3000 3000  ;constant tone of 3000 Hz for adjusting the level
i 1 6 20 20 20000  ;sweep from 20 Hz to 20000 Hz
</CsScore>
</CsoundSynthesizer>
It is very important to bear in mind that the perceived loudness depends strongly on the frequency. A sine of 30 Hz with a certain amplitude is perceived totally differently from a sine of 3000 Hz with the same amplitude: the latter will sound much louder.
[1] cf. Martin Neukom, Signale Systeme Klangsynthese, Zürich 2003, p. 383