Chapter 2 The nature of sound

Two faces of sound

If a tree falls in a forest without anyone to hear it, does it make a sound? The dual meaning of ‘sound’, as physical phenomenon and sensation, provides a clear answer: yes and no. The relationships between the physical and sensual aspects of sound are complex, in that many of the impressions sound makes on us are related to its physical parameters but not reducible to them. So: high-frequency sounds sound higher pitched—usually. And more powerful sounds sound louder—on the whole. Furthermore, many sounds, from sirens to skirls and from lullabies to lions’ roars, make emotional impacts on us which have only the vaguest relationships to their physical parameters.

The physical aspects of sound are far better understood than the emotional ones, so it is with physics that we should begin.

Pressure waves

Sounds are usually made by something moving in a cyclic manner: the diaphragm of a loudspeaker pulsing in and out, the gap between the vocal folds narrowing and widening, or a guitar string vibrating back and forth. It is the transmission of these motions to the surrounding medium (solid, liquid, or gas), and their progression through that medium, that constitute sound. In some cases the motion begins in the medium itself, such as the air in the neck of a bottle when one blows across it. Non-moving sources include sudden releases of heat energy, such as by explosions or sparks, and rapidly oscillating heat sources.

When the motion is that of a loudspeaker’s diaphragm, the cause is a varying electrical signal with the same pattern as the sound wave that the diaphragm will produce. Each time the diaphragm moves out, it squeezes the air molecules immediately in front of it closer together, forming a region of high pressure. These molecules press on their neighbours, moving them closer together in their turn, and so a pulse of close-together molecules (a compression) moves through the medium, followed by a low-pressure area (rarefaction), which is produced as the diaphragm moves inwards.

The diaphragm then moves out again, making a second pulse. How often the diaphragm moves in and out during 1 second gives the frequency of the sound wave (in hertz, abbreviated Hz). The simplest sound wave is a pure tone as made, for example, by a tuning fork; a snapshot of the variation in air pressure with distance from such a fork would be a sine wave, as shown in Figure 1.

The distance between adjacent peaks (or troughs) of a sound wave gives the wavelength (λ). The sound will travel through the air at a velocity v, which will be around 340 metres per second at room temperature. The frequency (f) is given by the equation f = v/λ. A plot of the variation of pressure over time at a single point in space is also a sine wave, so we could label the x-axis of Figure 1 as ‘time’ if we wished.
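As a minimal sketch of this relationship in code (taking the approximate 340 m/s figure above; the example frequencies are simply illustrative):

```python
# Sketch of the relation f = v / wavelength for sound in air.
SPEED_OF_SOUND = 340.0  # m/s, approximate value at room temperature

def frequency_of(wavelength: float, v: float = SPEED_OF_SOUND) -> float:
    """Frequency in hertz of a sound wave with the given wavelength in metres."""
    return v / wavelength

def wavelength_of(frequency: float, v: float = SPEED_OF_SOUND) -> float:
    """Wavelength in metres of a sound wave with the given frequency in hertz."""
    return v / frequency

print(wavelength_of(1000.0))  # a 1 kHz tone is about 0.34 m long
print(wavelength_of(20.0))    # a 20 Hz tone is about 17 m long
```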

1. Sound wave pressure plot.

2. Molecular view of sound wave.

Images like Figure 1 are so commonplace that it is easy to imagine they provide some kind of a picture of a sound wave, and many books use them in this way. In fact, however, there is no up and down (transverse) motion in a sound wave as there is in, say, an ocean wave—the only motion is of molecules shuttling alternately away from and towards the source, like the balls of a Newton’s cradle. Such waves are referred to as longitudinal, and if we could see air molecules they would look something like Figure 2.

If a continuous sound originates from a point then it spreads in all directions, as an expanding sphere. If the detection area is small (like a microphone diaphragm or eardrum) and is several metres from the source, the curvature of the sound sphere is negligible, in which case the sound arrives in the form of plane waves. Even if the source of the sound has a particular direction to it (like most loudspeakers), the sound will still spread spherically, so long as the diaphragm is smaller than the wavelength of the sound. For shorter wavelengths, the sound retains its original direction to some extent, and at sufficiently high frequencies will form a beam (see Chapter 6).

Carrying sound

The velocity of sound depends only on the elasticity and density of the medium (Chapter 1). In air, sound velocity increases with both increasing humidity and increasing temperature, but only because of the changes in density that these factors cause. Some examples are given in Table 1.

Table 1. Sound velocities in different media and conditions

Because the velocity of sound in air increases with temperature, on days when the air many metres up is hotter than that near the ground, sound waves travel faster the higher they are. The effect of this velocity increase is to bend (refract) them downwards out of the warmer air, returning them to Earth some distance away, as shown in Figure 3. Sometimes sounds can be heard more clearly at a great distance than at a short one due to this effect. Refraction also explains why it is hard to hear against the wind: the wind near the ground slows the oncoming sound waves slightly (relative to the ground), but the wind a few metres up is stronger, so the sound waves are slowed a little more there. The sound therefore refracts away from the faster region near the ground towards the slower one above it, and hence curves up away from the ground and from your ears (see Figure 4).

3. Sound propagation when air is cooler near the ground than high above it.

4. Barbara can hear Alan, but not Clive.

No matter what one does with a diaphragm, one cannot push sound through the air that surrounds it any faster. More rapid waggling generates pressure pulses that are closer together, which then arrive at a point—your eardrum, say—more frequently. That is to say, the sound frequency would rise. If one tries pushing the air harder by moving the diaphragm further in and out, then the amount of compression (and rarefaction) in the pulses increases, leading to a higher sound pressure (heard as a louder sound). If one forces the diaphragm to move faster than the velocity of sound in the medium, a pulse has no time to move away from the diaphragm before the next pulse forms. Hence, they pile up into a single, extreme high-pressure pulse known as a shock wave (the cause of sonic booms and whip cracks).

Moving a diaphragm more rapidly is not the only way to increase the frequency of a sound: if the loudspeaker (or other source) is rapidly approaching you or you are rapidly approaching it, the pressure pulses arrive at your ear more frequently, because each starts closer to you than the one before, so the frequency of the sound rises. Once the source has passed by, the pulses hit your ear with longer intervals in between, because each has a little farther to travel than the one before. So the frequency falls. This is the well-known Doppler effect, which happens whenever one is passed by a speeding motorbike or the siren of the police car behind it (Box 2).

Like light, sound reflects, and as with reflection from a mirror, an image of the sound source is formed if a surface is smooth and hard, so if you are somewhere between source and surface, you will hear approximately the same sound from each side (but quieter from the reflector side). ‘Smooth’ is a relative term though, meaning ‘with bumps smaller than the wavelength’. Since sound waves are hundreds of thousands of times longer than those of light (comparing 3 kHz to yellow), even quite rough surfaces like concrete make good acoustic mirrors. Concave acoustic mirrors focus the sounds that they reflect: in World War I just such concave concrete sound mirrors were built along the south coast of England to focus the sound of approaching planes into the ears of listening soldiers. When sound echoes between two or more curved reflectors, the result can be a whispering gallery, like the one in London’s St Paul’s Cathedral.

Box 2
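Doppler effect (standard form, for a source moving directly towards or away from a stationary listener): observed frequency = f v/(v − vs) as the source approaches, and f v/(v + vs) as it recedes; emitted frequency f, velocity of sound v, speed of the source vs.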

Sound will reflect from the interface between any two media, whether air and concrete, water and air, or different rock layers in the Earth. How much of the sound is reflected depends on the difference in the acoustic impedances of the two media, and the impedance in turn depends on the density of the medium and the velocity of sound in it. Acoustic impedance (Box 3) is similar to electrical resistance in that it measures the difficulty with which sound can travel through a medium. It is key to many of the effects and applications of sound. For instance, a soft rubbery surface will absorb sound and convert it to heat, since soft rubber has an extremely high acoustic impedance. Stealth coatings on submarines are based on this fact, but unfortunately the softness of rubber is very temperature-dependent, so, when Cold War submarines were redeployed from the North Atlantic to the Gulf from the late 1980s, the higher water temperatures robbed them of their stealth and set off a flurry of research and resurfacing.

Sound can be focussed by passing it through an acoustic lens, often made of acrylic plastic. Lenses work because a wave is refracted when it passes from one medium to another, so long as it strikes the interface between the media at an angle. The angle through which the wave is refracted depends on the ratio of its velocities in the two media (Snell’s law, Box 4).

One effect which is usually far more noticeable for sound than for light is its ability to bend round corners and over walls, and to spread out after passing through an opening, a phenomenon known as diffraction or scattering (Figure 5).

Box 3

Characteristic acoustic impedance of a medium: Z0=ρ0v0; density ρ0, velocity of sound v0. The unit is the Rayl, and the 0s indicate that these are the values of the medium when it is ‘unperturbed’—that is, when no sound is present in it.

Box 4

Snell’s law: sin θ1 / sin θ2 = v1/v2; θ (theta) is the angle between the direction of the sound and a line at right angles to the interface, and v is the sound velocity.
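A small sketch of Box 4 in code (the velocities are round illustrative figures for air and water, not measured values):

```python
import math
from typing import Optional

def refraction_angle(theta1_deg: float, v1: float, v2: float) -> Optional[float]:
    """Angle (degrees from the normal) of the refracted sound, from Snell's law:
    sin(theta1)/sin(theta2) = v1/v2. Returns None if the wave is totally reflected."""
    s = math.sin(math.radians(theta1_deg)) * v2 / v1
    if abs(s) > 1.0:
        return None  # no refracted wave at this angle of incidence
    return math.degrees(math.asin(s))

# Sound passing from air (~340 m/s) into water (~1,480 m/s) at 10 degrees from the normal:
print(refraction_angle(10.0, 340.0, 1480.0))  # refracted to roughly 49 degrees
```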

5. Diffraction.

The longer the wavelength, the greater the bending. So if a high wall is introduced between a sound source—a band, say—and a listener, the bass sounds diffract from its top back down to earth but the high-pitched ones are lost (Figure 6). This muffling effect is a useful clue that helps us gauge the distances of familiar sound sources outdoors.

When light falls on a series of parallel lines, stripes, or ridges a single wavelength apart (or thereabouts), it is diffracted, and since longer wavelengths are diffracted through larger angles, such diffraction gratings split white light into its component colours—the back of a CD makes rainbows from sunbeams in just this way. Since a pure tone is a regular series of ‘stripes’ of increased pressure, it can also act as a diffraction grating, scattering light with a wavelength around that of the distance between the stripes (that distance being the wavelength of the sound). Usually the medium here is a transparent solid, such as fused quartz. This acousto-optic effect, where sound waves scatter light, is used both underwater and in air as a non-perturbing measurement and imaging tool (see Figure 7).

6. Diffraction of different wavelengths.

When sounds from multiple sources meet, mix, and mingle, the result is a three-dimensional pattern of loud and quiet areas called an interference pattern. The quiet areas form at the points where rarefactions from one source meet compressions from another (destructive interference), and the loud ones arise when rarefactions meet rarefactions, or compressions meet compressions (constructive interference) (Figure 8).

Interference is important in stereo sound production and in noise cancellation, and it introduces one more parameter that characterizes a sound wave: its phase, that is, how high or low its pressure is at a particular point in space and time. Phase really only matters when sound waves interact: in the above example,

7. The acousto-optic effect.

8. Constructive and destructive interference.

pairs of sound waves whose compressions coincide with each other (and hence make a loud area) are in phase, while those in which they do not coincide are out of phase. When waves are maximally out of phase, they are said to be in antiphase. Our hearing systems cannot detect phase.

The power of sound

There are several ways in which the amount of sound can be defined and measured, and each way is appropriate for different applications. If hearing or music is the context, sound pressure is the obvious choice since it is the parameter which relates most directly (though none too simply: read on!) to the impression of loudness. But in discussing the efficiency of a sound source one may wish to know how much energy is flowing from it in a second—the sound power. To describe the effects of a particular sound field on an object, the parameter of interest is the sound intensity, which is the amount of sound energy striking 1 square metre of that object each second. Volume is an ill-defined measure used to label audio equipment, but intended to mimic loudness.

Audible-frequency sound waves lose very little energy through absorption by the air through which they pass (around 0.25 dB, or about 6 per cent, per 100 metres, though varying greatly with weather conditions). The main reason sounds die with distance is that they are free to spread out in many directions, so their energies spread progressively more and more thinly to occupy larger and larger volumes. If a sound source is suspended in free air so that its sound can spread in every direction (spherical spreading), then the sound pressure is inversely proportional to the distance of the listener from the source: that is, if the distance from source to measurement point doubles, the sound pressure halves.

The intensity of the sound falls more rapidly than this: it is inversely proportional to the square of the distance. So, if the distance from source to measurement point doubles, the sound intensity falls to one-quarter (1/2²). If the distance is multiplied by 10, the intensity falls to one-hundredth (1/10²). If the sound source is on the ground, the waves spread hemispherically (Box 5): at any given distance the intensity is then twice what spherical spreading would give, though it still falls away with the square of the distance—roughly,

Box 5

Spherical spreading: I = P/4πr²; intensity I, power P, distance from the source r. For hemispherical spreading from a source on a perfectly reflecting ground, I = P/2πr².

unless the ground is a perfect reflector (a marble floor is pretty near), the intensity will fall faster than this, owing to the loss of energy through absorption. The sound power depends only on the source, so it is the same at any distance.
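As a rough sketch of these spreading laws in code (the one-watt source and the distances are arbitrary illustrative values):

```python
import math

def intensity(power_w: float, distance_m: float, hemispherical: bool = False) -> float:
    """Sound intensity in W/m^2 at a given distance, assuming ideal spreading with
    no absorption: I = P / (4*pi*r^2), or P / (2*pi*r^2) over a perfectly
    reflecting ground."""
    area = (2 if hemispherical else 4) * math.pi * distance_m ** 2
    return power_w / area

# Doubling the distance from a 1 W source cuts the intensity to a quarter:
print(intensity(1.0, 10.0))   # ~7.96e-4 W/m^2
print(intensity(1.0, 20.0))   # ~1.99e-4 W/m^2
```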

Pure tones are never found in nature, but are perhaps most closely approximated by the songs of birds. The waveforms of real sounds look very different: the pressure-plots of different sounds of similar fundamental frequency are shown in Figure 9.

The difficult decibel

Sound was one of the first forms of energy to be understood: as far back as 300 bce it was known to be some kind of pattern of physical changes that could travel through air and water, but it was long after that that the most obvious characteristic of sound—its loudness—was quantified in any way. What did turn up took over 2,000 years to arrive, and was not very satisfactory when it did.

By far the most widely used way to quantify the amount of sound is the decibel (dB, Box 6): if two signals differ in sound pressure by 1 dB, the ratio of those pressures is about 1.12:1. (Handily, this happens to be about the smallest difference we can hear under ideal conditions.) A 10 dB difference corresponds to a ratio of about 3:1, and a 100 dB difference to a ratio of 100,000:1.

A decibel is one-tenth of a bel, a name made by combining three letters commonly used in transmission theory (b, e, and l) with a tip of the hat to Alexander Graham Bell. Decibels aren’t units—they are ratios, so they can describe how much more powerful one thing is than another: you could, if you wished, use them to compare the outputs of a pair of heaters. But that would not tell you anything about how hot either of them actually is.

9. Wave shapes.

Box 6

Decibels: the sound power difference in decibels between sounds of powers P1 and P0 is 10 log10(P1/P0). Power is proportional to pressure squared, so the sound pressure difference between sound pressures p1 and p2 is 10 log10(p1²/p2²), which is 20 log10(p1/p2).

To describe the sound of a device in decibels, it is vital to know what you are comparing it with. For airborne sound, the comparison is with a sound that is just hearable (corresponding to a pressure of twenty micropascals). When the amount of sound is given in terms of such a reference level, the word ‘level’ is appended: hence sound pressure level (SPL), for example. So, a sound of 0 dB is ‘one times louder than’ (i.e. equally loud as) a sound you can just about hear, 1 dB is about 1.12 times as loud, 2 dB is 1.26 times, and so on. Are all acousticians happy with this solution? No, they are not. Ultrasound engineers don’t care how much ‘louder than you can just about hear’ their ultrasound is, because no one can hear it in the first place. It’s power they like, and it’s watts they measure it in. Meanwhile, underwater acousticians rightly ask: ‘The threshold of hearing? What does that mean when your ears are full of water and you’ve got a rubber headpiece on? Or if you’re a whale?’ So they base their decibels on a reference pressure of one micropascal, because that’s nice and easy to remember. So now we have two kinds of decibel, one for use in water and one for air, which will give different values for the same sound. Not too much of a problem so long as everyone always remembers to say what the reference level of the decibel they are using is. Sadly however, they don’t.
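The arithmetic of Box 6, combined with the two reference pressures just mentioned, can be sketched as follows (the 0.1 pascal example is an arbitrary value):

```python
import math

P_REF_AIR = 20e-6    # pascals: reference pressure for airborne sound (threshold of hearing)
P_REF_WATER = 1e-6   # pascals: reference pressure used in underwater acoustics

def spl_db(pressure_pa: float, reference_pa: float = P_REF_AIR) -> float:
    """Sound pressure level in decibels relative to the given reference pressure."""
    return 20.0 * math.log10(pressure_pa / reference_pa)

# The same physical pressure gives different decibel figures against the two references:
p = 0.1  # pascals (illustrative value)
print(spl_db(p, P_REF_AIR))    # ~74 dB re 20 micropascals
print(spl_db(p, P_REF_WATER))  # ~100 dB re 1 micropascal
```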

There is another problem. Few of us care how much sound an object produces—what we want to know is how loud it will sound. And that depends on how far away the thing is. This may seem obvious, but it means that we can’t ever say that the SPL of a car horn is 90 dB, only that it has that value at some stated distance. Often, even those handy decibel charts so popular in textbooks get this bit wrong, and claim the SPL of a pneumatic drill is 100 dB, when they really mean ‘100 dB if measured at a distance of 10 (or however many) metres’. It’s not difficult to see where this laziness creeps in—the charts usually also have examples like ‘a quiet office’, and we understand that the chart maker is referring to a quiet office that you are working in, not a quiet office down the corridor or in another town.

There is a third problem: a source of sound might make sound waves at any one frequency, at a handful or at a wide variety of frequencies. Let’s assume for the moment that the source of the sound is a loudspeaker so efficient that it always converts all the electrical energy fed into it into sound. And let’s imagine it has a frequency control but no volume knob. If we measured the total sound energy flowing from that loudspeaker each second (that is, the power) while changing its frequency, that power would of course remain constant. Similarly, the SPL at a particular distance from the loudspeaker would stay the same—as a microphone would show (assuming it were equally sensitive at all frequencies).

However, this is nothing like what your ears would tell you. If the loudspeaker was just audible at 20 Hz, it would increase in loudness as the frequency rose, until at around 4 kHz it would sound (very roughly) 200 times louder. At higher frequencies still, it would get quieter again, finally fading into inaudibility at somewhere between 8 kHz and 20 kHz, depending how old you are and what you’ve been doing to your ears for the past few decades.

In practice, acousticians weight the response of the circuit of which the microphone forms a part, so that the system behaves like the ear—being most sensitive to frequencies at around 4 kHz. A frequency-weighted microphone is the heart of a sound level meter (SLM). There is actually a wide choice of different weightings, including some for dogs, but the most popular by far is A-weighting, which approximates the response of a human ear at moderate sound levels. Hence, the decibels that matter to us are usually A-weighted, written dBA, the full name for which is ‘A-weighted sound pressure level in decibels’.
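The A-weighting curve has a published analytic form; the sketch below uses the commonly quoted version of it, so treat the exact figures as an approximation rather than a calibrated implementation:

```python
import math

def a_weighting_db(f_hz: float) -> float:
    """Approximate A-weighting correction (dB) at frequency f_hz, using the
    commonly quoted analytic form of the curve (0 dB at 1 kHz, strongly
    negative at low frequencies)."""
    f2 = f_hz ** 2
    ra = (12194.0 ** 2 * f2 ** 2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(ra) + 2.0

for f in (50, 1000, 4000, 10000):
    print(f, round(a_weighting_db(f), 1))  # roughly -30, 0, +1, -2.5 dB
```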

SLMs are equipped with a choice of time integration factors. This matters because if a sound lasts less than about 0.1 second it sounds quieter, since the hearing system adds up the energy for about this period.

To add to the complexity, the loudness of a sound also depends on the nature of its source. For instance, people so dislike the sound of planes that, on average, they consider them to be as annoying as anonymous sounds that are about 5 dB louder. Conversely, people are rather fond of train noise, to the extent that they only find it as annoying as anonymous sounds which are about 5 dB quieter. These reactions are so well established that many planning applications which involve aircraft or railway noise adjust their figures by 5 dB (the corrections are known as the aircraft malus and the railway bonus). This means that no meter can actually measure what architects, home owners, noise campaigners, noisy machine buyers, and acousticians really need to know: how loud a sound is.

Considering all of this, there is little point in measuring SPLs with great accuracy: most SLMs are accurate to +/–1.4 dB at 10 kHz (called Class 2 meters). Even in laboratory work, +/–1.1 dB at 10 kHz is almost always enough (as provided by a Class 1 SLM). Far more important than accuracy is adherence to standard measurement procedures, including the frequent calibration of SLMs by comparison with standard measurement microphones.

Despite the complexities of loudness and its variation according to the source and the user, extensive surveys of the reactions of large numbers of individuals to carefully chosen sounds have determined roughly how loudness relates to SPL, and units have been defined on this basis, in particular the phon. Phons are defined as having the same values as the SPLs of 1 kHz tones, so a 1 kHz tone with an SPL of 10 dB has a loudness level of ten phons. But a 50 Hz tone with that same loudness level of ten phons has an SPL of 73 dB, because our ears are so much less sensitive at 50 Hz than 1 kHz that a 50 Hz tone needs to be 63 dB higher than a 1 kHz tone to sound as loud.

Loudness is just one of a large set of psychoacoustic measures, also known as sound quality parameters (‘quality’ being used in the sense of ‘character’ rather than ‘goodness’). Loudness is by far the most commonly used and best developed; the others include sharpness (in acums), roughness (aspers), fluctuation (vacils), and dieselness (which has no units: different automobiles are simply ranked subjectively according to how ‘dieselly’ people think they sound). As the last-named suggests, these measures were developed primarily by the automotive industry in its attempts to make door-clunks, engine sounds, and even indicator noises sound appropriately powerful, masculine, reliable, and so on. In principle it would be very useful if domestic products and other noise sources could be characterized by such parameters.

The topic of sound qualities is part of the discipline of psychoacoustics, the study of the psychological effects of sound, which itself can be considered as an element of what is now known as sound studies. Sound studies deal with how sounds of all kinds have been made and consumed throughout history and in different cultures. Work on such topics has been carried out since the 1940s, and has increased greatly since the early 1990s.

Standing waves

A recurring aim in the history of acoustics is to make sound visible. In the 1780s, Ernst Chladni studied the ways in which metal plates vibrate when they are made to ‘sing’ by being stroked with a violin bow. Fine powder sprinkled on the plates is deflected from areas where vibration is strong, and collects in those that are still. The powder-free, strongly vibrating points correspond to antinodes (like the peaks or troughs in Figure 1), and the stationary, powdery areas are the nodes, the points where there is no pressure change (where the line crosses the axis in Figure 1).

It was possible for Chladni to ‘see’ sound waves in this way only because they did not progress through space: they were stationary, or ‘standing’ waves. For a standing wave, Figure 1 represents only how the pressure of the wave changes with location, and not the way the pressure at a particular spot changes over time (such a time-based diagram for any point in a standing wave would be a horizontal line).

The principle is clearer when considering the standing sound waves that are formed when one blows across the open end of a 12 cm tube which is closed at the other end. In all such waves, the air at the closed end of the tube cannot move, due to friction with the end wall (so this point is a node). The simplest wave of this kind is one in which the motion of the air molecules increases with distance from this end, and reaches a maximum (an antinode) at the open end. In this wave, one-quarter of a wavelength is inside the tube, so it has a wavelength of 4 × 12 = 48 cm. If one blows hard enough, a whole range of other standing waves will form, each with a node at one end of the tube and an antinode at the other, as shown in Figure 10. The wavelengths of these other waves are simple fractions (one-third, one-fifth, and so on) of the first, and such waves are known as harmonics.
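A small sketch of the arithmetic for such a stopped tube (using the approximate 340 m/s velocity from earlier and ignoring the end correction discussed below):

```python
SPEED_OF_SOUND = 340.0  # m/s, approximate value in air

def stopped_pipe_frequencies(length_m: float, count: int = 4, v: float = SPEED_OF_SOUND):
    """Resonant frequencies of a pipe closed at one end and open at the other.
    The fundamental has a wavelength of 4 x length; the other resonances are the
    odd multiples of that fundamental frequency (wavelengths 1/3, 1/5, ... of it)."""
    fundamental = v / (4.0 * length_m)
    return [fundamental * (2 * n + 1) for n in range(count)]

# A 12 cm tube: wavelengths of 48, 16, 9.6, ... cm, i.e. roughly 708, 2125, 3542, ... Hz.
print(stopped_pipe_frequencies(0.12))
```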

10. Standing waves in a pipe open at one end.

Just as in such a pipe, so in any other fluid-filled cavity, or any rigid object, there are certain wavelengths of sound which are particularly easy to excite. These are called resonance modes (or simply resonances), and the main ones can be predicted since they depend only on dimensions. For instance, if a 12 cm rod is fixed at its ends and struck sharply, it will generate 24 cm sound waves, together with waves of lengths 12 cm, 8 cm, 6 cm, 4 cm, and all others which have nodes at the two fixed ends, 12 cm apart—again, a set of harmonics.

A 12 cm long box of air or water will produce all these waves too—in this case, what stops the ‘ends’ of the fluid moving is that they are adjacent to the box walls, where friction prevents free motion. The box will also produce families of waves corresponding to its height, width, and diagonals.

Resonances can be a major problem in room acoustics, but are the basis of most musical instruments. With an instrument whose tubes are open at both ends (like some organ pipes), each open end is an antinode, and hence the fundamental has half the wavelength—and twice the frequency—of that of a tube of the same length which is closed at one end. (Actually, the antinode forms just beyond the pipe’s end, requiring an end correction to be made, see Box 7.)

Box 7
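End correction: the antinode at an open end of a pipe forms slightly beyond the pipe itself, so the pipe behaves as if it were a little longer than its physical length. For a cylindrical pipe of radius r the correction is roughly 0.6r per open end, added to the length when calculating resonant wavelengths.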

Usually, the lowest resonant frequency is the most powerful; however, if a great deal of energy is supplied to an instrument, it may resonate an octave—or even two—higher. A flute, for example, will do this if it is blown hard enough (‘overblown’).

Resonances are all around us—tap a plate, glass, or fork and it will ring, provided only that it is not damped by being held too tightly (tuning forks still resonate if held tightly because they have two identical prongs that move in opposite directions, cancelling each other out at the handle so there is no resultant motion there). This is a handy way of finding whether crockery is cracked: if all is well, each successive millimetre of the plate will move immediately after its neighbouring millimetre, allowing waves to pass, much like a Mexican wave: the plate is, literally, sound. But if even a very fine crack separates adjacent areas, dragging and friction damp the resonance, giving rise to an unhealthy ‘clink’.

If the force supplied to an object is a sound at the resonant frequency, the coupling will be highly efficient, hence guitar strings that sound in sympathy with those struck across the room, or bits of television sets that buzz annoyingly along with dramatic programme sounds.

An effect which is of importance in several areas of acoustics is Helmholtz resonance, familiar to anyone who has blown across the top of a bottle to make it sing. Any hollow object or cavity with an open neck will act as a Helmholtz resonator (Box 8). If a stream of air is blown across the opening, some will enter the neck, increasing the pressure in the cavity a little. This overpressure pushes the air out again—and, just like a pendulum,

this air ‘overshoots’ a bit, leaving a slight underpressure, which sucks more air in, and so on. This regular cycling constitutes a sound wave at a resonant frequency. If a sound wave at this frequency is supplied to the resonator, it will sound very strongly.

Box 8
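Helmholtz resonance: f = (v/2π)√(A/VL); velocity of sound v, cross-sectional area of the neck A, (end-corrected) length of the neck L, volume of the cavity V.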

Charting sounds

Standing waves are a small subset of sound waves: mostly, the regions of high and low pressure in a wave move through space (such waves are called progressive or travelling waves). If one wishes to ‘see’ a travelling wave, one must therefore chart air pressure changes through time. One of the first to attempt this was Alexander Graham Bell, who in 1874 procured an ear from a corpse, impregnated it with oil to keep it flexible, and attached a thin straw to its drum. The other end of the straw was allowed to trace a line on a strip of soot-covered glass which was moved along as the ear was shouted at. This wobbly line was the first recording of a sound wave and the device was called an ear phonautograph. To the relief of those who had to construct them, later versions dispensed with dead ears in favour of metal diaphragms.

Phonautographs were no good for making actual measurements of sound waves, however; these were eventually provided by the cathode ray oscilloscope (CRO), developed in the 1930s. CROs can be set with different time-bases so that a high-frequency sound can be spread out across the screen or a low frequency one compressed, so that their wave shapes can be seen. From this, their wavelengths can be read off and their frequencies determined.

11. Spectrogram.

Today, computerized versions of CROs are widely used. However, a two-dimensional plot can still only display some features of sound. Most sound waves vary rapidly both in frequency content and in pressure, which can only be properly displayed together on a three-dimensional display, called a spectrogram, which cannot be produced without a computer. In a spectrogram, height up the screen usually represents frequency, and brightness or colour represents sound pressure (or intensity). In other cases, a representation of a three-dimensional shape may be made on a screen, the results often resembling mountain ranges (Figure 11).

Unweaving sounds

Being able to see a sound allows one to find out a lot about it qualitatively, and rough measurements can also be made of the screen outputs, but often good quantitative information about sound is needed (perhaps to eliminate noise or improve the design of a musical instrument). For this, a mathematical analysis is required, and the most widely used and fundamental is based on work conducted by Joseph Fourier in the 1800s.

Fourier realized that any periodic function (that is, one which repeats at a steady rate) can be constructed by adding together a series (now called a Fourier series) of sine waves—and he worked out a method to determine what the members (terms) of that series are. (Mathematically speaking a Fourier series is composed of a series of sines and cosines—but a cosine is simply a sine wave which starts at maximum, rather than at zero, so I’ll just refer to sine waves here.) As Figure 12 shows, as few as three sine waves can roughly approximate a square wave.

To make the sides of the latter more vertical, higher-frequency tones must be added. A square wave rises and falls abruptly in level, and Fourier analysis shows that any such sudden change—a click, for instance—must include some very high-frequency components.
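A sketch of this kind of summation (three terms, as in Figure 12; adding more odd harmonics makes the sides steeper):

```python
import numpy as np

def square_wave_approx(t: np.ndarray, freq_hz: float, n_terms: int = 3) -> np.ndarray:
    """Partial Fourier series of a square wave: the sum of odd harmonics
    (f, 3f, 5f, ...) with amplitudes 1, 1/3, 1/5, ..."""
    result = np.zeros_like(t)
    for k in range(n_terms):
        n = 2 * k + 1
        result += np.sin(2 * np.pi * n * freq_hz * t) / n
    return result

t = np.linspace(0.0, 0.01, 1000)            # 10 ms of signal
rough = square_wave_approx(t, 440.0, 3)     # three terms: a visibly wobbly 'square'
sharper = square_wave_approx(t, 440.0, 50)  # fifty terms: much more vertical sides
```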

12. Summing sine waves to approximate a square wave.

Fourier’s original work was only applicable to periodic waves, but a development of it known as the Fourier transform can handle non-periodic ones. A highly efficient mathematical method of calculating the component sine waves of a signal is known as a Fast Fourier Transform (FFT). When adding waves like this, their phase must be taken into account. During a single wavelength the sound pressure of a wave rises from zero (that is, equal to the ambient air pressure) to maximum, then falls down to minimum, and then rises to zero again. This is similar to the vertical motion of a dot painted on the edge of a rotating wheel, so phase can be described in terms of angles: starting at 0° the wave rises to a maximum at a phase of 90°, falls to zero at 180°, down to a minimum at 270°, and back to zero at 360° (which is the same as 0°).
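With a numerical library to hand, this analysis is essentially a one-liner; a minimal sketch using NumPy (the 440 Hz and 1,000 Hz test tones are arbitrary choices):

```python
import numpy as np

sample_rate = 8000                       # samples per second
t = np.arange(0, 1.0, 1.0 / sample_rate)
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

spectrum = np.fft.rfft(signal)           # FFT of the (real-valued) signal
freqs = np.fft.rfftfreq(len(signal), 1.0 / sample_rate)
magnitudes = np.abs(spectrum)            # the phase of each component is np.angle(spectrum)

# The two largest peaks sit at the frequencies of the component sine waves:
peaks = freqs[np.argsort(magnitudes)[-2:]]
print(sorted(peaks))                     # [440.0, 1000.0]
```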

All real sounds change over time, so the conversion into sine waves must be repeated frequently. Such time-varying frequency analyses of sounds have many applications: for instance, some of the parameters of the sound waves which compose an individual’s voice are unique to that person. Hence, such parameters can be used as acoustic ‘fingerprints’ (called voice prints), and automatically recognized by a machine.

Conversely, since every word has a unique sound (except for homophones like ‘sew’ and ‘so’), it is in principle possible for a machine to recognize them automatically, whoever speaks them. While the same word will be said differently by different speakers there are certain elements which vary only slightly, or predictably—hence our ability to recognize a word irrespective (within limits) of its speaker.

Automatic speech recognition is a long way from perfection however, and the main problem lies in deciding where one word ends and the next begins. To see how tricky this is, try listening to yourself saying ‘bread and butter’. You will probably actually hear something like ‘brembudder’ with no silences at all (and saying the phrase ‘properly’ neither feels nor sounds natural). The reason that humans can identify words so readily is that the sound patterns we hear are only one piece of evidence as to what is being said—as Chapter 4 will explain.

Sounds from nowhere

Since any sound can be analysed into sine waves, it follows that any sound can be synthesized from them: synthesizers that generate speech in this way have been available for many years, and work far better than recognizers. In practice though, it is often far easier to generate speech by adding together fragments of pre-recorded or pre-generated sounds—a technique known as voice coding.

Today’s electronic systems can synthesize practically any sound at all, whether it occurs in nature or not—like the weird Shepard tone, which is produced by combining tones which fall in pitch but then fade out, while other, higher tones, fade in and themselves begin to fall. The impression is of a sound which continually falls—and yet gets no lower.

Usually, however, one does not want new sounds but improved versions of existing ones—a musical performance shorn of noise, for example. The selection of pre-recorded elements is also commonly used for non-speech sounds: one of the most celebrated electronic gadgets for the budding pop music producer in the 1960s was the Mellotron (1963)—a machine loaded with a library of short pieces of sound recorded on to magnetic tape, any of which could be quickly selected and played at a chosen frequency and volume.

Selecting sounds: filters

The commonest and easiest way to modify sound is through filtering: the removal or reduction of selected frequency ranges, accomplished either by electronic circuits or by software. High-pass filters cut out low frequencies, low-pass ones remove high frequencies, and band-pass filters banish both, passing only the range in between. A once familiar kind of variable filter is the graphic equalizer, a series of about seven slider bars on a hi-fi’s amplifier, which allows selected preset frequency ranges to be suppressed. The simpler ‘tone’ control similarly quietens either high (‘treble’) or low (‘bass’) frequencies.
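A minimal sketch of such a filter in software, assuming the SciPy signal-processing library is available (the cut-off frequencies and test tones are arbitrary choices):

```python
import numpy as np
from scipy.signal import butter, lfilter

def band_pass(signal: np.ndarray, low_hz: float, high_hz: float, sample_rate: float) -> np.ndarray:
    """Pass only the frequencies between low_hz and high_hz, suppressing the rest,
    using a fourth-order Butterworth filter."""
    b, a = butter(4, [low_hz, high_hz], btype="bandpass", fs=sample_rate)
    return lfilter(b, a, signal)

# Keep roughly the mid-range of a test signal containing 100 Hz, 1 kHz, and 8 kHz tones:
sample_rate = 44100
t = np.arange(0, 1.0, 1.0 / sample_rate)
mixture = (np.sin(2 * np.pi * 100 * t)
           + np.sin(2 * np.pi * 1000 * t)
           + np.sin(2 * np.pi * 8000 * t))
mid_only = band_pass(mixture, 300.0, 3000.0, sample_rate)
```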

A vast range of other facilities is also to be found in the computerized toolbox of the sound artist or engineer: such software can add reverberation or echo to a recording, create an artificial soundscape, or apply real-time changes such as shifting the frequencies of the sounds of a pop song before passing them to a loudspeaker. This is the basis of a karaoke system, in which the notes of songs can be sharpened or flattened to match those that the user finds easiest to sing.