Chapter 5 Electronic sound

Sound to electricity: microphones

As invented by Charles Wheatstone in the 1820s, the microphone was a purely acoustical device made of two metal plates clamped to the ears by a springy rod and a length of dressmaker’s ribbon. To use it, one pressed one’s head against the sound source (Wheatstone helpfully suggested a boiling kettle) which, with a bit of luck, could be heard more clearly. To no one’s surprise but Wheatstone’s, it didn’t catch on.

The ancestor of today’s devices—which convert sounds to electricity rather than simply directing them to the ear as Wheatstone’s version did—was the carbon microphone, invented by David Hughes in the 1870s. In this, a thin metal plate (the diaphragm) compresses a vessel filled with carbon granules through which a current is passed, and the compressive force alters the electrical resistance. Although their performance was very poor, carbon microphones were used in telephones for decades.

Although there have been many microphone designs over the years, and several specialist types are available today, only three major ones are now in common use: the dynamic or moving coil microphone, the condenser microphone, and the piezoelectric microphone.

18. Dynamic microphone.

In a dynamic microphone (Figure 18) the diaphragm is attached to a coil of wire that surrounds a stationary magnet. A small voltage is induced in the coil when it moves, which in turn gives rise to a weak current. The low voltage means that the quality of the response is too poor for measurement work, so dynamic microphones are mainly to be found at concerts and in recording studios.

In a condenser microphone (Figure 19), the diaphragm forms one plate of a capacitor (formerly called a condenser, hence the name). A capacitor is a pair of parallel metal plates separated by a thin layer of air or of some other material which does not conduct electricity (known as a dielectric). Attaching one plate to the negative pole of a battery charges it with electrons. All metals contain free electrons, and those in the other plate are repelled by the electrical field arising from the large number now residing in the negative one. The repelled electrons flow from that plate, leaving it positively charged. The whole capacitor therefore now has a voltage (called a polarizing voltage) across it, and the microphone is ready for use: the diaphragm plate moves in and out with the compressions and rarefactions of the sound waves that impinge on it. A large electrical resistance stops the charge from rapidly escaping, so instead the sound waves are transformed into patterns of voltage ripples. Condenser microphones have an excellent frequency response and are used as measurement microphones in laboratories and in SLMs. They also respond more rapidly to sudden sounds than do dynamic microphones.

19. Condenser microphone.
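The voltage ripple follows from the capacitor equations: the capacitance of parallel plates is C = εA/d, and with the charge Q trapped by the large resistance, the voltage V = Q/C rises and falls in step with the plate separation d. A minimal sketch in Python (the dimensions and polarizing voltage are illustrative, not those of any real microphone):

```python
EPS0 = 8.854e-12   # permittivity of free space, F/m

def condenser_output(d0, area, v_polar, displacement):
    """Voltage across a charged capacitor when one plate moves.

    d0           -- resting plate separation (m)
    area         -- plate area (m^2)
    v_polar      -- polarizing voltage (V)
    displacement -- diaphragm movement; positive = plates further apart (m)
    """
    c0 = EPS0 * area / d0                   # resting capacitance
    q = c0 * v_polar                        # charge trapped by the resistance
    c = EPS0 * area / (d0 + displacement)   # capacitance while displaced
    return q / c                            # instantaneous output voltage

# A 1 per cent change in separation gives a 1 per cent voltage ripple:
for x in (-0.2e-6, 0.0, 0.2e-6):            # diaphragm excursion in metres
    print(f"{x * 1e6:+.1f} um -> {condenser_output(20e-6, 1e-4, 48.0, x):.2f} V")
```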

Crystal microphones and ceramic microphones exploit the piezoelectric effect, in which quartz or some other crystalline material produces a voltage when slightly compressed. Most landline telephones use these, as do call-centre headsets.

Other types of microphone, of historical or specialist interest, include the following:

Electret microphones are made of permanently charged material (the electrical equivalent of a magnet). They function in a similar way to condenser microphones.

MEMS (microelectromechanical systems) microphones are condenser microphones etched directly on to silicon chips. Just a few square millimetres in area, their cheapness and robustness mean they are used in mobile phones, among many other applications.

Optical microphones use a shiny silicon membrane as a diaphragm, which reflects the light from a light-emitting diode (LED). A photodetector measures changes in the light when the membrane is vibrated by sound waves, and an electronic circuit converts these changes into an electrical signal. Such microphones are compact and robust and are unaffected by local electromagnetic fields, so they are used, for example, to allow communication between patient and staff during MRI scans.

Pressure zone microphones (PZMs) are designed for use near a hard reflective surface (such as when placement directly on a stage floor is required). With a conventional microphone, reflections from the surface would interfere with the sounds arriving directly at the microphone, but a PZM overcomes this by having its diaphragm so close to the surface that the direct and reflected waves arrive essentially in phase and reinforce each other.

Ribbon microphones use a strip (‘ribbon’) of metal rather than a diaphragm, and unlike the eardrum and most other kinds of microphone, they respond to the pressure difference between the ribbon’s two sides, which means that they detect the velocities with which the air molecules in the sound wave move, rather than the sound pressure patterns. Such microphones are often used by commentators in noisy surroundings, because diffuse sound arriving from all directions presses almost equally on both sides of the ribbon, so the microphone barely responds to it. For this application, the microphones have a projection that is held to the upper lip, which helps guide the commentator’s voice so that it impinges on one side of the ribbon only. An alternative for capturing speech in noisy environments is the lavalier microphone, a small clip-on electret or dynamic microphone. Lavaliers have the advantage that they can readily be concealed under clothing.

A sound intensity probe consists of a pair of face-to-face microphones which measure the pressures at two closely spaced points of the same sound wave. From these the molecular velocity, and hence the sound intensity, is calculated.
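In outline, the calculation runs as follows (a sketch of the standard two-microphone method, not of any particular instrument): the pressure midway between the microphones is taken as the average of the two readings, the molecular velocity follows from the pressure difference via Newton’s second law applied to the air between them, and the intensity is the time average of pressure times velocity.

```python
import math

RHO = 1.2   # density of air, kg/m^3

def sound_intensity(p_a, p_b, spacing, dt):
    """Two-microphone sound intensity estimate.

    p_a, p_b -- equal-length lists of pressure samples (Pa)
    spacing  -- distance between the microphones (m)
    dt       -- time between samples (s)
    """
    u = 0.0          # molecular (particle) velocity, built up by integration
    total = 0.0
    for pa, pb in zip(p_a, p_b):
        p = 0.5 * (pa + pb)              # pressure at the midpoint
        grad = (pb - pa) / spacing       # pressure gradient along the probe
        u += -(grad / RHO) * dt          # Newton: rho * du/dt = -gradient
        total += p * u
    return total / len(p_a)              # time-averaged intensity, W/m^2

# Test with a synthetic 1 kHz plane wave travelling from mic A to mic B:
f, c, dt = 1000.0, 343.0, 1e-5
t = [i * dt for i in range(5000)]
pa = [math.sin(2 * math.pi * f * ti) for ti in t]
pb = [math.sin(2 * math.pi * f * (ti - 0.01 / c)) for ti in t]  # 10 mm on
print(f"{sound_intensity(pa, pb, 0.01, dt):.5f} W/m^2")  # ~0.00121
```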

An important criterion in choosing a microphone is directionality: an ideal omnidirectional microphone is equally sensitive to sound from any direction, and is required to capture a full soundscape, for example. A unidirectional microphone, which picks up sound from one direction only, is ideal for picking up speech or song in noisy environments.

Except for the ribbon microphone and sound intensity probe, every type is naturally fairly omnidirectional in response, provided that the wavelengths of the sound waves that impinge on it are larger than the diaphragm. To make a microphone sensitive mainly to sound from immediately in front or behind (bidirectional), all that is needed is to allow the back as well as the front to be open to the air. The diaphragm will not move very much in response to sound waves arriving from the sides, because their rise and fall of pressure at the front is very similar to that at the back. But any waves incident on the front or back face of the microphone will be picked up readily.

A simple way to achieve directionality is to mount a microphone at the focus of a reflector whose cross section is a parabola. This shape reflects incident sound waves on to the diaphragm, as long as their wavelengths are smaller than the reflector. Alternatively, a microphone can be mounted at the end of a tube with slits along its sides, to make a shotgun (boom) microphone. Sound waves travelling straight down the tube reach the microphone unimpeded, but those from other directions enter through the slits. Each such sound enters through multiple slits, so several copies of it are formed, each with a different phase, and these copies largely cancel one another out through destructive interference. Shotgun microphones are widely used for outdoor recording, often accompanying cameras. Their directionality is, however, highly frequency dependent.
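The cancellation can be shown numerically (a toy model: the slit spacing, slit count, and test frequency below are invented for illustration). Copies of an off-axis wave entering different slits travel different distances to the diaphragm, so their phases disagree and their sum is small, while an on-axis wave arrives in phase through every slit:

```python
import math

def summed_amplitude(angle_deg, n_slits=20, spacing=0.01, freq=4000.0, c=343.0):
    """Sum the copies of a wave entering each slit of an interference tube.

    For sound arriving at angle_deg off axis, the extra path from slit n
    to the diaphragm is n * spacing * (1 - cos(angle)), giving each copy
    a different phase. On axis (0 degrees) all copies are in phase.
    """
    k = 2 * math.pi * freq / c           # wavenumber
    angle = math.radians(angle_deg)
    re = im = 0.0
    for n in range(n_slits):             # add the copies as phasors
        phase = k * n * spacing * (1 - math.cos(angle))
        re += math.cos(phase)
        im += math.sin(phase)
    return math.hypot(re, im) / n_slits  # 1.0 = no cancellation at all

for a in (0, 30, 60, 90):
    print(f"{a:2d} degrees off axis: relative response {summed_amplitude(a):.2f}")
```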

Electricity to sound: loudspeakers

Loudspeakers are microphones in reverse: if a dynamic, condenser, crystal, or ceramic microphone is supplied with a varying current, it will vibrate to produce sound waves (such microphones are therefore referred to as reciprocal transducers). Most actual loudspeakers are dynamic microphones in reverse, and are called moving coil loudspeakers (Figure 20). As the name implies, an electrical signal is fed to a coil (the voice coil) attached to a diaphragm which is usually in the form of a cone. The coil surrounds a magnet and the electromagnetic field set up in the coil by the signal causes it, and the diaphragm, to move.

20. Loudspeaker.

Loudspeakers are intrinsically inefficient devices: most of the electrical energy that passes into them appears as heat, with only about 1 per cent emerging as sound. Hence, amplification is vital. Transistors make amplification a simple matter today; the main challenge is to ensure that each frequency is amplified to an appropriate extent, which, given the non-linear attributes of the hearing system, means that different degrees of amplification must be applied to different frequencies if the pitch envelope of the output sound is to remain unchanged. Loudspeakers must also be deployed with care. Quite apart from the deleterious effects on hearing, they are easily damaged, especially by artificially produced sounds, which may have an extremely fast rise time. Also, if a microphone picks up the sound of the loudspeaker to which it is sending its output, a positive feedback loop is set up, resulting in an all too familiar squeal.
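The condition for the squeal is simply stated: if the fraction of the loudspeaker’s output that reaches the microphone, multiplied by the amplifier’s gain, exceeds one, the signal grows on every trip round the loop until the amplifier can amplify no more. A sketch, with invented loop gains:

```python
def feedback_level(loop_gain, trips=50, start=1.0, clip=1000.0):
    """Signal level after repeated trips round a mic -> amp -> speaker loop.

    loop_gain -- amplifier gain times the fraction of the loudspeaker's
                 output that is picked up by the microphone
    clip      -- level at which the amplifier saturates
    """
    level = start
    for _ in range(trips):
        level = min(level * loop_gain, clip)
    return level

for g in (0.8, 1.0, 1.2):
    print(f"loop gain {g}: level after 50 trips = {feedback_level(g):g}")
# gain < 1: the signal dies away; gain > 1: it grows until the amplifier
# clips, and the loop settles into a sustained squeal
```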

Loudspeakers are the least hi-fi link in the music production chain: though simple in principle, their design presents many practical challenges. The voice coil must return to exactly its start position when the signal falls to zero, with no oscillation beyond it, and yet must be free to move. The cone must hold its shape while it vibrates, be very light, yet rigid enough not to sag under gravity; large enough to move large volumes of air at low frequencies (where only powerful waves are audible), yet small enough that it can move back and forth over 10,000 times a second at high frequencies. Also, the casing must not resonate at any frequency.

In practice, it is much easier to use speakers in groups, usually all in the same unit—a small tweeter for frequencies above 2 kHz, a larger mid-range speaker (50 Hz to 5 kHz), and a woofer (30 Hz to 800 Hz). For those who love lots of bass, there may also be a subwoofer (20 Hz to 200 Hz).

Subwoofers are usually active loudspeakers, which means they contain their own amplifiers (and hence need their own power supply). Most other speakers are passive, driven by a signal which has already been boosted by an amplifier within the hi-fi (or other) system.

A loudspeaker without a case is almost silent, for the simple reason that the high-pressure pulses generated at its front simply slip round to the back to fill the low-pressure area that has just formed there. Hence, loudspeakers may be mounted in an airtight box. If the box is small, however, the air in it resists being compressed as the diaphragm moves in. An alternative solution is to place the diaphragm in the centre of an annulus called a baffle. The baffle must be large enough that, by the time the pressure pulse has travelled round it to the back of the diaphragm, the low-pressure area there has gone (or, to put it another way, the distance the sound waves must travel must be longer than a quarter of a wavelength at the lowest frequency of interest).
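That quarter-wavelength condition fixes the minimum baffle size directly, since a quarter wavelength is c/4f. A quick calculation (the frequencies are chosen for illustration):

```python
SPEED_OF_SOUND = 343.0   # m/s in air at room temperature

def min_baffle_path(lowest_freq):
    """Minimum front-to-back path round a baffle: a quarter wavelength."""
    return SPEED_OF_SOUND / (4.0 * lowest_freq)

for f in (30.0, 100.0, 1000.0):
    print(f"{f:6.0f} Hz -> path of at least {min_baffle_path(f):.2f} m")
# 30 Hz needs nearly 3 m: why deep bass demands a big baffle or a box
```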

Helmholtz resonance can be used to extend the low-frequency performance of a loudspeaker. A hole (port) is made in the front of the box in which the speaker is mounted, and the cavity then resonates at low frequencies. If the resonant frequency of the box lies below that of the speaker, a pressure pulse produced when the speaker diaphragm moves back (the backwave) will make its way within the box to emerge at the port, where it will be in phase with a new pulse just produced by the front of the diaphragm. These two in-phase pulses will reinforce each other. This applies to pulses which make up waves of any frequency that falls between the resonance frequency of the diaphragm and that of the box.
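The port’s resonant frequency follows from the standard Helmholtz resonator formula, f = (c/2π)·√(A/VL), where A is the port’s cross-sectional area, L its effective length, and V the volume of the box. A sketch with invented cabinet dimensions:

```python
import math

SPEED_OF_SOUND = 343.0   # m/s

def helmholtz_freq(port_area, port_length, box_volume):
    """Resonant frequency of a ported loudspeaker box.

    port_area   -- cross-sectional area of the port (m^2)
    port_length -- effective length of the port, including end
                   corrections (m)
    box_volume  -- internal volume of the box (m^3)
    """
    return (SPEED_OF_SOUND / (2 * math.pi)) * math.sqrt(
        port_area / (box_volume * port_length))

# A 40-litre box with a 7 cm diameter, 15 cm long port (invented values):
area = math.pi * 0.035 ** 2
print(f"port resonance ~ {helmholtz_freq(area, 0.15, 0.040):.0f} Hz")  # ~44 Hz
```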

One disadvantage is that the puffs of air from the port can sometimes be heard, and another is that sounds are less crisp because each signal is followed by a short fading ‘tail’ of resonances. Also, pulses forming sound waves with frequencies below that of the port are cancelled out by the subsequent wave from the diaphragm. Frequencies above both resonances are neither enhanced nor reduced.

Our brains are so good at filling in gaps in sounds that we can exploit this to make do with quite basic loudspeakers. In nature, a set of tones of 200 Hz, 300 Hz, 400 Hz, and 500 Hz will almost always be harmonics (overtones) of a fundamental at 100 Hz. Because the brain’s hearing centre (see Figure 16) ‘knows’ this, it will confidently decide that the 100 Hz is actually there. But, if the set of tones is coming from a small loudspeaker, it’s likely that there will not in fact be a 100 Hz tone present. This effect is known as the missing fundamental, and is why the lack of low frequencies in the Epidaurus theatre (Chapter 1) did not sound strange: the hearing centres of the audience interpolated them. It also explains the relative clarity of the earliest telephones, which did not transmit low frequencies well.
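The brain’s inference can be mimicked with a line of arithmetic: the implied fundamental of a set of harmonics is their greatest common divisor. For example:

```python
from functools import reduce
from math import gcd

def implied_fundamental(harmonics_hz):
    """The fundamental implied by a set of harmonics is their greatest
    common divisor, since each harmonic is a whole-number multiple
    of the fundamental."""
    return reduce(gcd, harmonics_hz)

print(implied_fundamental([200, 300, 400, 500]))   # -> 100
# The hearing centre 'hears' 100 Hz even when no 100 Hz tone is present.
```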

The introduction of effective microphones, amplifiers, and loudspeakers revolutionized the record industry—and this revolution could be rapid since the records themselves did not need to be changed. Social changes soon followed: quite suddenly, almost anyone could access music, and select what they listened to as well.

According to sound studies expert Jonathan Sterne, such new sound media ‘called into question the very basis of experience and existence’. And the effects on performers were profound: as music historian Robert Philip points out, many were aghast at the number of errors they heard in their recorded performances. Musicologist Mark Katz suggests that such performers became trapped in a ‘feedback loop’ in which they attempted to produce more and more ‘perfect’ performances, only to be disappointed all over again when they listened to recordings of the result. As a result, performance became less individualized, more standardized, and lost spontaneity. The experience of listening to recordings caused another kind of feedback too: violin vibrato, for example, was originally a phonographic effect, but was soon being imitated by performers.

The next revolution in this field was the invention of stereophonic phonograph recording in 1933, achieved by recording the two channels on the two walls of the groove, at 90° to each other and 45° to the vertical. The introduction of stereo records meant that in principle the whole three-dimensional sound field of the original performance could be recreated, giving rise to questions of ideal speaker placement which have fascinated music buffs to this day. It also gave rise to the concept of ‘fidelity’, since what one was now doing was recreating an original performance. That remained true only for a while: with advancing post-processing and mixing techniques, by the 1970s many pop pieces were never performed as such; much of their content was added after the band had gone home. For classical music, however, faithful recording has remained key. Nevertheless, despite decades of interest from many millions of amateur and professional listeners, recorders, performers, and players, ‘fidelity’ is still unquantifiable.

Storing sounds

Once the technology of amplification and loudspeaker design had been perfected, the main concern was the fragility of records. A whole culture developed concerning how to handle, house, clean, and—once correctly trained—play them. The auto-changer came as a great relief to some, though others regarded it as either a spoilsport or an incipient record-damager. Partly because of this mystique and partly because of the high quality of some of their covers, vinyl records, and especially LPs (later called albums), were venerated as objects in a way that no other recording media have ever been, and a few are still sold to this day.

In 1964 analogue cassette tapes were introduced as a robust and compact alternative to discs, and to begin with were very popular: although recorded ones were readily available, many preferred to buy a vinyl record and (illegally!) transfer it to cassette to listen to while keeping the record pristine. Radio broadcasts could also be recorded in this way and the music centre, comprising radio, cassette recorder, and record player, became popular since it allowed cassettes to be recorded with a minimum of fuss.

There are two major disadvantages with tape, however: track selection requires time-consuming winding, and high-frequency hiss is unavoidable. The latter was reduced somewhat by the many variant Dolby systems on offer, all of which work by recording a version of the track with the high frequencies boosted, and then suppressing them on playback. This is an example of a technique called companding (a portmanteau of compressing and expanding).
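The arithmetic of companding is easily sketched (a toy model with a fixed boost; real Dolby systems adapt the boost to the signal level): the high frequencies are scaled up before recording and scaled back down by the same factor on playback, so any hiss added in between is scaled down along with them.

```python
def boost_highs(high_band, factor):
    """Pre-emphasis: scale the high-frequency part of the signal up."""
    return [x * factor for x in high_band]

def cut_highs(high_band, factor):
    """De-emphasis: scale the same band back down on playback."""
    return [x / factor for x in high_band]

quiet_highs = [0.10, -0.08, 0.12]          # high-frequency signal content
recorded = boost_highs(quiet_highs, 10.0)  # boosted on to the tape
hiss = [0.02, 0.02, 0.02]                  # hiss added during storage
played = cut_highs([r + h for r, h in zip(recorded, hiss)], 10.0)

print([round(x, 3) for x in played])   # [0.102, -0.078, 0.122]
# Close to the original signal: the 0.02 hiss has been cut tenfold to 0.002.
```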

Just as stereo spawned new interests and behaviours, so did tape. Mix tapes were one outcome, and another was the Sony Walkman, which at last allowed music lovers to listen to their music wherever they happened to be, while annoying only the people sitting next to them.

But what was needed was to do away entirely with analogue recordings, that is, those in which a sound is stored as a continuously varying pattern (whether physical, as in a vinyl record, or magnetic, as in a cassette tape). In a digital system, the recorded signal is encoded as a string of numbers, and once in that form it can be stored, transmitted, or copied with neither degradation of the signal nor increase in background noise.

At first glance, it may seem that in order to capture the intricacies of a complicated sound wave (like those in Figure 9), the amplitudes of a great many points of that wave must be measured and coded. In fact, the sound need only be sampled at twice the rate of the highest-frequency component that one wishes to preserve. So, to encode all frequencies in a signal up to 8 kHz, one need sample at only 16 kHz (this is known as the Nyquist theorem). If one samples at a lower rate, the encoded data become distorted, an effect known as aliasing.
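A short numerical experiment makes this concrete (a sketch with arbitrarily chosen frequencies): sampled at 16 kHz, a 3 kHz tone is captured faithfully, but a 9 kHz tone, which lies above the 8 kHz Nyquist limit, produces exactly the same samples as a 7 kHz tone. It has been aliased.

```python
import math

def sample_tone(freq, sample_rate, n=8):
    """Sample a cosine wave of `freq` Hz at `sample_rate` Hz."""
    return [round(math.cos(2 * math.pi * freq * i / sample_rate), 6)
            for i in range(n)]

fs = 16_000   # sampling rate; the Nyquist limit is fs / 2 = 8 kHz

print(sample_tone(3_000, fs))   # below the limit: captured faithfully
print(sample_tone(9_000, fs))   # above the limit...
print(sample_tone(7_000, fs))   # ...same samples: 9 kHz has aliased to 7 kHz
```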

With the introduction of the compact disc (CD) in 1982, a wholesale shift from analogue to digital began. On a CD, digitally encoded signals are stored as patterns of dark pits in a shiny metal layer, which are scanned by a laser. The laser reflects from smooth areas but not from the pits; the CD player interprets the reflections as 1s and the non-reflections as 0s, and the strings of 1s and 0s encode the audio information as a sequence of binary numbers. Originally, CDs were claimed to have the impressive though not especially useful capability to play even if coated with marmalade. The claimants neglected to mention that the marmalade had to be applied to the upper side of the CD, as the coding is on the underside.

Nowadays of course, music is routinely bought, stored, and played without using physical media—audio files can simply be downloaded to a computer and played through a wide range of output devices. Often, the computer is part of an MP3 player (MP3 is short for ‘MPEG-1 or MPEG-2 Audio Layer III’, MPEG standing for Moving Picture Experts Group).

The magic of MP3 audio files is how small they are: about one-tenth the size of equivalent CD files, which means a minute of MP3 music can be squeezed into a megabyte. This impressive reduction is achieved partly by a technique called Huffman coding, in which the symbols that appear most often are coded in the shortest possible way, and partly by coding more fully those frequency bands which people will notice most if they are disrupted (mainly speech frequencies), and providing only sketchy versions of those frequencies of less concern to us.
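A minimal Huffman coder shows the first of these ideas (a textbook sketch, far simpler than the coder inside a real MP3 encoder): the most frequent symbols receive the shortest bit strings.

```python
import heapq
from collections import Counter

def huffman_codes(symbols):
    """Build a Huffman code for a sequence of symbols.

    Repeatedly merges the two rarest subtrees, prefixing '0' to the
    codes in one and '1' to the codes in the other, so frequent
    symbols end up with short codes. Returns {symbol: bitstring}.
    """
    counts = Counter(symbols)
    if len(counts) == 1:                  # degenerate one-symbol input
        return {s: "0" for s in counts}
    # Heap entries: (frequency, tie-breaker, {symbol: code-so-far})
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)   # the two rarest subtrees...
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (n1 + n2, tiebreak, merged))  # ...merged
        tiebreak += 1
    return heap[0][2]

print(huffman_codes("aaaaaabbbcd"))
# 'a' (common) gets a 1-bit code; 'c' and 'd' (rare) get 3 bits each
```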

Because MP3 players take into account both the music and the listener in deciding what information to leave out when the song is compressed, Sterne has concluded that:

the MP3 carries within it practical and philosophical understandings of what it means to communicate, what it means to listen or speak, how the mind’s ear works, and what it means to make music. Encoded in every MP3 are whole worlds of possible and impossible sound and whole histories of sonic practices . . . . MP3 encoders build their files by calculating a moment-to-moment relationship between the changing contents of a recording and the gaps and absences of an imagined listener at the other end. The MP3 encoder works so well because it guesses that its imagined auditor is an imperfect listener, in less-than-ideal conditions. It often guesses right.

Efficient coding of music exploits the fact that, over billions of years, our hearing systems have evolved to respond to the sounds which are of most relevance to us. This, combined with limitations due to the nature of sound, restricts our immediate access to the world of sound to a frequency range which is only a thin slice of what is actually out there. Those unhearable realms are the subject of Chapter 6.