Auditory Perception and Sound Models Cecilia R Aragon

Auditory Perception and Sound Models Cecilia R. Aragon IEOR 170 UC Berkeley Spring 2006

Acknowledgments
• “How the Ear Functions,” http://www.archive.org/details/HowtheEa1940
• Brian Bailey, http://www-faculty.cs.uiuc.edu/~bpbailey/teaching/2006-Spring/cs414/index.html
• Dan Russell, http://www.kettering.edu/~drussell/demos.html
• James Hillenbrand, http://homepages.wmich.edu/~hillenbr/AuditoryPerception.ppt
• Lawrence Rosenblum, http://www.faculty.ucr.edu/~rosenblu/labindex.html (McGurk effect)
• Andrew Green, http://www.uwm.edu/~ag/teach_pdf/lecturenotes/perception/

Outline
• How the Ear Functions
• Physical Dimensions of Sound
• Perceptual Dimensions of Sound
• Intensity and the Decibel Scale
• Pitch Perception
• Loudness Perception
• Timbre Perception
• Digitization of Sound

How the Ear Functions
http://www.archive.org/details/HowtheEa1940

Physical Dimensions of Sound

Waves
• Periodic disturbances that travel through a medium (e.g., air or water)
• Transport energy
• “What is a Wave?” Dan Russell, http://www.kettering.edu/~drussell/Demos/waves-intro.html

Sound
• A longitudinal, mechanical wave
  – caused by a vibrating source
• Packs molecules at different densities
  – causes small changes in pressure
• Model pressure differences as sine waves

Sound Waves
• Pure tones: simple waves
• Harmonics: complex waves consisting of combinations of pure tones (Fourier analysis)
  – the quality of a tone, or its timbre (i.e., the difference between a given note on a trumpet and the same note on a violin), is given by the harmonics
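The idea that a complex wave is a combination of pure tones can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `complex_tone`, the 220 Hz fundamental, and the two amplitude lists are made up for the example and do not come from the slides.

```python
import math

def complex_tone(f0, harmonic_amplitudes, t):
    """Pressure at time t (seconds) for a tone built from harmonics of f0 (Hz).

    harmonic_amplitudes[0] scales the fundamental (1 * f0),
    harmonic_amplitudes[1] scales the second harmonic (2 * f0), and so on.
    """
    return sum(
        amp * math.sin(2 * math.pi * (k + 1) * f0 * t)
        for k, amp in enumerate(harmonic_amplitudes)
    )

# Hypothetical amplitude patterns: same fundamental, different harmonic mix.
f0 = 220.0
bright = [1.0, 0.8, 0.6, 0.4]   # strong upper harmonics
mellow = [1.0, 0.2, 0.05, 0.0]  # weak upper harmonics

t = 0.0013
print(complex_tone(f0, bright, t), complex_tone(f0, mellow, t))
```

Both waveforms repeat every 1/f0 seconds, so they share a fundamental frequency, but their shapes differ because the harmonic amplitudes differ.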

Changes in Air Pressure

Process of Hearing (Transduction)

Frequency (temporal) Theory
• Periodic stimulation of membrane matches frequency of sound
  – one electrical impulse at every peak
  – maps time differences of pulses to pitch
• Firing rate of neurons is far below the frequencies that a person can hear
  – Volley theory: groups of neurons fire in a well-coordinated sequence

Place Theory
• Waves move down basilar membrane
  – stimulation increases, peaks, and quickly tapers
  – location of peak depends on frequency of the sound, lower frequencies being further away

Physical Dimensions of Sound
Sound is repetitive changes in air pressure over time
• Amplitude
  – height of a cycle
  – relates to loudness
• Wavelength (w)
  – distance between peaks
• Frequency (λ)
  – cycles per second
  – relates to pitch
  – λ w = velocity
• Most sounds mix many frequencies & amplitudes
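The relation frequency × wavelength = velocity can be turned around to estimate wavelengths. The sketch below assumes a speed of sound of 343 m/s (roughly air at room temperature); that value and the function name are assumptions for illustration, not figures from the slides. It uses plain names instead of the slide's symbols (λ for frequency, w for wavelength).

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: air at about 20 degrees C

def wavelength_m(frequency_hz, velocity=SPEED_OF_SOUND_M_PER_S):
    """Wavelength in metres, from frequency * wavelength = velocity."""
    return velocity / frequency_hz

# Wavelengths at the two ends of the nominal human hearing range.
for frequency_hz in (20.0, 20_000.0):
    print(frequency_hz, "Hz ->", wavelength_m(frequency_hz), "m")
```

Under this assumption, audible wavelengths span roughly 17 m down to under 2 cm.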

Perceptual Dimensions of Sound

Auditory Perception
Auditory perception is a branch of psychophysics. Psychophysics studies relationships between perception and physical properties of stimuli.
Physical dimensions: Aspects of a physical stimulus that can be measured with an instrument (e.g., a light meter, a sound level meter, a spectrum analyzer, a fundamental frequency meter, etc.).
Perceptual dimensions: These are the mental experiences that occur inside the mind of the observer. These experiences are actively created by the sensory system and brain based on an analysis of the physical properties of the stimulus. Perceptual dimensions can be measured, but not with a meter. Measuring perceptual dimensions requires an observer (e.g., a listener).

Visual Psychophysics
  Perceptual Dimensions     Physical Properties of Light
  Hue                       Wavelength
  Brightness                Luminance
  Shape                     Contour/Contrast

Auditory Psychophysics
  Perceptual Dimensions     Physical Properties of Sound
  Pitch                     Fundamental Frequency
  Loudness                  Intensity
  Timbre (sound quality)    Spectrum Envelope / Amplitude Envelope

The Three Main Perceptual Attributes of Sound
• Pitch (not fundamental frequency)
• Loudness (not intensity)
• Timbre (not spectrum envelope or amplitude envelope)
The terms pitch, loudness, and timbre refer not to the physical characteristics of sound, but to the mental experiences that occur in the minds of listeners.

Perceptual Dimensions
• Pitch
  – higher frequencies perceived as higher pitch
  – humans hear sounds in the 20 Hz to 20,000 Hz range
• Loudness
  – higher amplitude results in louder sounds
  – measured in decibels (dB); 0 dB represents the hearing threshold

Perceptual Dimensions (cont.)
• Timbre
  – complex patterns added to the lowest, or fundamental, frequency of a sound, referred to as spectrum envelope
  – spectrum envelopes enable us to distinguish musical instruments
• Multiples of fundamental frequency give music
• Multiples of unrelated frequencies give noise

Sound Intensity and the Decibel Scale

Sound Intensity
• Intensity (I) of a wave is the rate at which sound energy flows through a unit area (A) perpendicular to the direction of travel
  – $I = P / A$, with P measured in watts (W) and A measured in m$^2$
• Threshold of hearing $I_0$ is at $10^{-12}$ W/m$^2$
• Threshold of pain is at 1 W/m$^2$

Decibel Scale
• Describes intensity relative to threshold of hearing based on multiples of 10
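A minimal sketch of the scale, assuming the standard definition $L = 10 \log_{10}(I / I_0)$ with $I_0 = 10^{-12}$ W/m$^2$ as the threshold of hearing. The function names are illustrative only.

```python
import math

I0 = 1e-12  # threshold of hearing in W/m^2

def intensity_to_db(intensity_w_per_m2):
    """Sound level in dB relative to the threshold of hearing."""
    return 10.0 * math.log10(intensity_w_per_m2 / I0)

def db_difference_to_intensity_ratio(delta_db):
    """How many times more intense a sound is when it is delta_db louder."""
    return 10.0 ** (delta_db / 10.0)

print(intensity_to_db(I0))                       # threshold of hearing: 0 dB
print(intensity_to_db(1.0))                      # threshold of pain: about 120 dB
print(intensity_to_db(2 * I0))                   # doubling intensity adds about 3 dB
print(db_difference_to_intensity_ratio(60 - 40)) # 60 dB vs. 40 dB: about 100 times
```

Because the scale is logarithmic, multiplying the intensity adds to the level: each factor of 10 adds 10 dB, and each factor of 2 adds about 3 dB.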

Decibels of Everyday Sounds
  Sound                   Decibels
  Rustling leaves         10
  Whisper                 30
  Ambient office noise    45
  Conversation            60
  Auto traffic            80
  Concert                 120
  Jet motor               140
  Spacecraft launch       180

Interpretation of Decibel Scale
• 0 dB = threshold of hearing (TOH)
• 10 dB = 10 times more intense than TOH
• 20 dB = 100 times more intense than TOH
• 30 dB = 1000 times more intense than TOH
• An increase of 10 dB means that the intensity of the sound increases by a factor of 10
• If a sound is $10^x$ times more intense than another, then its sound level is $10 \cdot x$ decibels higher than that of the less intense sound

Loudness from Multiple Sources
• Use energy combination equation, where L1, L2, …, Ln are in dB
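One common form of the energy combination equation is $L = 10 \log_{10}\left(\sum_{i=1}^{n} 10^{L_i/10}\right)$: convert each level back to a relative intensity, add the intensities, and convert the sum to decibels. The sketch below assumes that form and assumes the sources are independent; the function name is illustrative.

```python
import math

def combine_levels_db(levels_db):
    """Total level in dB of several independent sources given in dB."""
    # Decibels do not add directly; the underlying intensities do.
    total_relative_intensity = sum(10.0 ** (level / 10.0) for level in levels_db)
    return 10.0 * math.log10(total_relative_intensity)

print(combine_levels_db([45.0, 50.0]))  # two guitars: a little over 51 dB
print(combine_levels_db([60.0, 60.0]))  # two equal sources: about 3 dB above one
```

The louder source dominates the result: adding a 45 dB source to a 50 dB source raises the level by only a little over 1 dB.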

Exercises
• Show that the threshold of hearing is at 0 dB
• Show that the threshold of pain is at 120 dB
• Suppose an electric fan produces an intensity of 40 dB. How many times more intense is the sound of a conversation if it produces an intensity of 60 dB?
• One guitar produces 45 dB while another produces 50 dB. What is the dB reading when both are played?
• If you double the physical intensity of a sound, how many more decibels is the resulting sound?

Pitch Perception

Pitch and Fundamental Frequency
All else being equal, the higher the F0, the higher the perceived pitch.
Lower F0, lower pitch. Higher F0, higher pitch.

Pitch Perception
The ear is more sensitive to F0 differences in the low frequencies than in the higher frequencies. This means that:
300 vs. 350 ≠ 3000 vs. 3050
That is, the difference in perceived pitch (not F0) between 300 and 350 Hz is NOT the same as the difference in pitch between 3000 and 3050 Hz, even though the physical differences in F0 are the same.
Examples: 300-350 Hz and 3000-3050 Hz.

Music Perception
• Tone height: A sound quality whereby a sound is heard to be of higher or lower pitch; monotonically related to frequency
• Tone chroma: A sound quality shared by tones that have the same octave interval
• Musical helix: Can help visualize musical pitch

Harmonic Frequencies
• Strings or pipes (trombone, flute, organ) all have resonant frequencies.
• They may vibrate at that frequency or some multiple of it.
• All instruments and voices carry some harmonics and dampen others.
Figure: vibration patterns along the length of a string or pipe at 1f, 2f (1 octave), 3f, 4f (2 octaves), and 8f (3 octaves).
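The octave labels on the harmonics follow from the fact that each octave doubles the frequency, so a harmonic at n times the fundamental lies log2(n) octaves above it. A small sketch, with an illustrative function name:

```python
import math

def octaves_above_fundamental(multiple):
    """Number of octaves between the fundamental f and the harmonic multiple * f."""
    return math.log2(multiple)

for multiple in (1, 2, 3, 4, 8):
    print(f"{multiple}f is {octaves_above_fundamental(multiple):.3f} octaves above f")
```

Only the harmonics at powers of two (2f, 4f, 8f) land on whole octaves; 3f falls between the first and second octave.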

Loudness Perception

Loudness and Intensity
All else being equal, the higher the intensity, the greater the loudness.
Higher intensity, higher loudness. Lower intensity, lower loudness.

The relationship between intensity and loudness
Doubling intensity does not double loudness. In order to double loudness, intensity must be increased by a factor of 10, or by 10 dB [$10 \times \log_{10}(10) = 10 \times 1 = 10$ dB]. This is called the 10 dB rule.
Two signals differing by 10 dB (500 Hz sinusoids): Note that the more intense sound is NOT 10 times louder even though it is 10 times more intense.
The 10 dB rule means that a 70 dB signal is twice as loud as a 60 dB signal, four times as loud as a 50 dB signal, eight times as loud as a 40 dB signal, etc.
A 30 dB hearing loss is considered mild -- just outside the range of normal hearing. Based on the 10 dB rule, how much is loudness affected by a 30 dB hearing loss? (Answer: 1/8th. But note that this does not mean that someone with a 30 dB loss will have 8 times more difficulty with speech understanding than someone with normal hearing.)
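The 10 dB rule can be written as a loudness ratio of $2^{\Delta L / 10}$ for a level change of $\Delta L$ dB. The sketch below applies that rule of thumb; the function name is an assumption for illustration, and the rule itself is an approximation of perceived loudness, not an exact law.

```python
def loudness_ratio(delta_db):
    """Approximate perceived loudness ratio for a level change of delta_db (10 dB rule)."""
    return 2.0 ** (delta_db / 10.0)

print(loudness_ratio(10))   # 70 dB vs. 60 dB: twice as loud
print(loudness_ratio(20))   # 70 dB vs. 50 dB: four times as loud
print(loudness_ratio(-30))  # a 30 dB loss: one eighth as loud
```

Compare this with intensity: a 30 dB change is a factor of 1000 in intensity but only a factor of 8 in loudness under this rule.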

Loudness Perception
Loudness is strongly affected by the frequency of the signal. If intensity is held constant, a mid-frequency signal (in the range from ~1000-4000 Hz) will be louder than lower or higher frequency signals.
Examples: 125 Hz, 3000 Hz, 8000 Hz
The 3000 Hz signal should appear louder than the 125 Hz or the 8000 Hz signal, despite the fact that their intensities are equal.

Loudness and Pitch
• More sensitive to loudness at mid frequencies than at other frequencies
  – intermediate frequencies at [500 Hz, 5000 Hz]
• Perceived loudness of a sound changes based on the frequency of that sound
  – basilar membrane reacts more to intermediate frequencies than other frequencies

Audibility Thresholds

Fletcher-Munson Contours
Each contour represents an equal perceived loudness

Human Auditory Spectrum
• < 20 Hz - infrasound
• > 20 kHz - ultrasound
• human auditory range decreases with age
• TV: 17.7 kHz horizontal scanning frequency
• “ultrasonic” cleaning devices, burglar alarms (20-40 kHz)
• CD: 20 kHz cutoff; LP: 60-80 kHz

Exposure to Loud Noise

Timbre Perception

Timbre, also known as sound quality or tone color, is oddly defined in terms of what it is not: When two sounds are heard that match for pitch, loudness, and duration, and a difference can still be heard between the two sounds, that difference is called timbre.
For example: a clarinet, a saxophone, and a piano all play a middle C at the same loudness and same duration. Each of these instruments has a unique sound quality. This difference is called timbre, tone color, or simply sound quality.
There are also many examples of timbre difference in speech. For example, two vowels (e.g., /å/ and /i/) spoken at the same loudness and same pitch differ from one another in timbre.
There are two physical correlates of timbre:
• spectrum envelope
• amplitude envelope

Timbre and Spectrum Envelope
Timbre differences between one musical instrument and another are partly related to differences in spectrum envelope -- differences in the relative amplitudes of the individual harmonics.
In the examples above, we would expect all of these sounds to have the same pitch because the harmonic spacing is the same in all cases. The timbre differences that you would hear are controlled in part by the differences in the shape of the spectrum envelope.

Six Synthesized Sounds Differing in Spectrum Envelope
Note the similarities in pitch (due to constant F0/harmonic spacing) and the differences in timbre or sound quality.

Vowels Also Differ in Spectrum Envelope
Shown here are the smoothed envelopes only (i.e., the harmonic fine structure is not shown) of 10 American-English vowels.* Note that each vowel has a unique shape to its spectrum envelope. Perceptually, these sounds differ from one another in timbre.
Purely as a matter of convention, the term timbre is seldom used by phoneticians, although it applies just as well here as it does in musical acoustics. In phonetics, timbre differences among vowels are typically referred to as differences in vowel quality or vowel color.
*From Hillenbrand, J. M., Houde, R. A., Clark, M. J., and Nearey, T. M. Vowel recognition from harmonic spectra. Acoustical Society of America, Berlin, March 1999.

Aperiodic sounds can also differ in spectrum envelope, and the perceptual differences are properly described as timbre differences.

Amplitude Envelope
• Timbre also affected by amplitude envelope
  – sometimes called the amplitude contour or energy contour of the sound wave
  – the way sounds are turned on and turned off
Leading edge = attack
Trailing edge = decay
The attack especially has a large effect on timbre.

Music examples (timbre differences related to amplitude envelope)
• Plucked vs. bowed stringed instruments
• The damping pedal on a piano
• The difference in sound quality between a hammered string (e.g., a piano) and a string that is plucked by a quill (e.g., a harpsichord)
The timbre differences that distinguish one musical instrument from another appear to be more closely related to differences in amplitude envelope -- and especially the attack -- than to the shape of the spectrum envelope (although both play a role). For example, when the amplitude contour of an oboe tone is imposed on a violin tone, the resulting tone sounds more like an oboe than a violin.*
*White, G. D. The Audio Dictionary, 1987, Seattle: University of Washington Press.

Same melody, same spectrum envelope (if sustained), different amplitude envelopes (i.e., different attack and decay characteristics). Note differences in timbre or sound quality as the amplitude envelope varies.

Timbre differences related to amplitude envelope also play a role in speech. Note the differences in the shape of the attack for /b/ vs. /w/ and /S/ vs. /tS/.
Figure labels: abrupt attack; more gradual attack.

Hearing Lips and Seeing Voices (The McGurk Effect)
http://www.faculty.ucr.edu/~rosenblu/lab-index.html

Digitization of Sound [Steinmetz and Nahrstedt]

Digitization
• Microphones and video cameras produce analog signals (continuous-valued voltages)
• To get audio or video into a computer, we must digitize it (convert it into a stream of numbers)
• So, we have to understand discrete sampling (both time and voltage)

Discrete Sampling
• Sampling -- divide the horizontal axis (the time dimension) into discrete pieces. Uniform sampling is ubiquitous.
• Quantization -- divide the vertical axis (signal strength) into pieces. Sometimes, a non-linear function is applied.
  – 8-bit quantization divides the vertical axis into 256 levels. 16-bit gives you 65,536 levels.
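Uniform (linear) quantization can be sketched as follows. The assumptions here are for illustration only: samples are taken to lie in the range [-1.0, 1.0], levels are numbered from 0, and the function names are made up. Non-linear quantization is not shown.

```python
def quantize(sample, bits):
    """Map a sample in [-1.0, 1.0] to one of 2**bits integer levels."""
    levels = 2 ** bits
    clipped = max(-1.0, min(1.0, sample))  # out-of-range input is clipped
    # Scale [-1.0, 1.0] onto [0, levels - 1] and round to the nearest level.
    return round((clipped + 1.0) / 2.0 * (levels - 1))

def dequantize(level, bits):
    """Map an integer level back to an approximate sample in [-1.0, 1.0]."""
    levels = 2 ** bits
    return level / (levels - 1) * 2.0 - 1.0

for bits in (8, 16):
    level = quantize(0.3, bits)
    error = abs(dequantize(level, bits) - 0.3)
    print(bits, "bits:", 2 ** bits, "levels, level", level, "error", error)
```

More bits mean more levels and therefore a smaller rounding error per sample: the worst-case error is half the spacing between adjacent levels.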

Sampling (in time)
• Measure amplitude at regular intervals
• How many times should we sample?

Nyquist Theorem
• Suppose we are sampling a sine wave. How often do we need to sample it to figure out its frequency?
  – If we sample only once per cycle, every sample has the same value, so we can think it's a constant.

Nyquist Rate
• If we sample at 1.5 times per cycle, we can think it's a lower-frequency sine wave.
• Nyquist rate -- "For lossless digitization, the sampling rate should be at least twice the maximum frequency response."
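The two undersampling cases (1 and 1.5 samples per cycle) can be checked numerically. The sketch below uses the standard folding formula for the apparent frequency of a sampled sine wave; the function name and the unit frequency of 1 Hz are assumptions made for the example.

```python
import math

def alias_frequency(signal_hz, sample_rate_hz):
    """Apparent frequency of a sine wave at signal_hz when sampled at sample_rate_hz."""
    nearest_multiple = round(signal_hz / sample_rate_hz) * sample_rate_hz
    return abs(signal_hz - nearest_multiple)

print(alias_frequency(1.0, 1.0))  # 1 sample per cycle: looks like a constant (0 Hz)
print(alias_frequency(1.0, 1.5))  # 1.5 samples per cycle: looks like a slower sine
print(alias_frequency(1.0, 4.0))  # 4 samples per cycle: frequency is preserved

# At 1.5 samples per cycle, the samples of the 1 Hz sine coincide with the
# samples of a phase-inverted 0.5 Hz sine.
for n in range(6):
    t = n / 1.5
    print(round(math.sin(2 * math.pi * 1.0 * t), 6),
          round(-math.sin(2 * math.pi * 0.5 * t), 6))
```

Once the sampling rate is more than twice the signal frequency, the apparent frequency equals the true one, which is the condition the Nyquist rate states.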

Digital Audio
• Standard music CD:
  – Sampling rate: 44.1 kHz
  – 16-bit samples
  – 2-channel stereo
• Data transfer rate = $2 \times 16 \times 44{,}100 \approx 1.4$ Mbits/s
• 1 hour of music = 1.41 Mbits/s $\times$ 3,600 s $\div$ 8 bits/byte $\approx 635$ MB
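The CD arithmetic can be reproduced directly. This sketch assumes uncompressed audio and decimal megabytes (1 MB = 1,000,000 bytes); the constant names are illustrative.

```python
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2
SECONDS_PER_HOUR = 3_600

bits_per_second = CHANNELS * BITS_PER_SAMPLE * SAMPLE_RATE_HZ
bytes_per_hour = bits_per_second * SECONDS_PER_HOUR // 8  # 8 bits per byte

print(bits_per_second / 1e6, "Mbits/s")
print(bytes_per_hour / 1e6, "MB per hour")

# By the Nyquist rate, this sampling rate can represent frequencies up to half of it.
print(SAMPLE_RATE_HZ / 2, "Hz maximum representable frequency")
```

The highest representable frequency, half the sampling rate, sits just above the 20 kHz upper limit of human hearing.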