The Neural Basis of Speech Perception – a view from functional imaging Sophie Scott Institute of Cognitive Neuroscience, University College London
This approach to speech perception • Speech is an auditory signal • It is possible to address the neural processing of speech within the framework of auditory cortical processing. • This is not synonymous with the entire language system. • If one is a skilled speaker of a language, then speech perception is obligatory.
Functional imaging • Where neural activity occurs, blood flow is directed • Measure neural activity by tracking these changes in local blood flow • Thus measuring mass synaptic activity • Poor temporal resolution • Essentially a comparison of blood flow changes across conditions - so the choice of baseline comparison is critical
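The logic of comparing blood flow across conditions can be sketched as a simple subtraction: estimate the signal per voxel in each condition and ask where the task condition exceeds the baseline. A minimal illustration with made-up numbers (real analyses model the haemodynamic response and correct for multiple comparisons):

```python
import numpy as np

# Illustrative sketch only: the core of a subtraction analysis.
# The voxel values and threshold are made-up, not real imaging data.

def contrast_map(task, baseline, threshold):
    """Boolean map of voxels where task minus baseline exceeds threshold."""
    return (task - baseline) > threshold

# Toy "volume" of five voxels (arbitrary units) for two conditions.
listening = np.array([3.0, 5.2, 4.9, 3.1, 6.0])
silence = np.array([3.1, 3.0, 3.2, 3.0, 3.1])

active = contrast_map(listening, silence, threshold=1.0)
print(active.tolist())  # [False, True, True, False, True]
```

The choice of `silence` as baseline is what makes the map interpretable, which is why the baseline comparisons above are described as critical.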
Listening Wise et al, Lancet, 2001
Neuroanatomy of speech Speech production Speech perception
Scott and Johnsrude, 2003, from Romanski et al., 1999. (Figure: schematic of macaque auditory cortex along caudal-rostral and medial-lateral axes, showing CORE (A1, R, RT), BELT (CM, RM, ML, AL, RTM, RTL, RP) and PARABELT regions, with Tpt, the insula and the superior temporal sulcus.)
From Kaas and Hackett, 1999. (Figure: projections from core (AI, R, RT), belt (CL, ML, AL) and parabelt (STGc, STGr, CBP) to prefrontal cortex: dorsal prearcuate (8a), dorsal principal sulcus (46), inferior convexity (12) and orbital polar cortex.)
Spatial representations: tonotopy, bandwidth, conspecific vocalisations
(Figure: anterior-posterior comparison of the superior temporal plane. Human: HG, PT, Tpt, STP and association STS. Monkey: core (C), belt (B), parabelt (PB) and association STS.)
Scott and Johnsrude, 2003. (Figure: schematic of human auditory areas along anterior-posterior and medial-lateral axes: A1 with surrounding areas AA, MA, LA, PA, STA, ALA and LP.)
Scott and Johnsrude, 2003. Responses around primary auditory cortex: • Sounds with harmonic structure against pure tones: Hall, Johnsrude et al., 2002 • Frequency modulated tones against unmodulated tones: Hall, Johnsrude et al., 2002 • Amplitude modulated noise against unmodulated noise: Giraud et al., 1999 • Spectral change against steady state sounds: Thivard et al., 2000
Hierarchical processing • Structure in sound is computed beyond primary auditory cortex • More complex structure (e.g. spectral change) is processed further from PAC • How does this relate to speech processing?
Four stimulus conditions: speech; rotated speech; noise vocoded speech; rotated noise vocoded speech
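Noise vocoding (Shannon et al., 1995) splits speech into frequency bands and uses each band's amplitude envelope to modulate band-limited noise, preserving amplitude structure while degrading spectral detail. A minimal sketch, with illustrative band edges and a tone standing in for speech; the crude FFT-mask filtering here is a simplification of the filter banks used in the actual studies:

```python
import numpy as np

def band_filter(x, sr, lo, hi):
    """Crude band-pass: zero FFT bins outside [lo, hi) Hz."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    spec[(freqs < lo) | (freqs >= hi)] = 0.0
    return np.fft.irfft(spec, n=len(x))

def envelope(x, win):
    """Amplitude envelope: rectify, then smooth with a moving average."""
    return np.convolve(np.abs(x), np.ones(win) / win, mode="same")

def noise_vocode(x, sr, edges, win=256):
    """Replace each band's fine structure with envelope-modulated noise."""
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(len(x))
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        env = envelope(band_filter(x, sr, lo, hi), win)
        out += env * band_filter(noise, sr, lo, hi)
    return out

sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)      # stand-in for a speech signal
vocoded = noise_vocode(tone, sr, edges=[100, 500, 1000, 2000, 4000])
```

The number of band edges sets the number of channels, which is the intelligibility manipulation used in the training and channel-count studies described later.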
Left hemisphere (Scott, Blank, Rosen and Wise, 2000): • (Sp + VCo + RSp) - RVCo: peaks at (-60, -4, -10), Z = 6.6, and (-64, -38, 0), Z = 5.7 • (Sp + VCo) - (RSp + RVCo): peaks at (-54, +6, -16), Z = 4.7, and (-62, -12), Z = 5.5. (Bar charts show effect sizes for Sp, VCo, RSp and RVCo at each peak.)
Right hemisphere, anterior (Scott, Blank, Rosen and Wise, 2000): (Sp + RSp) - (VCo + RVCo): peak at (+66, -12, 0), Z = 6.7. (Bar chart shows effect sizes for Sp, VCo, RSp and RVCo.)
Intelligibility
Plasticity within this system Naïve subjects were scanned before they could understand noise vocoded speech, then they were trained, then scanned again.
Flexibility in speech perception: learning to understand noise vocoded speech. Activity to noise vocoded speech after the training period, relative to activity to the same stimuli before training. Narain, Wise, Rosen, Matthews, Scott, under review. As well as left-lateralised STS, there is involvement of left premotor cortex and the left anterior thalamus (which receive projections from the belt and parabelt).
Spectrograms of the stimuli. Speech: vocoded with 16, 8, 4, 3, 2 and 1 channels. Rotated speech: 16R and 3R.
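Spectral rotation (after Blesser, 1972) inverts the spectrum around a centre frequency, leaving speech unintelligible while preserving much of its acoustic complexity. A simplified sketch using FFT-bin reversal rather than the original modulation method; the pivot frequency and the test tone are illustrative choices:

```python
import numpy as np

def spectrally_rotate(x, sr, pivot):
    """Flip the spectrum below 2*pivot Hz around pivot Hz by reversing
    the corresponding FFT bins (a simplification of spectral rotation)."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    band = freqs < 2 * pivot
    spec[band] = spec[band][::-1]
    return np.fft.irfft(spec, n=len(x))

sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)      # stand-in for a speech signal
rotated = spectrally_rotate(tone, sr, pivot=2000)
```

Applying the rotation twice recovers the original signal, which is one way to sanity-check that energy, but not intelligibility, is preserved.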
Intelligibility - behavioural data
Peak responses (conditions: 1, 2, 3, 4, 8 and 16 channels, plus rotated 3R and 16R). Left: Z=5.6 at (-62, -10, 8); Z=4.52 at (-64, -28, 8); Z=4.73 at (-48, -16, -16). Right: Z=5.96 at (64, -4, -2). Scott, Rosen, Lang and Wise, 2006
Scott and Johnsrude, 2003. Responses around primary auditory cortex: • Sounds with harmonic structure against pure tones: Hall, Johnsrude et al., 2002 • Frequency modulated tones against unmodulated tones: Hall, Johnsrude et al., 2002 • Amplitude modulated noise against unmodulated noise: Giraud et al., 1999 • Spectral change against steady state sounds: Thivard et al., 2000 • Peak responses to intelligibility: Scott et al., 2006
Speech specific processing • Does not occur in primary auditory cortex • Begins early in auditory cortex - in areas that also respond to AM • As we move forward down the STS, the responses become less sensitive to acoustic structure - resembling the behavioural profile
Speech comprehension - the role of context • Words are recognised more easily in sentences, e.g. “The ship sailed the sea” > “Paul discussed the dive”. • Can we identify the neural basis of this contextual modulation of speech comprehension? (Miller et al., 1951; Boothroyd and Nittrouer, 1988; Grant and Seitz, 2000; Stickney and Assmann, 2001; Davis et al., 2005)
(noise vocoding: Shannon et al., 1995; predictability: Kalikow et al., 1977)
Low predictability: log increase with more channels …‘Sue was interested in the bruise’… (Jonas Obleser)
High predictability: influence at intermediate number of channels …‘Sue was interested in the bruise’… …‘He caught the fish in his net’…
Bottom-up processes: correlations with number of channels (cf. e.g. Binder et al., 2000; Scott et al., 2000; Davis & Johnsrude, 2003; Zekveld et al., 2006). RFX p<0.005 uncorrected, k>30. Obleser, Wise, Dresner, & Scott, 2007
Left-hemispheric array of brain regions when context affects comprehension: Lateral Prefrontal (BA 8), Medial Prefrontal (BA 9), Angular Gyrus (BA 39), Ventral IFG (BA 47), Posterior Cingulate (BA 30). RFX p<0.005 uncorrected, k>30. Obleser, Wise, Dresner, & Scott, 2007
Findings • A range of brain areas outwith auditory cortex contribute to ‘top down’ semantic influences on speech perception • Further studies will be able to dissociate the contributions of different linguistic factors
Words are not the only things we say
Non-speech sounds? x=54. Regions in red respond to noises and rotated noises; regions in yellow respond to noises and rotated noises.
Right hemisphere, anterior: (Sp + RSp) - (VCo + RVCo): peak at (+66, -12, 0), Z = 6.7. (Bar chart shows effect sizes for Sp, VCo, RSp and RVCo.)
What drives lateral asymmetry? • Previous studies have not generally used ‘speech-like’ acoustic modulations • We aimed to manipulate speech stimuli to vary the amplitude and spectral properties of speech independently • Control for intelligibility • Do we see additive effects of amplitude and spectral modulations? • Are these left lateralised?
Steady spectrum, steady amplitude Steady spectrum, varying amplitude Varying spectrum, steady amplitude Varying spectrum, varying amplitude
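The 2x2 design above can be sketched by synthesising stimuli in which amplitude modulation (AM) and spectral modulation (SpM) are switched on independently. The carrier frequency, modulation rates and depths below are arbitrary illustrative choices, not the study's values:

```python
import numpy as np

sr = 8000
t = np.arange(sr) / sr

def stimulus(am, spm):
    """One second of tone with optional amplitude and frequency variation."""
    amp = 1.0 + 0.8 * np.sin(2 * np.pi * 4 * t) if am else np.ones_like(t)
    freq = 500.0 + (200.0 * np.sin(2 * np.pi * 2 * t) if spm else np.zeros_like(t))
    phase = 2 * np.pi * np.cumsum(freq) / sr   # integrate frequency for FM
    return amp * np.sin(phase)

conditions = {
    "flat": stimulus(am=False, spm=False),      # steady spectrum, steady amplitude
    "AM": stimulus(am=True, spm=False),         # steady spectrum, varying amplitude
    "SpM": stimulus(am=False, spm=True),        # varying spectrum, steady amplitude
    "SpM+AM": stimulus(am=True, spm=True),      # varying spectrum, varying amplitude
}
```

Crossing the two factors like this is what allows additive effects of AM and SpM to be tested, as in the contrasts that follow.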
Effect size (schematic of ideal additive effects): significantly more activation for stimuli with both AM and SpM; similar responses to AM and SpM alone; lowest for flat amplitude and spectrum.
Additive effects. Conditions: flat, AM, SpM, SpM+AM. PET scanning, 16 runs, N=13, thresholded at p<0.0001, 40 voxels
But… • Is there a problem - were these stimuli really processed as speech? • To address this, 6 of the 13 subjects were pretrained on speech exemplars, and the speech stimuli were included as a 5th condition.
(Figure: conditions A-E plus speech.)
Speech conditions: flat, AM, SpM, SpM+AM. N=6, thresholded at p<0.0001, 40 voxels
Asymmetries in speech perception • Exist! • Are not driven by simple acoustic properties of the speech signal • Right - preferentially processes speech-like sounds - voices? • Left - processes linguistically relevant information
Posterior auditory areas • In primates, medial posterior areas show auditory and tactile responses • What do these areas do in speech processing in humans?
Speaking and mouthing. Wise, Scott, Blank, Murphy, Mummery and Warburton, Brain, 2001. This region, in the left posterior temporo-parietal junction, responds when subjects repeat a phrase, mouth the phrase silently, or go ‘uh uh’, relative to mentally rehearsing the phrase.
Amount of DAF (delayed auditory feedback): 0, 50, 125, 200 ms. Contrast: listening over silence.
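Delayed auditory feedback simply plays the speaker's own voice back after a fixed lag. A toy sketch of the delay itself, using the delays named above; the sample rate and the impulse standing in for a voice are illustrative:

```python
import numpy as np

def delayed_feedback(x, sr, delay_ms):
    """Delay a signal by delay_ms milliseconds, padding the onset with silence."""
    n = int(round(sr * delay_ms / 1000.0))
    return np.concatenate([np.zeros(n), x])[: len(x)]

sr = 8000
voice = np.zeros(sr)
voice[0] = 1.0                       # an impulse standing in for speech
daf = {ms: delayed_feedback(voice, sr, ms) for ms in (0, 50, 125, 200)}
```

At 0 ms delay the feedback is veridical; longer lags increasingly disrupt fluent production, which is what makes DAF a useful probe of sensory-motor links.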
DAF response peak on the right (delays: 0, 50, 125, 200 ms)
Neural basis of speech perception • Hierarchical processing of sound in auditory cortex • The anterior ‘what’ pathway is important in the perceptual processing of speech • Activity in this system can be modulated by top-down linguistic factors • There are hemispheric asymmetries in speech perception - the left is driven by phonetic, lexical and linguistic properties; the right is driven by pitch variation, emotion and indexical properties • There are sensory-motor links in posterior auditory areas - part of a ‘how’ pathway?
Scott, Current Opinion in Neurobiology, 2005; Scott, in press. (Figure: ‘what’, ‘where’ and ‘how’ processing pathways.)
Carolyn McGettigan, Disa Sauter, Charlotte Jacquemot, Sophie Scott, Frank Eisner, Richard Wise, Charvy Narain, Andrew Faulkner, Hideki Takaso, Narly Golestani, Jonas Obleser, Stuart Rosen