Speech and speech signals

Human vocal mechanism: nasal cavity, hard palate, soft palate, oral cavity, tongue, throat, hyoid bone, epiglottis, vocal cords, larynx, trachea

Cut-away view of a human larynx: pharynx cavity, arytenoid cartilages, thyroid cartilage, vocal cords, trachea

Dynamic imaging (Mohammad, 1999)

Vocal tract profiles for vowels: I (eve), A (father), I (it), O (obey), E (met), U (boot)

Profiles depending on sex: male (left) and female (right). Source: http://web1.dcpa.org/brad_html/mrgallery.html

Cross-sections of the vocal tract for various consonants. Source: http://web1.dcpa.org/brad_html/mrgallery.html

Speech signal
• Speech signal: a multi-dimensional acoustic stimulus, characterized in three domains: time, magnitude and frequency.
• Basic elements of the speech signal: vowels, voiced and voiceless consonants, and the larynx tone.
• A representation of the speech signal in these three domains is called a spectrogram.
• A phoneme is a sound that has no meaning of its own but can change the meaning of a word.
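
A minimal sketch of how a spectrogram can be computed: the signal is cut into short overlapping frames, each frame is windowed, and the magnitude of its spectrum is taken, which gives magnitude as a function of time and frequency. The sample rate, frame length, hop size and the two-tone test signal are assumptions chosen for illustration, not values from the slides. A naive DFT is used only to keep the example free of dependencies.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz (assumed for this toy example)
FRAME_LEN = 64      # samples per analysis frame (assumed)
HOP = 32            # samples between frame starts (assumed)


def hann(n):
    """Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency DFT bins (naive O(N^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(signal, frame_len=FRAME_LEN, hop=HOP):
    """Return one list of bin magnitudes per time frame."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        frames.append(dft_magnitudes(frame))
    return frames


def tone(freq_hz, n_samples):
    """Pure sine tone, used here as a stand-in for real speech."""
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n_samples)]


# Toy signal: a 500 Hz tone followed by a 1500 Hz tone.
signal = tone(500, 512) + tone(1500, 512)
spec = spectrogram(signal)

bin_hz = SAMPLE_RATE / FRAME_LEN  # frequency spacing between bins
peak_first = max(range(len(spec[0])), key=lambda k: spec[0][k]) * bin_hz
peak_last = max(range(len(spec[-1])), key=lambda k: spec[-1][k]) * bin_hz
```

Here `spec[t][k]` is the magnitude at time frame `t` and frequency bin `k`, so the three domains named above (time, magnitude, frequency) are all present; `peak_first` and `peak_last` show how the dominant frequency moves over time.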

Speech signal - spectrogram

Schematic view of the vocal tract

Speech production model
From an acoustical point of view, speech production can be divided into two stages: generation and filtering. The basic assumption for vowel production is that the signal generated at the level of the larynx is filtered linearly by the vocal tract. The resulting sound is emitted through the oral cavity and the lips. An additional assumption is that the source and the filter are independent.
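
The two-stage model can be sketched in a few lines, under simplifying assumptions: the source is a plain impulse train at the pitch frequency, and the linear filter is a cascade of two-pole resonators. The function names, the sample rate and the resonance frequencies and bandwidths are illustrative choices, not values taken from the slides.

```python
import math

SAMPLE_RATE = 8000  # Hz (assumed)


def impulse_train(f0, n_samples, sample_rate=SAMPLE_RATE):
    """Source stage: one unit pulse per pitch period (a crude larynx tone)."""
    period = round(sample_rate / f0)
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]


def resonator(signal, freq, bandwidth, sample_rate=SAMPLE_RATE):
    """Filter stage: a linear two-pole resonator centred near `freq` (Hz)."""
    r = math.exp(-math.pi * bandwidth / sample_rate)  # pole radius, below 1
    theta = 2 * math.pi * freq / sample_rate          # pole angle
    a1, a2 = -2 * r * math.cos(theta), r * r
    out = []
    y1 = y2 = 0.0  # the two previous output samples
    for x in signal:
        y = x - a1 * y1 - a2 * y2
        out.append(y)
        y1, y2 = y, y1
    return out


def synthesize(f0, resonances, n_samples=800):
    """Pass the source through one resonator per (frequency, bandwidth) pair."""
    signal = impulse_train(f0, n_samples)
    for freq, bandwidth in resonances:
        signal = resonator(signal, freq, bandwidth)
    return signal


# Rough, illustrative resonance values for an /a/-like vowel (assumed).
vowel = synthesize(f0=100, resonances=[(700, 100), (1200, 120)])
```

Because the source and the filter are independent in this sketch, changing `f0` alters only the pitch of `vowel`, while changing `resonances` alters only its spectral shape.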

The time structure of the speech signal
The time structure is complicated and depends on the particular vowel or consonant being spoken.

Focus on the speech signal fragment

Speech signal - formants
• A very important phenomenon is the existence of formants: resonances produced by the resonators of the human body (nasal cavity, oral cavity, etc.). On the basis of these resonances (energy maxima), one can distinguish particular phonemes, and hence words (each consisting of a few phonemes).
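
To illustrate formants as energy maxima, the sketch below evaluates the magnitude response of a cascade of two-pole resonators and picks its local maxima. The resonator model, the sample rate and the three centre frequencies and bandwidths are assumptions made for the example; real formant estimation works on measured speech spectra.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz (assumed)


def resonance_magnitude(freq_hz, formants, sample_rate=SAMPLE_RATE):
    """|H(f)| of a cascade of two-pole resonators, one per (centre, bandwidth)."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate)
    gain = 1.0
    for centre, bandwidth in formants:
        r = math.exp(-math.pi * bandwidth / sample_rate)  # pole radius
        theta = 2 * math.pi * centre / sample_rate        # pole angle
        denom = 1 - 2 * r * math.cos(theta) * z_inv + r * r * z_inv * z_inv
        gain /= abs(denom)
    return gain


def find_peaks(freqs, mags):
    """Frequencies at which the magnitude is a strict local maximum."""
    return [
        freqs[i]
        for i in range(1, len(mags) - 1)
        if mags[i - 1] < mags[i] > mags[i + 1]
    ]


# Hypothetical formant set: (centre frequency in Hz, bandwidth in Hz).
assumed_formants = [(500, 80), (1500, 100), (2500, 120)]

freqs = list(range(0, 4001, 10))  # 10 Hz grid up to the Nyquist frequency
mags = [resonance_magnitude(f, assumed_formants) for f in freqs]
peaks = find_peaks(freqs, mags)
```

The entries of `peaks` fall close to the assumed centre frequencies. Small offsets are expected, because each resonance is slightly pulled by the skirts of its neighbours and the grid is only 10 Hz fine.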

The spectrum of a voiced consonant

Examples of spectra

Directivity of the human mouth

Redundancy of speech
• Redundancy (carrying much more information than necessary) plays an important role in speech perception and speech intelligibility: it makes speech perception more resistant to distortions and disturbances.

Model of speech perception
• Speech signal -> peripheral auditory filtering -> detection of acoustic features of the signal -> detection of language features -> lexical grouping -> arrangement into meaning and sense

Literature
• J. L. Flanagan, "Speech Analysis, Synthesis and Perception", Springer-Verlag, Berlin, Heidelberg, New York, 1965