Sound you tube Some from Heim Chap 13
Learning outcomes • Describe the basics of human hearing • Explain the difference between visual and auditory interaction • Describe the classes and subclasses of sound output and the attributes of each • Describe the classes and subclasses of sound input and recognition and the attributes of each
Hearing • Provides information about the environment: distances, directions, objects, etc. • Physical apparatus: • outer ear – protects the inner ear and amplifies sound • middle ear – transmits sound waves as vibrations to the inner ear • inner ear – chemical transmitters are released and cause impulses in the auditory nerve • Sound: • pitch – sound frequency • loudness – amplitude • timbre – type or quality
Sound is vibration http://www.hsc.csu.edu.au/ipt/mm_systems/3288/digitising_sound_answers.htm
Timbre is harmonic structure • A sine wave is all energy on the ‘first harmonic’ or ‘fundamental’ frequency (sounds like O) • Other shapes of sound wave come from a distribution of energy into other multiples of the fundamental http://hyperphysics.phy-astr.gsu.edu/hbase/audio/geowv.html http://www.sfu.ca/sonic-studio/handbook/Triangle_Wave.html
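To make the "distribution of energy into other multiples of the fundamental" concrete, here is a minimal Python sketch that builds a waveform by summing sine harmonics. The function name `harmonic_wave`, the 440 Hz fundamental, and the two amplitude lists are illustrative assumptions, not something taken from the slides. The square-like recipe (odd harmonics only, amplitude 1/n) is the standard Fourier series for a square wave, truncated at the ninth harmonic.

```python
import math


def harmonic_wave(fundamental_hz, amplitudes, t):
    """Return the sample value at time t (seconds) of a sum of sine harmonics.

    amplitudes[k] is the amplitude of harmonic k + 1, i.e. the component
    at (k + 1) * fundamental_hz.
    """
    return sum(
        a * math.sin(2 * math.pi * (k + 1) * fundamental_hz * t)
        for k, a in enumerate(amplitudes)
    )


# Pure tone: all energy on the fundamental (first harmonic).
SINE = [1.0]

# Square-like tone: odd harmonics only, each with amplitude 1/n
# (assumption: truncated after the 9th harmonic to keep the sketch short).
SQUARE_LIKE = [1.0 / n if n % 2 == 1 else 0.0 for n in range(1, 10)]

if __name__ == "__main__":
    # Print a few samples across one period of a 440 Hz tone.
    period = 1 / 440
    for i in range(8):
        t = i * period / 8
        print(round(harmonic_wave(440, SINE, t), 3),
              round(harmonic_wave(440, SQUARE_LIKE, t), 3))
```

Both tones have the same pitch (same fundamental); only the amplitude list differs, and that difference is what is heard as timbre.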
Hearing (cont) • Humans can hear frequencies from 20 Hz to 15 kHz • Less accurate at distinguishing high frequencies than low • Higher frequencies disappear as you get older • Auditory system filters sounds • Can attend to sounds over background noise, for example the cocktail party phenomenon • Hearing aids disrupt this filtering • Hearing is involuntary • A sudden sound ‘grabs’ attention before we think • Some sounds are harder to ignore (e.g. a baby crying) • ‘Listening’ is (largely) voluntary • We choose whether to process the meaning, especially if the sound is language (although something like hearing your own name is pretty much involuntary)
What if… • Phone call / text message? • You are in a noisy environment • Night clubbing • Your hearing is below average • You are deaf
Sound versus Visual “Sound exists in time and over space, vision exists in space and over time.” (Gaver, 1989) - Sound is only there while it is being played or made - Vision is there until it is replaced
Sound Interaction • Computer Output/Generation (input to human) • Non speech • Music • Audio Icons and Earcons • Speech • Computer Input/Recognition • Speech • Non speech • Environmental • Music
Computer Output: Music • Can be pre-recorded or generated • Movies • Games • Immersive experiences • Activates your brain in a different way from language • Acts almost entirely independently from hand-to-eye processing
Generating music • Exciting area for artists • Everything from pseudo-real to completely abstract • There are jazz music generators whose output only skilled listeners can tell apart from actual musicians • Serato – DJ software (www.serato.com) • Auckland company doing fantastic things • Several UOA grads there
Auditory Icons and Earcons • The difference between these two is subtle • Auditory icons: emphasis on ‘natural’ sounds and metaphor with the real world • e.g. the sound of filling a bottle with water to match moving a large file • Earcons: ‘artificial’ (generated) sounds • e.g. a more abstract, metaphorical relationship to the action, or purely a convention (like corporate colour schemes) • Sound examples (Windows): hardware fail, insert, remove
Auditory Icons and Earcons • Redundant Encoding • Aids memory by adding associations • Can alert without interrupting (well, at least leaves the visual field clear) • An alternative communications channel • Positive/Negative Feedback • Auditory alarms might be crucial to the safe operation of computer-operated machinery or mission-critical environments • Too many alarms • Annoying • Ignored
Using Sound in Interaction Design • Learnability of the mapping between the icon and the object represented • “Oink” and “bow wow” have high articulatory directness (low distance between ‘appearance’ and function [or denotation]) • A swishing sound accompanying a paintbrush tool also has high articulatory directness • A system beep, on the other hand, carries no information about what it denotes (but we may quickly learn to associate it with an error; and its square-wave structure is somewhat unpleasant, so it suits an error better than feedback on success)
Can you remember earcons? • How many? • How often do you hear them? • Can you intuitively tell what these mean? • On • Off • Sleep • Misrecognized • Disambiguate
Speech Output • Eyes-free operation • Alternative output channel • Good for checking your essays • Navigation is hard • Backtracking • Finding the location of a particular thing
Speech Output • Recorded • Menu choices for telephone systems • Books or other multimedia experiences • Generated (‘text-to-speech’, TTS) • Synthesizer built into Office • See http://office.microsoft.com/en-nz/powerpoint-help/using-the-speaktext-to-speech-feature-HA102066711.aspx • Google Translate has a nice one too (better, I think) • Can give pronunciation rules (the Google one sounds British to me; see also http://www.bell-labs.com/project/tts/sable.html) • Still sounds a little artificial • Best synthesizers have a physical model of the tongue and breath to give natural flow between phonemes
Sound Input • Speech • Environmental • Music
Speech Recognition • Two distinct applications: • Transaction • Transcription • Transaction • Telephone menu systems • Choose from a limited number of options; works OK • Automatic speech recognition (ASR) • Built into operating systems • Siri (iPhone) and Android are more or less usable • This is a triumph of Artificial Intelligence • Very difficult, ongoing research problem • Not just about recognizing phonemes but also finding the ‘right’ interpretation (helped, e.g., by statistical word-triple frequencies, but better if the AI is ‘deeper’)
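The idea of using word-triple (trigram) frequencies to pick the ‘right’ interpretation can be sketched in a few lines of Python. Everything here is an illustrative assumption: the function names, the three-sentence toy corpus, and the simple "sum of trigram counts" score. Real recognizers use far larger corpora and smoothed probabilities, and a raw sum like this one favours longer candidates.

```python
from collections import Counter


def trigrams(words):
    """Return every run of three consecutive words as a tuple."""
    return list(zip(words, words[1:], words[2:]))


def train(corpus_sentences):
    """Count how often each word triple occurs in the corpus."""
    counts = Counter()
    for sentence in corpus_sentences:
        counts.update(trigrams(sentence.lower().split()))
    return counts


def score(sentence, counts):
    """Sum the corpus counts of the sentence's trigrams (unseen triples add 0)."""
    return sum(counts[t] for t in trigrams(sentence.lower().split()))


def pick_interpretation(candidates, counts):
    """Choose the candidate transcription whose trigrams are most familiar."""
    return max(candidates, key=lambda c: score(c, counts))


# Hypothetical toy corpus, invented for this sketch.
CORPUS = [
    "it is hard to recognize speech",
    "we want to recognize speech today",
    "the storm will wreck a nice beach",
]

if __name__ == "__main__":
    counts = train(CORPUS)
    # Two candidate word sequences for similar-sounding audio.
    candidates = [
        "it is hard to recognize speech",
        "it is hard to wreck a nice beach",
    ]
    print(pick_interpretation(candidates, counts))
```

The statistics only capture which words tend to follow each other; they know nothing about meaning, which is why the slide suggests a ‘deeper’ AI would do better.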
Searching Speech and Audio • Sound files do not afford easy opportunities for indexing and searching • Speech recognition can be used to transcribe speech files and create transcripts that can be searched like any other text file • So long as recognition accuracy is OK, which it isn't at the moment • Tune identification apps • Hum a bit of the tune and it tells you what it is! (e.g. SoundHound)
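Once a recognizer has produced a transcript, searching audio reduces to ordinary text indexing. The Python sketch below assumes the recognizer has already produced time-stamped segments; the data layout, file names, and function names are hypothetical, and punctuation handling is deliberately left out.

```python
from collections import defaultdict


def build_index(transcripts):
    """Build an inverted index from word to (file name, start time) pairs.

    transcripts maps a file name to a list of (start_seconds, text)
    segments, assumed to come from a speech recognizer.
    """
    index = defaultdict(list)
    for name, segments in transcripts.items():
        for start, text in segments:
            # set() so a word repeated within one segment is listed once.
            for word in set(text.lower().split()):
                index[word].append((name, start))
    return index


def search(index, word):
    """Return sorted (file name, start time) hits for a single word."""
    return sorted(index.get(word.lower(), []))


if __name__ == "__main__":
    # Hypothetical transcripts, invented for this sketch.
    transcripts = {
        "lecture.wav": [(0.0, "sound is vibration"),
                        (12.5, "timbre is harmonic structure")],
        "call.wav": [(3.0, "the sound was loud")],
    }
    index = build_index(transcripts)
    print(search(index, "sound"))
```

Keeping the start time with each hit lets a player jump straight to the right point in the sound file. Any recognition error in the transcript becomes a missed or false hit in the search.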
Summary • Describe the basics of human hearing • Explain the difference between visual and auditory interaction • Sound is transitory • Describe the classes and subclasses of sound output and the attributes of each • Non speech • Music • Earcons • Speech • Describe the classes and subclasses of sound input and recognition and the attributes of each • Speech • Transaction • Transcription