On the manifolds of spatial hearing. Vikas C. Raykar and Ramani Duraiswami, University of Maryland, College Park. NIPS 2006 Workshop on Novel Applications of Dimensionality Reduction, December 9, 2006.
Human spatial hearing • How are humans able to judge the direction of a sound source? • Why do we have two ears? • Why is the pinna shaped the way it is? 2
Plan of the talk • Human spatial hearing • Perceptual manifolds • Exploratory studies • Applications 3
How do humans localize a sound source? • Primary cues – Interaural Time Difference (ITD) – Interaural Level Difference (ILD) – These explain localization only in the horizontal plane. – All points on one half of a hyperboloid of revolution have the same ITD and ILD [cone of confusion]. • Other cues – Pinna shape gives elevation cues at higher frequencies. – Torso and head give elevation cues at lower frequencies. • An intricate system to model completely. [Figure: source, head, left ear, right ear] 4
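As a rough illustration of the two primary cues, the sketch below estimates ITD and ILD from a pair of ear signals. It is not taken from the talk: the function name `estimate_itd_ild`, the use of the cross-correlation peak for ITD, and the broadband energy ratio for ILD are all assumptions chosen for simplicity.

```python
import math

def estimate_itd_ild(left, right, sample_rate):
    """Estimate ITD (seconds) and ILD (dB) from two equal-length ear signals.

    Assumption: ITD is the lag that maximizes the cross-correlation, so a
    positive value means the right-ear signal lags the left-ear signal.
    Assumption: ILD is the broadband left-to-right energy ratio in decibels.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-(n - 1), n):
        # Correlate left[i] with right[i + lag] over the overlapping samples.
        score = sum(left[i] * right[i + lag]
                    for i in range(max(0, -lag), min(n, n - lag)))
        if score > best_score:
            best_lag, best_score = lag, score
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    ild_db = 10.0 * math.log10(energy_left / energy_right)
    return best_lag / sample_rate, ild_db

# Toy example: a click that reaches the right ear 3 samples later and at half amplitude.
left_ear = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
right_ear = [0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0]
itd_seconds, ild_db = estimate_itd_ild(left_ear, right_ear, sample_rate=44100)
```

Every source position on the same cone of confusion would give the same pair of values here, which is why these two cues alone cannot resolve elevation.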
It’s head, torso, and pinna 5
Head-Related Transfer Function (HRTF) • Spectral filtering caused by the head, torso, and pinna. • HRIR: head-related impulse response. • The HRIR can be measured experimentally for all elevations and azimuths. • Convolve the source signal with the measured HRIR to create virtual audio. 6
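The convolution step can be sketched as follows. This is a minimal pure-Python illustration, assuming one measured HRIR per ear for the desired direction; the names `convolve` and `render_binaural` are hypothetical, and a practical system would use an FFT-based convolution instead of the direct double loop.

```python
def convolve(signal, impulse_response):
    """Full linear convolution of two sequences (direct form, O(N*M))."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def render_binaural(mono, hrir_left, hrir_right):
    """Return (left, right) ear signals for a mono source filtered by one HRIR pair."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Toy example with very short, made-up impulse responses.
left_out, right_out = render_binaural([1.0, 2.0], [1.0, 0.5], [0.0, 1.0])
```

Played over headphones, the two output channels are what the listener would hear from a source at the direction where the HRIR pair was measured.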
Sample HRIR and HRTF: source directly in front of your right ear. 7
CIPIC Database • Public-domain HRIR database • HRIRs sampled at 1250 points around the head • 45 subjects • Anthropometry measurements • Reference: V. Ralph Algazi, Richard O. Duda, Dennis M. Thompson, Carlos Avendano, "The CIPIC HRTF database," in WASPAA '01 (2001 IEEE ASSP Workshop on Applications of Signal Processing to Audio and Acoustics, Mohonk Mountain House, New Paltz, NY, Oct. 8 2001).
Interaural polar coordinate system [Figure: azimuth and elevation angles] 9
Plan of the talk • Human spatial hearing • Perceptual manifolds • Exploratory studies • Applications 10
Manifold representation • If we can unfold this low-dimensional manifold, we have a good perceptual representation of the signal. 11
Our data matrix • Elevation manifold: 50 points in a d-dimensional space (d = 257 for the HRTF, d = 200 for the HRIR). • Data matrix size: 257 x 50 (HRTF) or 200 x 50 (HRIR). 13
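One way such a data matrix could be assembled is sketched below. The slide does not say how the 257-dimensional HRTF is obtained from the 200-sample HRIR; the sketch assumes the one-sided magnitude of a 512-point zero-padded DFT (512/2 + 1 = 257 bins). The function names and the direct DFT are illustrative only, and a real pipeline would use an FFT routine.

```python
import cmath
import math

def hrtf_magnitude(hrir, n_fft=512):
    """One-sided DFT magnitude of an HRIR, zero-padded to n_fft samples.

    Assumption: n_fft = 512, so the result has n_fft // 2 + 1 = 257 bins.
    Zero-padding is implicit because only the HRIR samples enter the sum.
    """
    bins = []
    for k in range(n_fft // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_fft)
                  for n, x in enumerate(hrir))
        bins.append(abs(acc))
    return bins

def data_matrix(vectors):
    """Stack per-direction feature vectors as columns: d rows by N columns."""
    return [list(row) for row in zip(*vectors)]

# Toy example: a 200-sample HRIR that is a pure 3-sample delay.
toy_hrir = [0.0] * 200
toy_hrir[3] = 1.0
toy_hrtf = hrtf_magnitude(toy_hrir)

# Real data would have 50 different elevations; the same vector is reused here
# only to show the resulting shape.
hrtf_matrix = data_matrix([toy_hrtf] * 50)
hrir_matrix = data_matrix([toy_hrir] * 50)
```

With these assumptions `hrtf_matrix` has 257 rows and 50 columns, and `hrir_matrix` has 200 rows and 50 columns, matching the sizes on the slide.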
Dimensionality reduction methods • We used the following four methods – Principal Component Analysis (PCA) – Locally Linear Embedding (LLE) – Isomap – Maximum Variance Unfolding (MVU) • We expect – The manifold to have an intrinsic dimensionality of 1. – The first embedded component to be monotonic with elevation. 14
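The monotonicity check can be illustrated for the simplest of the four methods. The sketch below computes only the first PCA component, using power iteration on the covariance, and tests whether it is monotonic along the sample order. The helper names and the synthetic data are assumptions for illustration; Isomap, LLE, and MVU are not shown.

```python
import math

def first_principal_component(samples, iterations=200):
    """Project samples (a list of equal-length vectors) onto their first principal axis.

    Power iteration on the covariance, applied implicitly as X^T (X v).
    Assumes the data are not constant and the start vector is not orthogonal
    to the leading axis.
    """
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    centered = [[s[j] - mean[j] for j in range(d)] for s in samples]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(d)) for c in centered]

def is_monotonic(values):
    """True if the sequence is strictly increasing or strictly decreasing."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return all(x > 0 for x in diffs) or all(x < 0 for x in diffs)

# Synthetic stand-in for data ordered by elevation: points along a straight line.
toy_samples = [[float(i), 2.0 * i, 0.5 * i] for i in range(10)]
toy_component = first_principal_component(toy_samples)
toy_is_monotonic = is_monotonic(toy_component)
```

For data on a straight line the first component is trivially monotonic; for a curved one-dimensional manifold a linear method can fold the ordering, which is the motivation for also trying the nonlinear methods.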
HRTF elevation manifold PCA 15
HRTF manifold Isomap (K=3) 16
HRTF manifold Isomap (K=2) 17
HRTF manifold LLE (K=3) 18
HRTF manifold LLE (K=2) 19
HRTF manifold MVU 20
HRTF manifold MVU 21
HRIR elevation manifold [Figure panels: PCA, Isomap, LLE, MVU] 22
Complete manifold • Azimuth -45:5:45, elevation -45:5:230 • We expect – The manifold to have an intrinsic dimensionality of 2. – The first two embedded components to show a grid-like structure. 23
Complete manifold PCA 24
Complete manifold LLE (K=4) 25
Complete manifold LMVU (K=4) 26
Complete manifold Isomap (K=4) 27
HRIR manifold [Figure panels: PCA, Isomap] • Data representation: manifold properties • LLE, MVU: numerical problems 28
Plan of the talk • Human spatial hearing • Perceptual manifolds • Exploratory studies • Applications 29
Problem 1: Interpolation • HRTFs are generally measured on a finite sampling grid of elevation and azimuth. • For a smooth virtual audio system we need to interpolate HRTFs. • HRTF measurement is a tedious and time-consuming process. – It normally takes an hour. – The subject must remain immobile. 30
Some preliminary results 31
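For reference, the simplest baseline for this problem is sample-wise linear interpolation between the two nearest measured directions, sketched below. This is not the approach studied in the talk; it is a naive comparison point with a hypothetical function name, and it ignores differences in onset delay between the two responses.

```python
def interpolate_hrir(elev, elev_a, hrir_a, elev_b, hrir_b):
    """Linearly blend two measured HRIRs for an elevation lying between them.

    Assumption: both HRIRs have the same length and are already time-aligned.
    """
    if elev_a == elev_b:
        raise ValueError("the two measured elevations must differ")
    if not (min(elev_a, elev_b) <= elev <= max(elev_a, elev_b)):
        raise ValueError("elev must lie between the two measured elevations")
    t = (elev - elev_a) / (elev_b - elev_a)
    return [(1.0 - t) * a + t * b for a, b in zip(hrir_a, hrir_b)]

# Toy example: halfway between measurements at 0 and 5 degrees elevation.
midpoint_hrir = interpolate_hrir(2.5, 0.0, [0.0, 1.0], 5.0, [1.0, 0.0])
```

A manifold-based scheme would instead interpolate along the low-dimensional embedding, where neighboring directions are expected to vary smoothly.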
Problem 2: Distance metric • How to compare any two given HRTFs? • Perceptually inspired metric • Psychoacoustical tests • Squared log-magnitude error • It is hard to decide which aspects of a given signal are perceptually relevant. • Use geodesic distance. 32
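The two distances named on this slide can be sketched as follows. The squared log-magnitude error is written here as a mean over frequency bins in dB squared, and the geodesic distance is approximated Isomap-style by shortest paths over a k-nearest-neighbor graph. Both choices, and all function names, are assumptions made for illustration; the talk does not specify its exact definitions.

```python
import math

def squared_log_magnitude_error(mag_a, mag_b):
    """Mean squared difference of two magnitude spectra on a dB scale."""
    return sum((20.0 * math.log10(a / b)) ** 2
               for a, b in zip(mag_a, mag_b)) / len(mag_a)

def euclidean(u, v):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def geodesic_distances(points, k):
    """All-pairs shortest-path distances over a symmetric k-nearest-neighbor graph."""
    n = len(points)
    inf = float("inf")
    dist = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for i in range(n):
        neighbors = sorted((j for j in range(n) if j != i),
                           key=lambda j: euclidean(points[i], points[j]))[:k]
        for j in neighbors:
            dist[i][j] = dist[j][i] = euclidean(points[i], points[j])
    # Floyd-Warshall relaxation; pairs in disconnected components stay at infinity.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][m] + dist[m][j] < dist[i][j]:
                    dist[i][j] = dist[i][m] + dist[m][j]
    return dist

# Toy example: 7 points on a unit semicircle, a curved one-dimensional manifold.
arc = [[math.cos(i * math.pi / 6), math.sin(i * math.pi / 6)] for i in range(7)]
end_to_end_euclidean = euclidean(arc[0], arc[6])
end_to_end_geodesic = geodesic_distances(arc, k=2)[0][6]
```

In the toy example the straight-line distance between the two ends cuts across the semicircle, while the graph distance follows the curve and stays close to the arc length, which is the property a manifold-based HRTF metric would rely on.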
Distance on the manifold 33
Problem 3: Customization • When an HRTF measured for one person is used for a different person, elevation perception is very poor. • Each person's ear shape and anatomy are unique. • Each person's localization ability is tuned to the shape of their own ears and anatomy. • This is a major bottleneck for the commercialization of spatial audio. 34
Style vs Content 35
Anthropometric measurements 36
Problem 4: Microphone calibration 37
Thank you! Questions? 38