Deep Room Recognition Using Inaudible Echos (UbiComp '18)

UbiComp '18, September 2018
Qun Song, Chaojie Gu, Rui Tan, Nanyang Technological University

ABSTRACT
- Mobile applications increasingly need indoor localization
- The speaker outputs a 2 ms single-tone inaudible chirp
- The microphone captures the echoes
- Recording covers a narrow inaudible band for 0.1 s
- Deep learning captures the subtle fingerprints of each room
- A two-layer convolutional neural network (CNN) achieves the best performance
- A RoomRecognize cloud service and client are designed
- Infrastructure-free, with no add-on hardware
- Robust against interfering sounds

Physical perspective

Software Framework

CONTENTS
01 INTRODUCTION
02 RELATED WORK
03 MEASUREMENT STUDY
04 DEEP ROOM RECOGNITION
05 DEEP ROOM RECOGNITION CLOUD SERVICE
06 PERFORMANCE EVALUATION
07 DISCUSSIONS
08 CONCLUSION

01 SECTION INTRODUCTION

1. INTRODUCTION
- Indoor localization
- Various sensing modalities: RF, visible light (VL), imaging, acoustics, geomagnetism
- Each sensing modality has its limitations
- This paper designs a room-level localization approach for off-the-shelf smartphones using their audio systems

1. INTRODUCTION
- Room-level localization is widely desirable
- Requirements of existing indoor localization approaches:
  - R1: dedicated or existing infrastructure
  - R2: add-on equipment
  - R3: a (training) process for data
- Acoustic-based room-level localization needs only R3
- The problem is cast as supervised multiclass classification
- Training data collection is easy (enter the room, click, key in the label) and needs no expertise

1. INTRODUCTION
- Existing indoor localization systems have incorporated acoustic sensing
- SurroundSense: only outperforms random guessing; uses a wide audible band; susceptible to ambient sounds
- Two basic challenges for a room recognition system:
  - Privacy concerns
  - Annoyance to the user (hence 20 kHz)
  - Limited information about the measured room

1. INTRODUCTION
- Emerging deep learning (DL) methods have been demonstrated in image classification, speech recognition, natural language processing (NLP), etc.
- This paper presents the design of a deep room recognition approach
- Experiments show that a two-layer CNN fed with the spectrogram of the captured inaudible echoes achieves the best performance
- Input: a 100 ms audio recording after a 2 ms, 20 kHz single-tone chirp

1. INTRODUCTION
- Contributions:
  - In-depth measurement study of rooms' acoustic responses to a short single-tone inaudible chirp
  - Design of the deep model
  - Evaluation of the approach in real-world settings
  - Engineering implementation

02 SECTION RELATED WORK

2. RELATED WORK
- Infrastructure-dependent
  - RF infrastructure: 802.11 [8, 18], cellular [20], FM radio [10], aircraft broadcast [15]
  - WALRUS: inaudible beacons to localize mobile devices
- Infrastructure-free
  - Geomagnetism [11], imaging [13], acoustics
- Acoustic approaches divide into:
  - Passive sensing: SurroundSense [36]
  - Active sensing
  - Semantic location [16, 24, 34]
  - RoomSense [31]: audible, 0.68 s, SVM, MFCC

2. RELATED WORK
- Active acoustic sensing
  - Ranging: Beep [30], SwordFight [40]
  - Moving object tracking: finger [38], breath [28], a human body using inaudible chirps [29]
  - Gesture recognition: SoundWave [17], Doppler-shifted reflections

03 SECTION MEASUREMENT STUDY

3. MEASUREMENT STUDY
- Measured lab rooms are labeled 'Lx'; open areas are labeled 'OA'

3.1 Passive Acoustic Sensing
- Confusion matrix of Batphone
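For readers unfamiliar with the term, a confusion matrix counts how often each true room is classified as each predicted room. The sketch below shows one way to build it. The function names and the room-label format are illustrative assumptions, not taken from the talk.

```python
def confusion_matrix(true_rooms, predicted_rooms, rooms):
    """Rows are true rooms, columns are predicted rooms (order given by `rooms`)."""
    index = {room: i for i, room in enumerate(rooms)}
    matrix = [[0] * len(rooms) for _ in rooms]
    for true_room, predicted_room in zip(true_rooms, predicted_rooms):
        matrix[index[true_room]][index[predicted_room]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. classified correctly."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0
```

Large off-diagonal counts indicate pairs of rooms that the classifier tends to confuse.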

3.2 Rooms' response to single-tone chirps
- Use the loudspeaker to emit an acoustic chirp and the microphone to capture the measured room's response
- Conduct a measurement study to obtain insights into the rooms' responses
- Emit a 2 ms chirp every 100 ms
- 44.1 kHz sampling rate
- The chirp does not overlap with echoes from objects at least 34 cm away
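The timing figures on this slide can be checked with a small sketch. It builds one 100 ms frame containing a 2 ms sine burst and computes the nearest reflector whose echo arrives after the chirp has ended. The 20 kHz tone is the frequency chosen later in the deck. The speed of sound (343 m/s) and all names are assumptions made for illustration.

```python
import math

SAMPLE_RATE_HZ = 44_100      # sampling rate from the slide
CHIRP_FREQ_HZ = 20_000       # single-tone frequency chosen later in the deck
CHIRP_DURATION_S = 0.002     # 2 ms chirp
FRAME_PERIOD_S = 0.100       # one chirp every 100 ms
SPEED_OF_SOUND_M_S = 343.0   # assumption: air at roughly 20 degrees C


def make_frame():
    """Return (frame, n_chirp): a 2 ms sine burst followed by silence."""
    n_frame = round(FRAME_PERIOD_S * SAMPLE_RATE_HZ)     # 4410 samples
    n_chirp = round(CHIRP_DURATION_S * SAMPLE_RATE_HZ)   # 88 samples
    frame = [0.0] * n_frame
    for n in range(n_chirp):
        frame[n] = math.sin(2.0 * math.pi * CHIRP_FREQ_HZ * n / SAMPLE_RATE_HZ)
    return frame, n_chirp


def min_non_overlap_distance_m():
    """Nearest reflector whose echo returns only after the chirp has ended."""
    # Sound travels to the reflector and back, hence the division by 2.
    return SPEED_OF_SOUND_M_S * CHIRP_DURATION_S / 2.0
```

Under these assumptions the distance is 0.343 m, which agrees with the 34 cm quoted on the slide.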

3.2 Rooms' response to single-tone chirps
- Existing active acoustic sensing uses sine-sweep chirps, maximum length sequences, or multi-tone chirps
- Proposal: use a single-tone inaudible chirp to avoid annoying the user and to improve the robustness of room recognition against interfering sounds

3.2 Rooms' response to single-tone chirps
- 20 kHz and 21. kHz chirps

3.2 Rooms' response to single-tone chirps
- The decreasing trend indicates that the audio system's performance decreases with frequency
- Choose 20 kHz, the lowest inaudible frequency
- Android Near Ultrasound Tests

3.2 Rooms' response to single-tone chirps
- 2 ms, 0.5 ms
- 97.5 ms echo data period
- 13.8 ms
- In L3 and outdoors
- Correlation relationship

3.2 Rooms' response to single-tone chirps
- Frequency-domain analysis
- Hypothesis: different rooms have different frequency responses
- 97.5 ms of echo data gives a frequency resolution of about 10.3 Hz
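The two numbers on this slide are linked by the standard relation between record length and DFT bin spacing: a record of duration T seconds resolves frequencies 1/T Hz apart. A minimal sketch of the arithmetic follows, with illustrative function names and the 44.1 kHz sampling rate taken from the earlier slide.

```python
SAMPLE_RATE_HZ = 44_100   # sampling rate from the measurement setup
ECHO_PERIOD_S = 0.0975    # 97.5 ms echo data period


def frequency_resolution_hz(duration_s):
    """Spacing between DFT bins for an unpadded record of the given duration."""
    return 1.0 / duration_s


def n_samples(duration_s, sample_rate_hz):
    """Number of samples in a record, rounded to the nearest integer."""
    return round(duration_s * sample_rate_hz)
```

Here `frequency_resolution_hz(ECHO_PERIOD_S)` is about 10.26 Hz, which rounds to the 10.3 Hz on the slide, and `n_samples(ECHO_PERIOD_S, SAMPLE_RATE_HZ)` is 4300, the record length used in Section 4.2.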

04 SECTION DEEP ROOM RECOGNITION

4.1 Background and Problem Statement
- Traditional classification algorithms: Bayes classifier and SVM
- Dimension reduction: MFCC, PLP coefficients

4.2 Raw Data Format/Deep Model
- PSD and spectrogram are two possible raw data formats for deep learning
- Apply FFT to the 4300 data points in the echo data
- Use the 147 points in [19.5, 20.5] kHz
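The band-limited PSD idea can be sketched as follows: compute the DFT only at the bins that fall inside the band of interest and keep their power. This is a plain-Python illustration under assumed names and normalization, not the authors' pipeline. A real implementation would normally use an FFT library.

```python
import cmath
import math


def band_power(samples, sample_rate_hz, f_lo_hz, f_hi_hz):
    """Return [(freq_hz, power)] for DFT bins with f_lo_hz <= freq <= f_hi_hz."""
    n = len(samples)
    k_lo = math.ceil(f_lo_hz * n / sample_rate_hz)
    k_hi = math.floor(f_hi_hz * n / sample_rate_hz)
    result = []
    for k in range(k_lo, k_hi + 1):
        step = -2j * math.pi * k / n
        # Direct DFT at bin k; O(n) per bin, fine for a narrow band.
        x_k = sum(s * cmath.exp(step * i) for i, s in enumerate(samples))
        # Assumed normalization: squared magnitude divided by record length.
        result.append((k * sample_rate_hz / n, abs(x_k) ** 2 / n))
    return result
```

The number of bins returned depends on the record length and on any zero padding, neither of which is specified on the slide, so the sketch does not try to reproduce the 147-point figure.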

4.2 Raw Data Format/Deep Model
- 22000 samples from 22 rooms; each sample is 100 ms long
- Split into training, validation, and testing sets
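A per-room (stratified) split keeps every room represented in all three sets. The sketch below is one way to do it. The 70/15/15 ratios, the fixed seed, and the function name are assumptions, since the slide does not state how the data was divided.

```python
import random


def stratified_split(labels, train_frac=0.7, val_frac=0.15, seed=0):
    """Split sample indices per room into train, validation, and test lists."""
    rng = random.Random(seed)
    by_room = {}
    for idx, room in enumerate(labels):
        by_room.setdefault(room, []).append(idx)

    train, val, test = [], [], []
    for room in sorted(by_room):
        idxs = by_room[room]
        rng.shuffle(idxs)                       # shuffle within each room
        n_train = round(len(idxs) * train_frac)
        n_val = round(len(idxs) * val_frac)
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]          # remainder goes to the test set
    return train, val, test
```

If the 22000 samples were spread evenly (1000 per room, which is also an assumption), these ratios would yield 700 training, 150 validation, and 150 testing samples per room.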

4.3 Hyperparameter Settings

05 SECTION DEEP ROOM RECOGNITION CLOUD SERVICE

5.1 System Overview
- RoomRecognize supports a participatory learning mode, in which the CNN is retrained when a mobile client uploads labeled training samples
- Existing studies show that smartphones and even lower-end IoT platforms can run deep models
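The participatory learning loop can be illustrated with a toy in-memory service: each labeled upload is stored and triggers retraining, after which recognition requests use the updated model. Everything here is hypothetical. The class and method names are invented, and a nearest-centroid classifier stands in for the CNN purely to keep the sketch short and runnable.

```python
import math


class ParticipatoryRoomService:
    """Toy stand-in for a cloud service that retrains on each labeled upload."""

    def __init__(self):
        self._samples = {}     # room label -> list of feature vectors
        self._centroids = {}   # room label -> mean feature vector (the "model")

    def upload_labeled(self, room, features):
        """Store a labeled sample from a client and retrain."""
        self._samples.setdefault(room, []).append(list(features))
        self._retrain()

    def _retrain(self):
        # Stand-in for CNN retraining: recompute one centroid per room.
        self._centroids = {
            room: [sum(column) / len(vectors) for column in zip(*vectors)]
            for room, vectors in self._samples.items()
        }

    def recognize(self, features):
        """Return the closest room label, or None if nothing has been learned."""
        if not self._centroids:
            return None
        return min(
            self._centroids,
            key=lambda room: math.dist(features, self._centroids[room]),
        )
```

Retraining on every upload keeps the sketch simple. A deployed service would more plausibly batch uploads before retraining, but the slide does not say which policy is used.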

06 SECTION PERFORMANCE EVALUATION

6.1 Evaluation Methodology
- Passive acoustic sensing performs poorly
- Therefore, RoomRecognize is compared only with RoomSense, another approach based on active acoustic sensing

07 SECTION DISCUSSIONS

7 DISCUSSIONS
- People moving in the target room affect RoomRecognize's performance because of the human body's effect on the echoes
- To address this issue, other sensing modalities such as geomagnetism may be used in the future
- Similar rooms are worth further study

08 SECTION CONCLUSION

8 CONCLUSION
- To address the challenge of the limited information carried by a room's response in such a narrow band, deep learning is applied to capture the subtle differences in rooms' responses
- A two-layer CNN fed with the spectrogram of the echo
- RoomRecognize

Software Framework

Thank you!