
Group 48: Bird Audio Classification
Hadar Abukrat (A11731118), Indu Korambath (A12652139)

Background
● Apply deep learning to assist in classifying birds by their sounds
● Better classification of birds by their sounds would help researchers understand certain species and the environments they live in, especially in matters concerning conservation status

Literature survey
● Stowell, D., Wood, M. D., Pamuła, H., Stylianou, Y., & Glotin, H. (2018, November 3). Automatic acoustic detection of birds through deep learning: The first Bird Audio Detection challenge. Retrieved from https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.13103
● Martinsson, J. (2017). Bird Species Identification using Convolutional Neural Networks. [Master’s thesis, Chalmers University of Technology and University of Gothenburg]
● Kahl, S., Wilhelm-Stein, T., Hussein, H., Klinck, H., Kowerko, D., Ritter, M., & Eibl, M. (2017). Large-Scale Bird Sound Classification using Convolutional Neural Networks.
● xeno-canto.org
(Image: Common Blackbird)

Dataset details: xeno-canto
● 10 species, selected by us based on the number and quality of recordings
● Data downloaded from the command line using a GitHub script
● 600 samples per bird type, each around 10 seconds of audio
1. Common Blackbird
2. European Robin
3. Eurasian Wren
4. House Sparrow
5. Great Spotted Woodpecker
6. Eurasian Skylark
7. Tawny Owl
8. Common Nightingale
9. Northern Raven
10. Eurasian Jay
(Image: European Robin)

Data Preprocessing (Python)
● Torchaudio
  ○ torchaudio.load
  ○ num_frames = 227556
● Waveform channels
● Noise Filtering
  ○ High-pass filter
  ○ Low-pass filter
● Resampling
(Image: Eurasian Wren)
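The preprocessing steps listed on this slide can be illustrated with a minimal, dependency-free sketch. It is not the group's torchaudio code: the function names, the first-order (RC) filter design, the 22050 Hz sample rate, and the 500 Hz and 8000 Hz cutoffs are all assumptions chosen for illustration. Resampling is left out.

```python
import math

def to_mono(channels):
    """Average equal-length channels sample by sample into one mono channel."""
    n = len(channels)
    return [sum(frame) / n for frame in zip(*channels)]

def fix_length(samples, num_frames):
    """Trim or zero-pad a mono signal to exactly num_frames samples."""
    return samples[:num_frames] + [0.0] * max(0, num_frames - len(samples))

def low_pass(samples, sample_rate, cutoff_hz):
    """First-order (RC) low-pass filter: attenuates content above cutoff_hz."""
    dt = 1.0 / sample_rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out, prev = [], 0.0
    for x in samples:
        prev = prev + alpha * (x - prev)
        out.append(prev)
    return out

def high_pass(samples, sample_rate, cutoff_hz):
    """First-order (RC) high-pass filter: attenuates content below cutoff_hz."""
    dt = 1.0 / sample_rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + dt)
    out, prev_y, prev_x = [], 0.0, 0.0
    for i, x in enumerate(samples):
        y = x if i == 0 else alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_y, prev_x = y, x
    return out

# Hypothetical two-channel clip; real clips would come from torchaudio.load.
stereo = [[0.0, 1.0, 0.5], [1.0, 1.0, -0.5]]
mono = to_mono(stereo)      # [0.5, 1.0, 0.0]
clip = fix_length(mono, 5)  # [0.5, 1.0, 0.0, 0.0, 0.0]
# Assumed sample rate and cutoffs; the slides do not state the values used.
filtered = low_pass(high_pass(clip, 22050, 500.0), 22050, 8000.0)
```

Chaining a high-pass and a low-pass filter keeps a band of frequencies, which is one plausible reading of the "Noise Filtering" bullet; sharper filters would be needed in practice.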

Data Preprocessing (Matlab)
● Audio Datastore Toolbox
  ○ audioread
● Merging channels into mono
● Creating a table of all the data to go into diagnosticFeatureDesigner
(Image: House Sparrow)

Feature Extraction
Python: Torchaudio
● Spectrogram: visual representation of the frequencies of each signal over time
● Mel-Spectrogram: a spectrogram with its frequencies converted to the mel scale, which approximates how pitch is perceived and so gives an idea of the perceived distances between pitches in the audio
● MFCC: MFCC features represent phonemes (distinct units of sound), because the shape of the vocal tract, which is responsible for sound generation, is manifest in them
(Figures: Spectrogram, Mel-Spectrogram)
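To make the mel-scale conversion concrete, here is a minimal sketch using the HTK-style formula, one common convention for the mel scale. The slides do not say which variant was used, and the function names and the 0 to 8000 Hz range below are assumptions.

```python
import math

def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(f_min, f_max, n_mels):
    """Return n_mels + 2 frequencies (Hz) equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_mels + 2)]

# Band edges for a hypothetical 4-band mel filterbank over 0 to 8000 Hz.
edges = mel_band_edges(0.0, 8000.0, 4)
```

The edges are evenly spaced in mels but grow wider apart in Hz, which is why a mel-spectrogram gives more resolution to low frequencies than to high ones.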

Models
● Random Forest
● CNN
  ○ MNIST
    ■ https://keras.io/examples/mnist_cnn/
  ○ Urban Sound
    ■ https://medium.com/gradientcrescent/urban-sound-classification-using-convolutional-neural-networks-with-keras-theory-and-486e92785df4
  ○ CubeRun (as seen in Martinsson thesis)
  ○ Large Scale Bird Classification CNNs
(Image: Great Spotted Woodpecker)

Random Forest
Features:      All Features   Spectrogram   Mel-Spectrogram   MFCC
Fit to test:   0.651          0.6325        0.6316            0.463
Fit to train (three values given): 0.99958, 0.99979, 0.99958


CNN
MNIST:
  Conv2D, 32 x 3 x 3, relu
  Conv2D, 64 x 3 x 3, relu
  MaxPooling2D, size 2
  Dropout, 0.5
  Flatten
  Dense, 128, relu
  Dropout, 0.5
  Dense, 10, softmax
  Accuracy: 0.25583333
Urban Sound Classification:
  Conv2D, 32 x 3 x 3, zero-padding
  Dense, 32, relu
  Conv2D, 64 x 3 x 3
  Dense, 64, relu
  MaxPooling2D, size 2
  Dropout, 0.25
  Conv2D, 64 x 3 x 3, zero-padding
  Dense, 64, relu
  Conv2D, 64 x 3 x 3
  Dense, 64, relu
  MaxPooling2D, size 2
  Dropout, 0.5
  Conv2D, 128 x 3 x 3, zero-padding
  Dense, 128, relu
  Conv2D, 128 x 3 x 3
  Dense, 128, relu
  MaxPooling2D, size 2
  Dropout, 0.5
  Flatten
  Dense, 512, relu
  Dropout, 0.5
  Dense, 10, softmax
  Accuracy: 0.30476192
(Image: Eurasian Skylark)
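As a rough aid for reading the MNIST-style stack, the sketch below traces tensor shapes through its convolution and pooling layers. It assumes stride-1 convolutions without padding and non-overlapping pooling; the slides do not state these settings or the input size, so the input shapes and helper names here are hypothetical.

```python
def conv2d_valid(shape, filters, kernel):
    """Output shape (h, w, c) of a stride-1 Conv2D with no padding."""
    h, w, _ = shape
    return (h - kernel + 1, w - kernel + 1, filters)

def max_pool(shape, size):
    """Output shape of non-overlapping max pooling (stride equals size)."""
    h, w, c = shape
    return (h // size, w // size, c)

def trace_mnist_cnn(input_shape):
    """Return the pooled feature-map shape and the flattened feature count."""
    shape = conv2d_valid(input_shape, 32, 3)  # Conv2D, 32 x 3 x 3
    shape = conv2d_valid(shape, 64, 3)        # Conv2D, 64 x 3 x 3
    shape = max_pool(shape, 2)                # MaxPooling2D, size 2
    flat = shape[0] * shape[1] * shape[2]     # Flatten (Dropout keeps shape)
    return shape, flat

def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units

# 28 x 28 x 1 is the MNIST image size, not the spectrogram size used here.
print(trace_mnist_cnn((28, 28, 1)))    # ((12, 12, 64), 9216)
# Hypothetical 128 x 128 single-channel spectrogram input.
shape, flat = trace_mnist_cnn((128, 128, 1))
print(shape, flat, dense_params(flat, 128))
```

With only one pooling layer, the flattened vector grows quickly with input size, so most of the parameters sit in the first Dense layer when the input is a large spectrogram.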

CubeRun
From John Martinsson's GitHub repository for bird-species-classification
CubeRun:
  BatchNormalization
  Conv2D, 64 x 5 x 5, relu
  MaxPooling2D, size 2, strides 2
  BatchNormalization
  Conv2D, 128 x 5 x 5, relu
  MaxPooling2D, size 2, strides 2
  BatchNormalization
  Conv2D, 256 x 3 x 3, relu
  MaxPooling2D, size 2, strides 2
  BatchNormalization
  Flatten
  Dropout, 0.4
  Dense, 1024, relu
  Dropout, 0.4
  Dense, 10, softmax
  Accuracy: 0.24583334

CNN Models from Literature
(Image: Common Nightingale)

Summary of Results/Observations
Ranking   Model                                    Accuracy
1         Random Forest Classifier                 ~65%
2         CNN Large Scale Classification Model 1   ~48%
3         CNN Urban Sound Model                    ~30%
4         CNN MNIST Model                          ~25%
5         CNN CubeRun Model                        ~24%
6         CNN Large Scale Classification Model 3   ~23%
7         CNN Large Scale Classification Model 2   ~10%
(Image: Northern Raven)
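One way to read these accuracies is against chance level. With 10 species and 600 samples each, uniform guessing would score about 10%, assuming the test split is also balanced (the slides do not say). The sketch below uses the approximate accuracies from the ranking; the helper names are illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def chance_accuracy(n_classes):
    """Expected accuracy of uniform random guessing on balanced classes."""
    return 1.0 / n_classes

# Approximate accuracies as reported in the ranking.
reported = {
    "Random Forest Classifier": 0.65,
    "CNN Large Scale Classification Model 1": 0.48,
    "CNN Urban Sound Model": 0.30,
    "CNN MNIST Model": 0.25,
    "CNN CubeRun Model": 0.24,
    "CNN Large Scale Classification Model 3": 0.23,
    "CNN Large Scale Classification Model 2": 0.10,
}

chance = chance_accuracy(10)
for name, acc in reported.items():
    print(f"{name}: {acc:.2f} ({acc / chance:.1f}x chance)")
```

On this reading, Large Scale Classification Model 2 is at roughly chance level, while the other CNNs are about 2 to 5 times better than guessing.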

Further work
● Train CNNs with other data features
● Implement CNN models from the literature that are specific to bird audio data
  ○ Improve accuracy on those
● Improve Matlab feature extraction and move on to training models
(Image: Tawny Owl)

References
● https://www.xeno-canto.org/
● https://github.com/ntivirikin/xeno-canto-py
● Kevin Chng (2020). Classify Urban Sound using Machine Learning & Deep Learning (https://www.github.com/KevinChngJY/classifyurbansound_matlab), GitHub. Retrieved June 2, 2020.
  ○ https://www.mathworks.com/matlabcentral/fileexchange/73857-classify-urban-sound-using-machine-learning-deep-learning
From Literature Survey
● https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.13103
● https://schlieplab.org/Static/Publications/2017-JohnMartinsson-BirdSongs.pdf
● http://ceur-ws.org/Vol-1866/paper_143.pdf

Presentation References
● Common Blackbird: By David Friel from Telford, England - Young Blackbird. Uploaded by Snowmanradio, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=6978796
● European Robin: By © Francis C. Franklin / CC-BY-SA-3.0, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=31367900
● Eurasian Wren: By Andreas Trepte - Own work, CC BY-SA 2.5, https://commons.wikimedia.org/w/index.php?curid=22796287
● House Sparrow: By Adamo, CC BY 2.0 de, https://commons.wikimedia.org/w/index.php?curid=15188318
● Great Spotted Woodpecker: By Hangsna - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=46082936
● Eurasian Skylark: By Daniel Pettersson - Picture taken by Daniel Pettersson. Uploaded to commons by oskila with his permission. File taken from http://www.fagelfoto.se, CC BY-SA 2.5 se, https://commons.wikimedia.org/w/index.php?curid=1722926
● Tawny Owl: By Martin Mecnarowski (http://www.photomecan.eu/) - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=12691242
● Common Nightingale: By Carlos Delgado - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=40703551
● Northern Raven: By T. Müller - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=2792927

Code Run-down snippets

THANK YOU