Realtime Recognition of Whale Calls using Sound ID





















- Slides: 21
Real-time Recognition of Whale Calls using Sound. ID Neil J Boucher, Sound. ID. Australia Michihiro Jinnai, Nagoya Women’s University. Japan
Who Are We? Sound. ID www. soundid. net Founded 2002 Engineers specialising in sound recognition Aim to produce human expert and better quality recognition Designed to run in real-time Designed to handle terabytes of data with ease Have our own patented methods
Sound. ID Mk I in glorious 1 -d!
1 -d Recognition Results monitoring a Dawn Chorus over 70 minutes
Birds are Easy, Whales are Hard The 1 -d method works well for most bird calls as just shown. Frogs and bats also easy. Whales are harder because the 1/f noise and other background noises “smear” the 1 -d image. Worse still, most of the whale noises are in-band.
2 -d Depiction of Same Call
Two Methods Compared
The “Bottom of the Coke Bottle” or “FFT-like” View of the Same Call at Frame-Width 513 Points
Geometric Distance Euclidean Distance is the simple straight line (first order) distance between two points. Geometric Distance (measured in angular degrees) is the angle between two vectors that we use to describe the difference in shape. It is a fourth order measure, vaguely related to the skew of a standard distribution.
The Measure of Geometric Distance As a measure of similarity we can use the GD between two n-dimensional shapes to measure their similarity. Two identical shapes have a GD = zero Two totally dissimilar shapes have a GD of 90 degrees. The spectra of two sounds that sound “similar” to the human ear can have a GD in the range 0 to about 6 degrees.
Value of Geometric Distance Once we have a library of reference sounds we can compare that library with the detected target sound. Each sound in the library is compared in turn with the target and the one with the lowest GD is declared the closest match. Then, if the lowest GD is small enough to be declared a match (typically GD<6 degrees) the target has been identified!
More is Better Notice now that the bigger the reference library the better the chance of having a matching reference and the greater the accuracy. Notice also that the reference library is not restricted to sounds that are in some way related any group of sounds can comprise part of the library.
Comparison of Two Adjacent Calls. Note no Filter used or Needed!
Same Calls Zoomed In for a Closer Look. Note the smaller GD Freq x 3. 3
A Cluster Analysis Based on GD = 6 degrees. Note 6 clusters!
Two Sweeps that are not Clustered (672 and 438 seconds) 672 x 3. 3 f 438 x 3. 3 f
Analysis of 524 Gunshot Sounds from File 20080126 Running a cluster of 526 Gunshots each with each other results in 275, 625 comparisons being made. Sorting these into GD = 6 Clusters, we find they fall into 64 groups. A few are unique with only one member while large clusters of 120 or more also found.
The Cluster Analysis is Automated
An Example of Two Very Different Gunshots Identified with the Cluster Analysis 23450 -863375 161500 -58796
Progress to Date The 2 -d Spectrum recognition module is complete and tested. It works well and can match a human expert for accuracy. The speed of recognition has been found to be fast enough for real-time recognition. The detection module is under development. Release date is late 2013.
Find Us At did. net www. so