AFP System Flowchart

Landmark Identification

Hash Table Construction
  • Computation of hash key and hash value
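The slides do not give the exact key and value layout, so the following is only a minimal sketch of one common landmark-hashing scheme. It assumes each landmark is a tuple (t1, f1, f2, dt), that the key packs f1, the frequency difference, and dt into one integer, and that the value stores the song ID with the landmark start time. The bit widths and the names `hash_key` and `build_table` are hypothetical.

```python
from collections import defaultdict

# Hypothetical bit widths; the slides do not specify them.
F1_BITS, DF_BITS, DT_BITS = 9, 8, 7


def hash_key(f1, f2, dt):
    """Pack a landmark's (f1, f2 - f1, dt) into a single integer key."""
    f1_part = f1 % (1 << F1_BITS)
    df_part = (f2 - f1) % (1 << DF_BITS)  # wrap the difference into DF_BITS
    dt_part = dt % (1 << DT_BITS)
    return (f1_part << (DF_BITS + DT_BITS)) | (df_part << DT_BITS) | dt_part


def build_table(songs):
    """songs: {song_id: [(t1, f1, f2, dt), ...]}
    Returns {hash_key: [(song_id, t1), ...]}."""
    table = defaultdict(list)
    for song_id, landmarks in songs.items():
        for t1, f1, f2, dt in landmarks:
            # The hash value keeps what retrieval needs: song ID and start time.
            table[hash_key(f1, f2, dt)].append((song_id, t1))
    return table
```

Keeping the start time in the value is what later allows the retrieval step to check whether matched landmarks are time-consistent.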

Retrieval Process
  • Convert query landmarks to hash keys
  • Retrieve the hash values
  • Derive song ID and landmark start time
  • Find the number of time-consistent landmarks (matched landmark count, MLC)
  • Use MLC for final ranking
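The retrieval steps can be sketched as offset voting: every hash hit votes for a (song ID, time offset) pair, and a song's matched landmark count is taken as its largest vote for a single offset. This is a sketch under assumptions, not the authors' implementation; the table layout and the name `rank_by_mlc` are hypothetical.

```python
from collections import Counter


def rank_by_mlc(table, query_keys):
    """table: {hash_key: [(song_id, song_time), ...]}
    query_keys: [(hash_key, query_time), ...]
    Returns [(song_id, mlc), ...] sorted by descending MLC."""
    votes = Counter()
    for key, q_time in query_keys:
        for song_id, s_time in table.get(key, []):
            # Landmarks from the same true match share one time offset.
            votes[(song_id, s_time - q_time)] += 1

    best = {}
    for (song_id, _offset), count in votes.items():
        best[song_id] = max(best.get(song_id, 0), count)

    # Highest MLC first; ties broken by song ID for a stable order.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))
```

For example, if two query landmarks hit song 444 with the same offset while song 496 only receives hits at scattered offsets, song 444 is ranked first with an MLC of 2.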

Offset Time
  Song ID   Offset time   Hash key
  444       17            54372
  …         …             …
  496       158           17936
  …         …             …

Match Frequency Count 1 (MFC 1)

Match Frequency Count 2 (MFC 2)
  • MFC 2: almost the same as MFC 1, with fewer restrictions
  • Restriction: within the right interval only

Comparison between MFC 1 and MFC 2
MFC 1
  • Within the right interval
  • Same hash key as the query landmarks
  • No need to store extra information
MFC 2
  • Within the right interval
  • May have a different hash key from the query landmarks
  • Needs to store extra information
  • More discriminative

Learning to Rank
Use machine learning for ranking, with three paradigms:
  • Pointwise approach: A is right and B is wrong. Ex.: PRank
  • Pairwise approach: A>B, C>D, A>D, etc. Ex.: Ranking SVM
  • Listwise approach: A>B>C>D… Ex.: ListNet
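To make the three paradigms concrete, the sketch below shows what the training signal might look like in each case for one candidate list. The feature vectors and labels are hypothetical (the slides do not say which features are used), and the functions only prepare data; they are not PRank, Ranking SVM, or ListNet themselves.

```python
import math
from itertools import combinations


def pointwise_examples(items):
    """Pointwise: each candidate is labelled on its own."""
    return [(features, label) for features, label in items]


def pairwise_examples(items):
    """Pairwise: each pair with different labels yields a feature
    difference and a sign saying which candidate should rank higher."""
    pairs = []
    for (xa, ya), (xb, yb) in combinations(items, 2):
        if ya != yb:
            diff = [a - b for a, b in zip(xa, xb)]
            pairs.append((diff, 1 if ya > yb else -1))
    return pairs


def top_one_probabilities(scores):
    """Listwise: softmax over the scores of a whole list, the
    top-one probability model that ListNet compares between lists."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

A pairwise learner such as Ranking SVM would then train a classifier on the difference vectors, while a listwise learner would minimise the cross-entropy between the top-one distributions of the labels and of the predicted scores.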

Experimental Settings
  OS                     Windows 7 Enterprise, 64-bit
  RAM                    8 GB main memory
  CPU                    Intel® Core™ i7-4770 (3.40 GHz)
  Programming language   MATLAB

Corpora for the Experiments
Datasets
                   Baina                   George
  Size             500 songs               10000
  File duration    3-10 minutes            30 sec to 10 minutes
  Total duration   38 hours 22 minutes     636 hours 41 minutes
  Languages (Baina): Mandarin and English
  Languages (George): 952 from the GTZAN dataset plus 9048 other noisy mp3 files, in English and Mandarin
  Audio format: mono/stereo, mp3/wav, 44.1 kHz, 16 bits
Query sets
                   Baina                   George
  Size             1412                    1062
  Total duration   3 hours 55 minutes      2 hours 57 minutes
  Query duration: about 10 sec
  Source (Baina): recordings of 5 clips in a very noisy environment, chopped into 1042 10-sec segments (with 9-sec overlap)
  Source (George): recordings of 345 clips in a noisy environment, chopped into 1062 10-sec segments (without overlap)

Experimental Results Using Baina Dataset
  • Re-ranking is invoked when the difference between the MLCs of the top-2 candidates is larger than 15
  • Only the top-10 candidates are re-ranked
  Method        Accuracy (%)   Error reduction rate (%)
  Original      86.83          -
  MFC 1         89.02          16.63
  MFC 2         91.78          37.59
  Ranking SVM   91.997         39.23
  ListNet       91.997         39.23
[Example recording]
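The slides do not define the error reduction rate, but the Baina figures are consistent with the usual relative reduction in error, (baseline error - new error) / baseline error. A minimal sketch under that assumption, with the hypothetical helper name `error_reduction_rate`:

```python
def error_reduction_rate(baseline_acc, new_acc):
    """Relative reduction in error rate, in percent.

    Both accuracies are given in percent (0-100)."""
    baseline_err = 100.0 - baseline_acc
    new_err = 100.0 - new_acc
    return 100.0 * (baseline_err - new_err) / baseline_err
```

With the original accuracy of 86.83% as the baseline, `error_reduction_rate(86.83, 89.02)` rounds to 16.63 and `error_reduction_rate(86.83, 91.78)` rounds to 37.59, matching the MFC 1 and MFC 2 rows.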

Experimental Results Using George Dataset
  • Use the same condition for re-ranking
  Method        Recognition rate (%)   Error reduction rate (%)
  Original      83.52                  -
  MFC 1         84.46                  5.71
  MFC 2         85.78                  13.71
  Ranking SVM   85.88                  14.29
  ListNet       85.78                  13.71