DOST Dataset: Downtown Osaka Scene Text Dataset
Masakazu Iwamura, Takahiro Matsuda, Naoyuki Morimoto, Hitomi Sato, Yuki Ikeda and Koichi Kise
Osaka Prefecture University
Agenda
1. Introduction
2. Unique Features of DOST Dataset
3. Construction of DOST Dataset
4. Known Issues
5. Evaluation
6. Conclusion
Recent Improvement of Scene Text Recognition
[Figure: bar chart of recognition accuracy (0 to 100%) for published methods, labeled by author and year and ending in 2016. Series: IIIT5K-50, IIIT5K-1k, IIIT5K-None, SVT-50, SVT-None, ICDAR 2003-50, ICDAR 2003-Full, ICDAR 2003-50k.]
• Recent results are above 80%, or even above 90%
• This does not mean that these methods can read the wide variety of text found in the real environment
Text in Real Environment: “Scene Text in the Wild”
• By this we mean:
  • Text captured without intention (as far as possible)
  • Text not filtered to keep only what is easy to read (with regard to resolution, capture angle and so on)
We present the DOST dataset
Unique Features of DOST Dataset
1. Aim: evaluation of methods in the real environment
  • Not intended for training classifiers, unlike the MJSynth and SynthText datasets
2. Captured completely without intention
  • The most similar dataset is the ICDAR 2015 Challenge 4 “incidental scene text” dataset, captured with Google Glass
  • DOST is free even from the influence of face direction
Unique Features of DOST Dataset
3. Video dataset captured with an omnidirectional camera
  • ICDAR 2013 & 2015 Challenge 3: single direction
  • YouTube Video Text (YVT) dataset: YouTube videos
4. Contains multiple images of a single word
Unique Features of DOST Dataset
5. Large scale
  • Contains the largest number of word images
  • Excluding synthesized datasets (MJSynth and SynthText)
  • Excluding a dataset containing numbers only (Google Street View House Numbers dataset)
No. of Images Contained in Existing Datasets
• Image DB
  • ICDAR 2003: 509
  • ICDAR 2013 Chal. 2: 462
  • ICDAR 2015 Chal. 4: 1,670
  • NEOCR: 659
  • KAIST: 3,000
  • SVT: 349
  • IIIT5K: 5,000
  • COCO-Text: 63,686
• Video DB
  • ICDAR 2013 Chal. 3: 15,277
  • ICDAR 2015 Chal. 3: 27,824
  • YVT: 11,791
  • DOST: 32,147
• Chart annotation: “Almost double”
No. of Word Images Contained in Existing Datasets
• Image DB
  • ICDAR 2003: 2,268
  • ICDAR 2013 Chal. 2: 2,524
  • ICDAR 2015 Chal. 4: 17,548
  • NEOCR: 5,238
  • KAIST: 3,000
  • SVT: 904
  • IIIT5K: 5,000
  • COCO-Text: 173,589
• Video DB
  • ICDAR 2013 Chal. 3: 93,598
  • ICDAR 2015 Chal. 3: 125,141
  • YVT: 16,620
  • DOST: 797,919 (x 4.6)
• Images were captured in shopping streets, where a lot of text exists
No. of Word Sequences in Existing Video Datasets
• ICDAR 2013 Chal. 3: 1,962
• ICDAR 2015 Chal. 3: 3,562
• YVT: 245
• DOST: 22,398 (x 6.3)
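The multipliers annotated on the two charts (x 4.6 and x 6.3) appear to compare DOST with the next largest dataset in each chart. The following minimal sketch checks that reading using a subset of the counts listed on the slides; the function name `ratio_to_runner_up` is hypothetical and used only for illustration.

```python
# Counts copied from the slides (word images and word sequences).
word_images = {
    "COCO-Text": 173_589,
    "ICDAR 2013 Chal. 3": 93_598,
    "ICDAR 2015 Chal. 3": 125_141,
    "YVT": 16_620,
    "DOST": 797_919,
}
word_sequences = {
    "ICDAR 2013 Chal. 3": 1_962,
    "ICDAR 2015 Chal. 3": 3_562,
    "YVT": 245,
    "DOST": 22_398,
}


def ratio_to_runner_up(counts, target="DOST"):
    """Return (name of the largest other dataset, target count / its count)."""
    others = {name: n for name, n in counts.items() if name != target}
    runner_up = max(others, key=others.get)
    return runner_up, counts[target] / others[runner_up]


for label, counts in (("word images", word_images), ("word sequences", word_sequences)):
    name, ratio = ratio_to_runner_up(counts)
    print(f"{label}: DOST is x{ratio:.1f} of {name}")
```

Under this reading, the word-image multiplier is relative to COCO-Text and the word-sequence multiplier is relative to ICDAR 2015 Chal. 3.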
Unique Features of DOST Dataset
6. Contains Japanese characters
  • On the other hand, many non-Japanese words are also contained
No. of Ground Truthed Characters per Category
• Alphabet: 837,489
• Kanji: 723,805 (Japanese characters)
• Katakana: 696,697 (Japanese characters)
• Hiragana: 355,158 (Japanese characters)
• Digit: 324,742
• Symbol: 22,802
No. of Ground Truthed Characters per Category (with example characters)
• Alphabet: 837,489
• Kanji: 723,805, e.g. 日本店円大中四業房会北月千元年間販売酒家取台止
• Katakana: 696,697, e.g. アイウエオカキクケコサシスセソタチツテト
• Hiragana: 355,158, e.g. あいうえおかきくけこさしすせそたちつてと
• Digit: 324,742
• Symbol: 22,802, e.g. ~!#&()*,-./:?×’↑→★、。々〇」・
• Kanji, Katakana and Hiragana are Japanese characters
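A per-category character count like the one above can be approximated from ground-truth transcriptions by classifying each character by its Unicode code point. This is only a sketch: the code point ranges and the function names `char_category` and `count_categories` are assumptions, and the rules actually used for DOST are not given on the slides.

```python
from collections import Counter


def char_category(ch):
    """Map one character to a slide category (assumed Unicode ranges)."""
    cp = ord(ch)
    if ch.isascii() and ch.isalpha():
        return "Alphabet"
    if ch.isascii() and ch.isdigit():
        return "Digit"
    if 0x3041 <= cp <= 0x3096:   # Hiragana letters
        return "Hiragana"
    if 0x30A1 <= cp <= 0x30FA:   # Katakana letters
        return "Katakana"
    if 0x4E00 <= cp <= 0x9FFF:   # CJK Unified Ideographs
        return "Kanji"
    return "Symbol"              # everything else, e.g. 々 〇 ・ as on the slide


def count_categories(words):
    """Count characters per category over a list of transcriptions."""
    return Counter(char_category(ch) for word in words for ch in word)


counts = count_categories(["大阪", "たこ焼き", "カフェ", "OPEN", "24"])
print(dict(counts))
```

With these ranges, 々, 〇 and ・ fall into "Symbol", which matches the examples on the slide; other boundary cases (full-width letters, the long vowel mark ー) would need an explicit decision.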
Construction of DOST Dataset
1. Image capture (completed in 2012)
  • Point Grey Research Ladybug3
  • 1,200 x 1,600 pixels, 6.5 fps
Construction of DOST Dataset
2. Manual ground truthing (we spent more than 1,500 man-hours)
  • Most GT policies are shared with the ICDAR 2013 & 2015 Challenge 3 datasets
  • GT software was developed
    • It reuses GT information from neighboring frames
3. Privacy preservation
  • Faces were blurred
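The slides do not say how the GT software reuses information from neighboring frames. One common approach in video annotation tools, shown here purely as an assumed illustration, is to annotate a word's bounding box in two frames and linearly interpolate it for the frames in between, so the annotator only corrects the result. The function name and the (x, y, width, height) box layout are hypothetical.

```python
def interpolate_box(box_a, box_b, frame_a, frame_b, frame):
    """Linearly interpolate an (x, y, w, h) box between two annotated frames.

    box_a is the annotation at frame_a, box_b the annotation at frame_b,
    and frame is the index of an in-between frame to fill in.
    """
    if frame_a >= frame_b or not frame_a <= frame <= frame_b:
        raise ValueError("expected frame_a <= frame <= frame_b with frame_a < frame_b")
    t = (frame - frame_a) / (frame_b - frame_a)
    return tuple(a + t * (b - a) for a, b in zip(box_a, box_b))


# A word annotated at frames 10 and 14; estimate its box at frame 12.
print(interpolate_box((100, 50, 40, 20), (120, 60, 40, 20), 10, 14, 12))
```

Linear interpolation is only a starting point; with an omnidirectional camera moving through a street, boxes would still need manual adjustment.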
Known Issues (we will improve them)
• Ground truths are not perfect
  • Bounding boxes of text regions are not tight enough
  • Ground truthing of “Don’t care” is not comprehensive (“Don’t care” is marked on illegible regions)
  • Some word sequences are broken
• Relationship between cameras
  • Word images appearing in other cameras are not tracked
Evaluation: Methods
• Text detection
  • OpenCV API
  • Matsuda’s method, based on the NAT method
• End-to-end text recognition
  • Google Vision API
Evaluation: Datasets
• Image datasets
  • ICDAR 2003
  • ICDAR 2013 Chal. 2
  • ICDAR 2015 Chal. 4
  • SVT
  • COCO-Text
• Video datasets
  • ICDAR 2015 Chal. 3
  • YVT
  • DOST Latin: the subset of DOST that contains words consisting of alphabet letters and digits
• Data were sampled
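The Latin subset is described as the words consisting of alphabet letters and digits. A minimal sketch of such a filter over ground-truth transcriptions follows; restricting to ASCII letters and digits, and rejecting any word that contains a symbol, are assumptions, and the names `is_latin_word` and `latin_subset` are hypothetical.

```python
import re

# Assumption: "alphabets and digits" means ASCII letters A-Z, a-z and digits 0-9.
LATIN_WORD = re.compile(r"[A-Za-z0-9]+")


def is_latin_word(transcription):
    """True if the whole transcription is made of ASCII letters and digits."""
    return LATIN_WORD.fullmatch(transcription) is not None


def latin_subset(transcriptions):
    """Keep only the words that qualify for the Latin subset."""
    return [t for t in transcriptions if is_latin_word(t)]


print(latin_subset(["OPEN", "24", "カフェ", "Cafe24", "大阪", "A-1"]))
```

Here "A-1" is dropped because of the hyphen; whether the actual subset tolerates symbols inside a word is not stated on the slides.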
Text Detection by OpenCV API: F-measure [%]
• Image DB
  • ICDAR 2003: 18.7
  • ICDAR 2013 Chal. 2: 6.1
  • ICDAR 2015 Chal. 4: 13
  • SVT: 19
  • COCO-Text: 11.9
• Video DB
  • ICDAR 2015 Chal. 3: 8.5
  • YVT: 2.4
  • DOST Latin: 1.2
End-to-end Text Recognition by Google Vision API: F-measure [%]
• Image DB
  • ICDAR 2003: 81.8
  • ICDAR 2013 Chal. 2: 71.3
  • ICDAR 2015 Chal. 4: 48.5
  • SVT: 24.2
  • COCO-Text: 17.1
• Video DB
  • ICDAR 2015 Chal. 3: 44.1
  • YVT: 37.7
  • DOST Latin: 2.7 and 11.2 (two bars; the chart annotates one as “Recognized in Japanese mode”)
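Both charts report the F-measure, the harmonic mean of precision and recall. The slides do not state the matching protocol, so the sketch below assumes a common one for text detection: a detection counts as correct if its intersection over union (IoU) with a not yet matched ground-truth box is at least 0.5. The greedy matching, the threshold and all names are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def f_measure(detections, ground_truths, threshold=0.5):
    """Greedily match detections to ground truths and return the F-measure."""
    unmatched = list(ground_truths)
    true_positives = 0
    for det in detections:
        # Pick the remaining ground-truth box that overlaps this detection most.
        best = max(unmatched, key=lambda gt: iou(det, gt), default=None)
        if best is not None and iou(det, best) >= threshold:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(detections) if detections else 0.0
    recall = true_positives / len(ground_truths) if ground_truths else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(0, 0, 10, 10), (50, 50, 60, 60), (100, 100, 110, 110)]
print(f_measure(dets, gts))  # precision 1/3, recall 1/2
```

For end-to-end recognition, a match would additionally require the recognized string to equal the ground-truth transcription, and regions marked “Don’t care” are usually excluded from both counts; neither refinement is shown here.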
Conclusion
• The DOST dataset is presented
  • It has unique features
  • It is more challenging than existing datasets
Thank you for your attention!!