Visual Recognition With Humans in the Loop. ECCV 2010, Crete, Greece. Steve Branson, Catherine Wah, Florian Schroff, Boris Babenko, Serge Belongie, Peter Welinder, Pietro Perona 1
What type of bird is this? 2
Field Guide What type of bird is this? …? 3
Computer Vision ? What type of bird is this? 4
Computer Vision Bird? What type of bird is this? 5
Computer Vision What type of bird is this? Chair? Bottle? 6
• Field guides are difficult for average users • Computer vision doesn’t work perfectly (yet) • Research mostly on basic-level categories Parakeet Auklet 7
Visual Recognition With Humans in the Loop What kind of bird is this? Parakeet Auklet 8
Levels of Categorization Basic-Level Categories Airplane? Chair? Bottle? … [Griffin et al. ’07, Lazebnik et al. ’06, Grauman et al. ’06, Everingham et al. ’06, Felzenszwalb et al. ’08, Viola et al. ’01, …] 9
Levels of Categorization Subordinate Categories American Goldfinch? Indigo Bunting? … [Belhumeur et al. ’08, Nilsback et al. ’08, …] 10
Levels of Categorization Parts and Attributes Yellow Belly? Blue Belly? … [Farhadi et al. ’09, Lampert et al. ’09, Kumar et al. ’09] 11
Visual 20 Questions Game Blue Belly? no Cone-shaped Beak? yes Striped Wing? yes American Goldfinch? yes Hard classification problems can be turned into a sequence of easy ones 12
Recognition With Humans in the Loop Computer Vision Cone-shaped Beak? yes Computer Vision American Goldfinch? yes • Computers: reduce number of required questions • Humans: drive up accuracy of vision algorithms 13
Research Agenda 2010 Heavy Reliance on Human Assistance Blue belly? no Cone-shaped beak? yes Striped Wing? yes American Goldfinch? yes 2015 More Automated Computer Vision Improves Striped Wing? yes American Goldfinch? yes Fully Automatic American Goldfinch? yes 2025 14
Field Guides www.whatbird.com 15-16
Example Questions 17-22
Basic Algorithm. With computer vision: input image → maximize expected information gain → Question 1: Is the belly black? A: NO → maximize expected information gain → Question 2: Is the bill hooked? A: YES → … 23
Without Computer Vision. A class prior replaces the vision estimate: input image → maximize expected information gain → Question 1: Is the belly black? A: NO → maximize expected information gain → Question 2: Is the bill hooked? A: YES → … 24
Basic Algorithm. Select the next question that maximizes expected information gain: • Easy to compute if we can estimate probabilities of the form p(c | x, u_1, …, u_t), where c is the object class, x is the image, and u_1, …, u_t is the sequence of user responses 25
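Below is a minimal Python sketch (not the authors' code) of the question-selection rule on this slide: among the questions not yet asked, pick the one whose expected information gain over the class posterior is largest. The posterior p(c | x, u_1, …, u_t) and the per-question answer model p(answer | class) are assumed to be given as arrays.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def expected_information_gain(posterior, answer_model_q):
    """posterior[c] = p(c | x, u_1..u_t); answer_model_q[c, a] = p(answer a | class c)."""
    h_before = entropy(posterior)
    gain = 0.0
    for a in range(answer_model_q.shape[1]):
        p_a = np.dot(posterior, answer_model_q[:, a])    # p(answer a), marginalized over classes
        if p_a == 0:
            continue
        post_a = posterior * answer_model_q[:, a] / p_a  # posterior if the user answered a
        gain += p_a * (h_before - entropy(post_a))
    return gain

def select_next_question(posterior, answer_models, asked):
    """Return the index of the unasked question with maximum expected information gain."""
    candidates = [q for q in range(len(answer_models)) if q not in asked]
    return max(candidates, key=lambda q: expected_information_gain(posterior, answer_models[q]))
```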
Basic Algorithm. p(c | x, u_1, …, u_t) ∝ p(u_1, …, u_t | c) · p(c | x): the model of user responses times the computer vision estimate, divided by a normalization factor 26-27
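A small sketch of the posterior factorization on this slide, assuming user responses are conditionally independent of the image given the class: the computer vision estimate p(c | x) is multiplied by p(u_i | c) for each response, then renormalized (the normalization factor). The array layout mirrors the answer model above and is an assumption, not the authors' data structure.

```python
import numpy as np

def class_posterior(cv_estimate, answer_models, responses):
    """cv_estimate[c] = p(c | x); responses = list of (question_index, answer_index) pairs."""
    scores = np.asarray(cv_estimate, dtype=float)   # computer vision estimate p(c | x)
    for q, a in responses:
        scores = scores * answer_models[q][:, a]    # multiply in the user-response model p(u_i | c)
    return scores / scores.sum()                    # divide by the normalization factor
```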
Modeling User Responses • Assume user responses are independent given the class: p(u_1, …, u_t | c) = ∏_i p(u_i | c) • Estimate p(u_i | c) using Mechanical Turk [Figure: distribution of MTurk answers to “What is the color of the belly?” for Pine Grosbeak, over colors (grey, red, black, white, brown, blue) and confidence levels (Guessing, Probably, Definitely)] 28
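One plausible way to estimate these per-class answer distributions from Mechanical Turk labels (an assumption, not necessarily the authors' exact procedure) is to count answers per (question, class) pair and apply Laplace smoothing so that unseen answers keep a small nonzero probability; the Guessing/Probably/Definitely confidence levels could be folded into the answer space or used as weights.

```python
import numpy as np

def estimate_answer_models(mturk_labels, n_questions, n_classes, n_options, alpha=1.0):
    """mturk_labels: iterable of (question_id, class_id, answer_id) tuples.
    Returns an array of shape (n_questions, n_classes, n_options) with p(answer | class)."""
    counts = np.full((n_questions, n_classes, n_options), alpha)  # Laplace smoothing prior
    for q, c, a in mturk_labels:
        counts[q, c, a] += 1.0
    return counts / counts.sum(axis=2, keepdims=True)
```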
Incorporating Computer Vision • Use any recognition algorithm that can estimate p(c|x) • We experimented with two simple methods: one-vs-all SVM and attribute-based classification [Lampert et al. ’09, Farhadi et al. ’09] 29
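As a hedged illustration of the one-vs-all SVM option (the feature choice and probability calibration here are assumptions, not the authors' exact pipeline), one SVM per class can be trained on precomputed image features, with the per-class decision values mapped to a rough p(c | x) by a softmax:

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.multiclass import OneVsRestClassifier

def train_one_vs_all(features, labels):
    """features: (n_images, n_dims) array of image descriptors; labels: class ids."""
    return OneVsRestClassifier(LinearSVC(C=1.0)).fit(features, labels)

def cv_estimate(model, feature_vector):
    """Map per-class SVM decision values to an approximate p(c | x) via a softmax."""
    scores = model.decision_function(feature_vector.reshape(1, -1))[0]
    exp_scores = np.exp(scores - scores.max())
    return exp_scores / exp_scores.sum()
```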
Incorporating Computer Vision • Used VLFeat and MKL code plus color features • Kernels: geometric blur, self-similarity, color SIFT, SIFT, color histograms, color layout, combined via multiple kernels over bag-of-words and spatial pyramid representations [Vedaldi et al. ’08, Vedaldi et al. ’09] 30
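A hedged sketch of combining several precomputed feature kernels in the spirit of the multiple-kernel setup cited above; uniform kernel averaging is a simplification of MKL, which learns the kernel weights.

```python
import numpy as np
from sklearn.svm import SVC

def combine_kernels(kernels, weights=None):
    """kernels: list of (n, n) precomputed kernel matrices; returns their weighted sum."""
    weights = np.ones(len(kernels)) / len(kernels) if weights is None else np.asarray(weights)
    return sum(w * K for w, K in zip(weights, kernels))

def train_on_combined_kernel(kernels, labels):
    """Train an SVM directly on the combined precomputed kernel."""
    return SVC(kernel="precomputed", C=1.0).fit(combine_kernels(kernels), labels)
```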
Birds 200 Dataset • 200 classes, 6000+ images, 288 binary attributes • Why birds? Examples: Black-footed Albatross, Groove-Billed Ani, Parakeet Auklet, Arctic Tern, Forster’s Tern, Common Tern, Field Sparrow, Vesper Sparrow, Baird’s Sparrow, Henslow’s Sparrow 31-33
Results: Without Computer Vision Comparing Different User Models 34
Results: Without Computer Vision Perfect Users: 100% accuracy in about log2(200) ≈ 8 questions, if users’ answers agree with field guides… 35
Results: Without Computer Vision Real users answer questions; MTurkers don’t always agree with field guides… 36-37
Results: Without Computer Vision Probabilistic User Model: tolerate imperfect user responses 38
Results: With Computer Vision 39
Results: With Computer Vision Users drive performance: 19% → 68% (just computer vision: 19%) 40
Results: With Computer Vision Reduces manual labor: 11.1 → 6.5 questions 41
Examples: Different Questions Asked With and Without Computer Vision. Western Grebe. Without computer vision, Q#1: Is the shape perching-like? no (definitely). With computer vision, Q#1: Is the throat white? yes (definitely) 42
Examples: User Input Helps Correct Computer Vision. True class: Magnolia Warbler. Computer vision alone predicts Common Yellowthroat; after the answer “Is the breast pattern solid? no (definitely)”, the prediction becomes Magnolia Warbler 43
Recognition is Not Always Successful. Acadian Flycatcher vs. Least Flycatcher: confused even with unlimited questions. Parakeet Auklet vs. Least Auklet: misled by the answer “Is the belly multicolored? yes (definitely)” 44
Summary • Recognition of fine-grained categories • Users drive up performance (from 19% with computer vision alone) • Computer vision reduces manual labor (11.1 → 6.5 questions) • More reliable than field guides 45-48
Future Work • Extend to domains other than birds • Methodologies for generating questions • Improve computer vision 49
Questions? Project page and datasets available at: http://vision.caltech.edu/visipedia/ and http://vision.ucsd.edu/project/visipedia/ 50