Visipedia: Visual Recognition with Humans in the Loop

- Slides: 1
Steve Branson¹, Catherine Wah¹, Boris Babenko¹, Florian Schroff¹, Peter Welinder², Pietro Perona², Serge Belongie¹
¹ Computer Science and Engineering, University of California, San Diego
² Electrical Engineering, California Institute of Technology
{sbranson, cwah, bbabenko, gschroff, sjb}@cs.ucsd.edu, {welinder, perona}@caltech.edu

[Panel titles: Modeling User Responses; Interactive Object Recognition]

Abstract
We introduce Visipedia, a user-generated encyclopedia of visual knowledge that is intended to enrich the content of Wikipedia. Visual data is the predominant sensory input through which people observe the world: people are visual learners, and images are fundamental to how people encode knowledge and perceive the world. Unfortunately, the organization of visual content on the web is still very impoverished. This is in large part due to the raw size and complexity of images and the lack of scalable computer vision algorithms capable of automatically recognizing or organizing images on a semantic level. The shortcomings of computer vision algorithms can in part be explained by a shortage in the quantity and quality of the labeled images needed to train machine learning algorithms. We propose a collaborative effort between computers and humans toward the development of Visipedia, in which the initial user-generated population of Visipedia helps train machine learning algorithms, which in turn help automate the process of building Visipedia. Toward this aim, we propose new paradigms for interactive algorithms that combine computer vision with user input, richer representations of visual objects than are traditionally studied in computer vision, and learning algorithms that scale better to Internet-scale recognition.

What is Visipedia?
• Visual counterpart to Wikipedia
• User-generated encyclopedia of visual knowledge
• An effort to associate Wikipedia articles with large quantities of well-organized, intuitive visual concepts
• A paradigm for combining computer vision and machine learning with human annotation

Motivation
Dealing with Many Related Classes
(A) Easy for Humans: Chair? Airplane? …
(B) Hard for Humans: Finch? Bunting? …
(C) Easy for Humans: Yellow Belly? Blue Belly? …

Caltech Birds-200 Dataset
• Image harvesting: text search of species name on Flickr
• Data cleaning: identifying bird presence/absence with Amazon Mechanical Turk (“a marketplace for work that requires human intelligence” [http://www.mturk.com])
• Visual attributes from http://www.whatbird.com
  - Attribute classification tasks might be easier
  - Easier to incorporate human knowledge
• Attribute labeling: MTurk interface

MTurker Label Certainty
User Responses are Stochastic

Attribute-based Classification
[Figure example: Rose-breasted Grosbeak]
Q: Is the belly red? yes (Def.)
Q: Is the breast black? yes (Def.)
Q: Is the primary color red? yes (Def.)

Visual 20-Questions Game
• Choose question to maximize expected information gain
[Figure labels: Input Image; Computer Vision]
Question 1: Is the belly black? A: NO
Question 2: Is the bill hooked? A: YES
Q: Is the belly multicolored? yes (Def.)

Adding Computer Vision Helps
• Computer vision reduces manual labor
• Computer vision improves performance
• Different questions are asked with and without computer vision
[Figure examples: Western Grebe, Rose-breasted Grosbeak, Yellow-headed Blackbird]
Only CV
CV + Q #1: Is the crown black? yes (Def.)
w/ vision: Q #1: Is the throat white? yes (Def.)
w/o vision: Q #1: Is the shape perching-like? no (Def.)

Recognition is not Always Successful
Need for more training data
Need for more realistic data
[Figure examples: Parakeet Auklet, Least Auklet, Sayornis, Gray Kingbird, Indigo Bunting, Blue Grosbeak]

MTurker Feedback
“These hits were fun. Will you be posting more of them anytime soon? Thanks!”
“These are Beautiful birds and I am enjoying this hit collection”
“I really enjoy doing your hits, they are fun and interesting. Thanks.”
“Love doing these because I'm a bird watcher.”
“the birds are so cute.. hope u can send more kind of birds”
“I REALLY LOVE THE COLOR OF THE BIRDS.”
“Thank you for providing this job. The fact that the images are beautiful to look at make it a lot more enjoyable to do!”
• Hourly Wage ≈ $1.25
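
The question-selection rule from the Visual 20-Questions Game panel ("choose question to maximize expected information gain") can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the three classes are unnamed placeholders, every probability below is made up for the example, the function names are hypothetical, and answers are assumed to be binary and conditionally independent given the class. The class belief starts from a prior (uniform without computer vision, or a hypothetical classifier output with it), and each noisy user answer is folded in with Bayes' rule.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def update(belief, p_yes, answer_is_yes):
    """Bayes update of the class belief after one stochastic yes/no answer.

    p_yes[c] is the assumed probability that a user answers "yes" when the
    true class is c (users are noisy, so values are not exactly 0 or 1).
    """
    likelihood = [p if answer_is_yes else 1.0 - p for p in p_yes]
    unnormalized = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def expected_information_gain(belief, p_yes):
    """Current entropy minus the expected entropy after hearing the answer."""
    p_answer_yes = sum(b * p for b, p in zip(belief, p_yes))
    expected_entropy = 0.0
    for answer_is_yes, p_answer in ((True, p_answer_yes),
                                    (False, 1.0 - p_answer_yes)):
        if p_answer > 0:
            posterior = update(belief, p_yes, answer_is_yes)
            expected_entropy += p_answer * entropy(posterior)
    return entropy(belief) - expected_entropy

def pick_question(belief, response_model, asked=()):
    """Return the not-yet-asked question with the largest expected gain."""
    remaining = [q for q in response_model if q not in asked]
    return max(remaining,
               key=lambda q: expected_information_gain(belief, response_model[q]))

# Hypothetical response model over three placeholder classes (made-up numbers).
RESPONSE_MODEL = {
    "Is the shape perching-like?": [0.95, 0.05, 0.05],
    "Is the throat white?": [0.20, 0.80, 0.20],
}
NO_VISION_PRIOR = [1 / 3, 1 / 3, 1 / 3]  # no image evidence: uniform belief
VISION_PRIOR = [0.05, 0.50, 0.45]        # hypothetical classifier output

print("w/o vision:", pick_question(NO_VISION_PRIOR, RESPONSE_MODEL))
print("w/ vision: ", pick_question(VISION_PRIOR, RESPONSE_MODEL))
```

With these illustrative numbers the first question differs between the two priors: the uniform prior favors the sharper shape question, while the vision prior has nearly ruled out the first class and so favors the question that separates the two remaining candidates. This mirrors the poster's observation that different questions are asked with and without computer vision, although the specific values carry no empirical meaning.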