Advanced Computer Vision Devi Parikh Electrical and Computer Engineering

Plan for today
• Topic overview
• Introductions
• Course overview:
  – Logistics
  – Requirements
• Placing this course in context of others
  – Plan for next lecture
• Please interrupt at any time with questions or comments

Computer Vision
• Automatic understanding of images and video
  – Computing properties of the 3D world from visual data (measurement)
  – Algorithms and representations to allow a machine to recognize objects, people, scenes, and activities (perception and interpretation)
  – Algorithms to mine, search, and interact with visual data (search and organization)
Kristen Grauman

What does recognition involve? Fei-Fei Li

Detection: are there people?

Activity: What are they doing?

Object categorization: mountain, tree, building, banner, street lamp, vendor, people

Instance recognition: Potala Palace, a particular sign

Scene and context categorization • outdoor • city • …

Attribute recognition: gray, made of fabric, crowded, flat

Why recognition?
• Recognition is a fundamental part of perception
  – e.g., robots, autonomous agents
• Organize and give access to visual content
  – Connect to information
  – Detect trends and themes
• Where are we now?
Kristen Grauman

We’ve come a long way…

Posing visual queries
• Yeh et al., MIT
• Belhumeur et al.
• Kooaba, Bay & Quack et al.
Kristen Grauman

Exploring community photo collections
• Snavely et al.
• Simon & Seitz
Kristen Grauman

Dubrovnik AutoTagger: Yunpeng Li, Noah Snavely, Dan Huttenlocher and Pascal Fua
Slide credit: Devi Parikh

Autonomous agents able to detect objects
http://www.darpa.mil/grandchallenge/gallery.asp
Kristen Grauman

We’ve come a long way… Fischler and Elschlager, 1973

Challenges

Challenges: robustness
• Illumination
• Object pose
• Clutter
• Occlusions
• Intra-class appearance
• Viewpoint
Kristen Grauman

Challenges: context and human experience
• Context cues
• Function
• Dynamics
Kristen Grauman. Video credit: J. Davis

Challenges: scale, efficiency
• Half of the cerebral cortex in primates is devoted to processing visual information
• ~20 hours of video added to YouTube per minute
• ~5,000 new tagged photos added to Flickr per minute
• Thousands to millions of pixels in an image
• 30+ degrees of freedom in the pose of articulated objects (humans)
• 3,000-30,000 human-recognizable object categories
Kristen Grauman

Challenges: learning with minimal supervision
Amount of supervision, from less to more:
• Unlabeled, multiple objects
• Classes labeled, some clutter
• Cropped to object, parts and classes labeled
Kristen Grauman

Slide from Pietro Perona, 2004 Object Recognition workshop

Inputs in 1963…
L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.
Kristen Grauman

… and inputs today
• Personal photo albums
• Surveillance and security
• Movies, news, sports
• Medical and scientific images
Slide credit: L. Lazebnik

… and inputs today
• 350 mil. photos, 1 mil. added daily
• 1.6 bil. images indexed as of summer 2005
• 916,271 titles
• 10 mil. videos, 65,000 added daily
• Images on the Web
• Satellite imagery
• Movies, news, sports
• City streets
Understand, organize, and index all this data!
Slide credit: L. Lazebnik

Introductions
• Devi Parikh
  – Ph.D., Carnegie Mellon University, 2009
  – Research Assistant Professor, TTI-Chicago, 2013
  – Assistant Professor, ECE, Virginia Tech

Introductions
• Which program are you in? How far along?
• Have you taken a computer vision course before?
• Have you taken a machine learning course before?
• What are you hoping to get out of this class?

This course
• ECE 6554
• TR 5:00 pm to 6:15 pm
• Lavery Hall Room 345
• Course webpage: https://filebox.ece.vt.edu/~S16ECE6554/

This course
• Focus on more advanced techniques and ideas in computer vision
• Presented in research papers
• High-level recognition problems, innovative applications

Goals
• Understand state-of-the-art approaches
• Analyze and critique current approaches
• Identify interesting open questions
• Present clearly and methodically

Official Learning Objectives
• Describe state-of-the-art approaches in object recognition and scene understanding
• Discuss tools from other fields (e.g., machine learning) that are frequently used to solve computer vision problems
• Implement two approaches to address important problems in computer vision
• Discuss and critique research papers in computer vision
• Identify open research questions in computer vision

Expectations
• Paper reviews each class [25%]
• Leading discussion (~ twice) on papers [15%]
• Presentations (~ once) [25%]
  – Present topic
  – Papers and background reading
• Project [35%]
No “Assignments”, Exams, etc.
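The four weights above sum to 100%, so an overall score can be read as a weighted average of the components. The sketch below is only an illustration: it assumes each component is scored on a 0-100 scale (the slide does not say), and the names `WEIGHTS` and `course_score` are hypothetical.

```python
# Component weights as listed on the Expectations slide.
WEIGHTS = {
    "paper_reviews": 0.25,
    "leading_discussion": 0.15,
    "presentation": 0.25,
    "project": 0.35,
}


def course_score(scores):
    """Weighted average of component scores (assumed to be on a 0-100 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up scores, purely for illustration.
example = {
    "paper_reviews": 90.0,
    "leading_discussion": 80.0,
    "presentation": 85.0,
    "project": 95.0,
}
print(round(course_score(example), 2))
```

Keeping the weights in one dictionary makes it easy to check that they sum to 1 and to see that the project is the single largest component.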

Prerequisites
• Course in computer vision
• A course in machine learning is a plus

Paper reviews
• For each class
  – Review one paper
• Email me reviews by noon (12:00 pm) the day of the class
  – firstname_lastname_MM_DD.pdf
  – I will grade a random subset in detail
• Skip reviews for the classes in which you are presenting or leading discussion
• Late reviews will not be accepted
• Will drop three lowest grades on reviews
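The file-naming rule and the drop-three-lowest policy can be sketched in a few lines of Python. Several things here are assumptions rather than stated course policy: that `MM_DD` is the zero-padded date of the class, that the remaining review grades are simply averaged, and the helper names `review_filename` and `review_average`. The student name, date, and grades are arbitrary examples.

```python
from datetime import date


def review_filename(first_name, last_name, class_date):
    """Build firstname_lastname_MM_DD.pdf (MM_DD assumed to be the class date)."""
    return f"{first_name}_{last_name}_{class_date:%m_%d}.pdf"


def review_average(grades, dropped=3):
    """Mean of the review grades after dropping the `dropped` lowest ones.

    Averaging the remaining grades is an assumption; the slide only says
    that the three lowest grades are dropped.
    """
    if len(grades) <= dropped:
        raise ValueError("need more grades than the number dropped")
    kept = sorted(grades)[dropped:]
    return sum(kept) / len(kept)


print(review_filename("jane", "doe", date(2016, 2, 4)))
print(review_average([10, 0, 7, 9, 0, 8, 4]))
```

Under this reading, two skipped reviews and one weak one (the 0, 0, and 4 in the example) would not affect the review grade.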

Paper review guidelines
• One page
• Detailed review:
  – Brief (2-3 sentences) summary
  – Main contribution
  – Strengths? Weaknesses?
  – How convincing are the experiments? Suggestions to improve them?
  – Extensions? Applications?
  – Additional comments, unclear points
  – Relationships observed between the papers we are reading
• Most interesting thought
• Write in your own words
• Write well, proofread

Leading Discussion
• ~ One of you will be assigned to argue for the paper
• ~ One of you will be assigned to argue against the paper
• Come prepared with 5 points

Presentation guidelines
• IMPORTANT: Don’t present papers – present the topic!
• Do a lit review and look at background papers (e.g., “seeds / pointers for presenters”), and also more recent work.
• Well-organized talk, 30 minutes

Presentation guidelines
• What to cover?
  – Topic overview, motivation
  – One or two papers in detail (NOT the paper the class has read)
    • Problem overview, motivation
    • Algorithm explanation, technical details
    • Experimental setup, results
    • Strengths, weaknesses, extensions
  – Any commonalities, important differences between techniques covered in the papers.
• A demo / experiment would be great
• See class webpage for more details.

Experiments
• Implement/download code for a main idea in the paper and evaluate it:
  – Experiment with different types of training/testing data sets
  – Evaluate sensitivity to important parameter settings
  – Show an example to analyze a strength/weakness of the approach
  – Show qualitative and quantitative results

Tips
• Look up papers and authors. Their webpage may have data, code, slides, videos, etc.
  – Make sure talk flows well and makes sense as a whole.
  – Cite ALL sources.
• Don’t forget the high-level picture.
• Give a very clear, well-organized, and well-thought-out talk.
• Will interrupt if something is not clear

Tips
• Make sure you are saying everything we need to know to understand what you are saying.
• Make sure you know what you are talking about.
• Think about your audience.
• Make your talks visual, animated (images, video, not lots of text).

Projects
Possibilities:
  – Extension of a technique studied in class
  – Analysis and empirical evaluation of an existing technique
  – Comparison between two approaches
  – Design and evaluate a novel approach
  – Be creative!
Can work with a partner

Project timeline
• Project proposals (1 page) [25%]
  – March 1st
• Final presentations [40%]
  – April 19th to 26th
• Project video (1 minute) [35%]
  – April 28th
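The three percentages above sum to 100%, which suggests they are shares of the project grade rather than of the course grade. Combining them with the 35% project weight from the Expectations slide gives each deliverable's share of the overall grade. This is a sketch under that assumed interpretation, with hypothetical names `PROJECT_WEIGHT`, `PROJECT_PARTS`, and `course_shares`.

```python
# Project share of the course grade (Expectations slide).
PROJECT_WEIGHT = 0.35

# Shares of the project grade listed on the Project timeline slide
# (assumed to be fractions of the project grade, not of the course grade).
PROJECT_PARTS = {
    "proposal": 0.25,
    "final_presentation": 0.40,
    "video": 0.35,
}


def course_shares():
    """Fraction of the overall course grade carried by each project deliverable."""
    return {name: PROJECT_WEIGHT * share for name, share in PROJECT_PARTS.items()}


for name, share in course_shares().items():
    print(f"{name}: {share:.2%} of the course grade")
```

Under this reading the one-page proposal alone would carry roughly 8.75% of the course grade.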

Implementation
• Use any language / platform you like
• No support for code / implementation issues will be provided

Miscellaneous
• Best presentation, best project and best discussion prizes!
  – We will vote
  – Dinner
• Feedback welcome and useful

Coming up
• Read the class webpage
  – Schedule is up
  – Tentative, especially because of likely snow days
• Select 6 dates (topics) you would like to present
  – Sign up sheet shows how many people have already signed up for a topic
  – Select those that have fewer selections
  – “Bonus” for presenters next week.
  – Probability of dropping class?
• I will send pointers to good presentations, reviews, etc. Already on class webpage.

Context
• Intro to computer vision
  – Intro to machine learning
• Course last semester on Deep Learning for Perception:
  – https://computing.ece.vt.edu/~f15ece6504/
  – This course is complementary to it
• Ram’s presentation on Thursday
  – Individual presenters will touch on state-of-the-art in each

Each Lecture
• ~ 20 minute discussion on paper we read
  – Led by two students: “for” and “against”
• ~ 30 minute presentation on topic
• ~ 25 minutes for questions, interruptions, unplanned discussions

Questions? See you Thursday!
