UW Computer Vision (CSE 576)

UW Computer Vision (CSE 576)
Staff
• Steve Seitz (seitz@cs.washington.edu)
• Rick Szeliski (szeliski@microsoft.com)
• TA: Jiun-Hung Chen (jhchen@cs.washington.edu)
Web Page
• http://www.cs.washington.edu/education/courses/cse576/08sp/

Today
• Introduction
• Computer vision overview
• Course overview
• Image processing
Readings
• Book: Richard Szeliski, Computer Vision: Algorithms and Applications
  – (please check Web site weekly for updated drafts)
  – Introduction: Chapter 1.0

1.1 What Is Computer Vision?
• Optical character recognition (OCR): reading handwritten postal codes on letters (Figure 1.4a) and automatic number plate recognition (ANPR);
• Machine inspection: rapid parts inspection for quality assurance using stereo vision with specialized illumination to measure tolerances on aircraft wings or auto body parts (Figure 1.4b) or looking for defects in steel castings using X-ray vision;
• Retail: object recognition for automated checkout lanes (Figure 1.4c);
• 3D model building (photogrammetry): fully automated construction of 3D models from aerial photographs used

1.1 What Is Computer Vision?
• Medical imaging: registering pre-operative and intra-operative imagery (Figure 1.4d) or performing long-term studies of people’s brain morphology as they age;
• Automotive safety: detecting unexpected obstacles such as pedestrians on the street, under conditions where active vision techniques such as radar or lidar do not work well (Figure 1.4e; see also Miller, Campbell, Huttenlocher et al. (2008); Montemerlo, Becker, Bhat et al. (2008); Urmson, Anhalt, Bagnell et al. (2008) for examples of fully automated driving);

1.1 What Is Computer Vision?
• Match move: merging computer-generated imagery (CGI) with live action footage by tracking feature points in the source video to estimate the 3D camera motion and shape of the environment.
• Such techniques are widely used in Hollywood (e.g., in movies such as Jurassic Park) (Roble 1999; Roble and Zafar 2009);
• They also require the use of precise matting to insert new elements between foreground and background elements (Chuang, Agarwala, Curless et al. 2002).

1.1 What Is Computer Vision?
• Motion capture (mocap): using retro-reflective markers viewed from multiple cameras or other vision-based techniques to capture actors for computer animation;
• Surveillance: monitoring for intruders, analyzing highway traffic (Figure 1.4f), and monitoring pools for drowning victims;
• Fingerprint recognition and biometrics: for automatic access authentication as well as forensic applications.

What Is Computer Vision?
Does human vision (left) or computer vision (right) perform better?
Terminator 2
How can we write a computer program to tell that the left side is a real human and the right side is a robot?

Every Picture Tells a Story
Black and white photo by H. Roger-Viollet of a train accident at La Gare Montparnasse station in Paris on 22 October 1895, when engine 120-721 failed to stop at the platform, went through a first-floor window and crashed down onto the street.
The goal of computer vision is to write computer programs that can interpret images.

Can Computers Match (or Beat) Human Vision?
Yes and no (but mostly no!)
• Humans are much better at “hard” and versatile things.
• Computers can be better at “easy,” specific, and repetitive things.

Human Perception Has Its Shortcomings…
Although this image appears to be a fairly run-of-the-mill picture of Bill Clinton and Al Gore, a closer inspection reveals that both men have been digitally given identical inner face features and their mutual configuration. Only the external features are different. It appears, therefore, that the human visual system makes strong use of the overall head shape in order to determine facial identity.
Sinha and Poggio, Nature, 1996

Copyright A. Kitaoka 2003

Rotating Snakes Circular snakes appear to rotate spontaneously.

Current State of the Art The next slides show some examples of what current vision systems can do.

Earth Viewers (3D Modeling)
Image from Microsoft’s Virtual Earth (see also: Google Earth)

Photosynth
http://photosynth.net/
Based on Photo Tourism technology developed here in CSE! by Noah Snavely, Steve Seitz, and Rick Szeliski
CSE: Computer Science and Engineering

Optical Character Recognition (OCR)
Technology to convert scanned documents to text
• If you have a scanner, it probably came with OCR software
Digit Recognition, AT&T Laboratories
http://www.research.att.com/~yann/
License plate readers
http://en.wikipedia.org/wiki/Automatic_number_plate_recognition
AT&T: American Telephone and Telegraph

Face Detection
Many new digital cameras now detect faces
• Canon, Sony, Fuji, …
• Smile shutter
• Subject metering for auto focus, auto exposure, and auto white balance

Smile Detection?
Sony Cyber-shot® T70 Digital Still Camera

Object Recognition (in Supermarkets)
LaneHawk by Evolution Robotics
“A smart camera is flush-mounted in the checkout lane, continuously watching for items. When an item is detected and recognized, the cashier verifies the quantity of items that were found under the basket, and continues to close the transaction. The item can remain under the basket, and with LaneHawk, you are assured to get paid for it…”

Face Recognition Who is she?

Vision-Based Biometrics “How the Afghan Girl was Identified by Her Iris Patterns” Read the story

Login without a Password…
Fingerprint scanners on many new laptops and other devices
Face recognition systems are now beginning to appear more widely
http://www.sensiblevision.com/

Object Recognition (in Mobile Phones)
This is becoming real:
• Microsoft Research
• Point & Find, Nokia

Special Effects: Shape Capture
The Matrix movies, ESC Entertainment, XYZRGB, NRC
NRC: National Research Council

Special Effects: Motion Capture
Pirates of the Caribbean, Industrial Light and Magic

Sports
Sportvision first down line
Nice explanation on www.howstuffworks.com
• The system has to know the orientation of the field with respect to the camera so that it can paint the first-down line with the correct perspective from that camera's point of view.
• The system has to know, in that same perspective framework, exactly where every yard line is.
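
The perspective requirement above can be sketched as follows. This is a minimal illustration, assuming the field is a flat plane whose view from one camera is described by a 3x3 homography H. The matrix values, the function names, and the 53.3-yard field width are assumptions made for this example, not details of Sportvision's system.

```python
def project_point(H, x, y):
    """Map a field-plane point (x, y) to image pixels using a 3x3 homography H."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to get pixel coordinates.
    return (u / w, v / w)


def yard_line_in_image(H, yard, field_width=53.3):
    """Return the two image endpoints of the line drawn across the field at `yard`.

    `field_width` is an assumed field width in yards; x runs along the
    field and y runs across it.
    """
    near = project_point(H, yard, 0.0)
    far = project_point(H, yard, field_width)
    return near, far


# Hypothetical homography for one camera pose (illustrative values only).
H_EXAMPLE = [
    [10.0, 2.0, 100.0],
    [0.0, 4.0, 300.0],
    [0.0, 0.002, 1.0],
]

if __name__ == "__main__":
    near, far = yard_line_in_image(H_EXAMPLE, 40.0)
    print("first-down line endpoints in pixels:", near, far)
```

Because every yard line is mapped through the same H, all lines share one perspective framework; a real system would re-estimate H whenever the camera pans, tilts, or zooms.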

Smart Cars
Slide content courtesy of Amnon Shashua
Mobileye
• Vision systems currently in high-end BMW, GM, Volvo models
• By 2010: 70% of car manufacturers.
Video demo
BMW: Bavarian Motor Works
GM: General Motors

Vision-Based Interaction (and Games)
Digimask: put your face on a 3D avatar.
Nintendo Wii has camera-based IR tracking built in. See Lee’s work at CMU on clever tricks for using it to create a multi-touch display!
“Game turns moviegoers into Human Joysticks”, CNET. Camera tracking a crowd, based on this work.
IR: Infra-Red
CMU: Carnegie Mellon University

Vision in Space
NASA's Mars Exploration Rover Spirit captured this westward view from atop a low plateau where Spirit spent the closing months of 2007.
Vision systems (JPL) used for several tasks
• Panorama stitching
• 3D terrain modeling
• Obstacle detection, position tracking
For more, read “Computer Vision on Mars” by Matthies et al.
JPL: Jet Propulsion Laboratory

Robotics
NASA’s Mars Spirit Rover http://en.wikipedia.org/wiki/Spirit_rover
http://www.robocup.org/
NASA: National Aeronautics and Space Administration

Medical Imaging
3D imaging: MRI, CT
Image guided surgery: Grimson et al., MIT
MRI: Magnetic Resonance Imaging
CT: Computed Tomography
MIT: Massachusetts Institute of Technology

Current State of the Art
You just saw examples of current systems.
• Many of these are less than 5 years old.
This is a very active research area, and rapidly changing.
• Many new applications in the next 5 years
To learn more about vision applications and companies:
• David Lowe maintains an excellent overview of vision companies
  – http://www.cs.ubc.ca/spider/lowe/vision.html

Consumer-Level Applications
• Stitching: turning overlapping photos into a single seamlessly stitched panorama (Figure 1.5a), as described in Chapter 9;
• Exposure bracketing: merging multiple exposures taken under challenging lighting conditions (strong sunlight and shadows) into a single perfectly exposed image (Figure 1.5b), as described in Section 10.2;
• Morphing: turning a picture of one of your friends into another, using a seamless morph transition (Figure 1.5c);

• 3D modeling: converting one or more snapshots into a 3D model of the object or person you are photographing (Figure 1.5d), as described in Section 12.6;
• Video match move and stabilization: inserting 2D pictures or 3D models into your videos by automatically tracking nearby reference points (see Section 7.4.2) or using motion estimates to remove shake from your videos (see Section 8.2.1);

• Photo-based walkthroughs: navigating a large collection of photographs, such as the interior of your house, by flying between different photos in 3D (see Sections 13.1.2 and 13.5.5);
• Face detection: for improved camera focusing as well as more relevant image searching (see Section 14.1.1);
• Visual authentication: automatically logging family members onto your home computer as they sit down in front of the webcam (see Section 14.2).

1.2 A Brief History
• Computational theory: What is the goal of the computation (task) and what are the constraints that are known or can be brought to bear on the problem?
• Representations and algorithms: How are the input, output, and intermediate information represented and which algorithms are used to calculate the desired result?

1.2 A Brief History
• Hardware implementation: How are the representations and algorithms mapped onto actual hardware, e.g., a biological vision system or a specialized piece of silicon?
• Conversely, how can hardware constraints be used to guide the choice of representation and algorithm?
• With the increasing use of graphics chips (GPUs) and many-core architectures for computer vision (see Section C.2), this question is again becoming quite relevant.

1.3 Book Overview

This Course
http://www.csie.ntu.edu.tw/~fuh/vcourse/szeliski
http://www.cs.washington.edu/education/courses/cse576/08sp/

Optional Project 1: Features

Optional Project 2: Panorama Stitching
http://www.cs.washington.edu/education/courses/cse576/05sp/projects/proj2/artifacts/winners.html
Indri Atmosukarto, 576 08sp

Optional Project 3: Face Recognition

Final Project Open-ended project of your choosing

General Comments
Prerequisites: these are essential!
• Data structures
• A good working knowledge of C and C++ programming
  – (or willingness/time to pick it up quickly!)
• Linear algebra
• Vector calculus
Course does not assume prior imaging experience
• computer vision, image processing, graphics, etc.

Project due Mar. 5
• use correlation to do image matching
• find to minimize
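
The slide says only that correlation is used for image matching; the rest of the assignment text did not survive. As a hedged illustration, the sketch below uses normalized cross-correlation (a common variant, chosen here as an assumption) to find where a small template best matches inside a larger image. Images are plain nested lists of gray values, and the function names are hypothetical.

```python
import math


def ncc(patch, template):
    """Normalized cross-correlation of two equal-length lists of gray values."""
    n = len(template)
    mean_p = sum(patch) / n
    mean_t = sum(template) / n
    num = sum((p - mean_p) * (t - mean_t) for p, t in zip(patch, template))
    norm_p = math.sqrt(sum((p - mean_p) ** 2 for p in patch))
    norm_t = math.sqrt(sum((t - mean_t) ** 2 for t in template))
    if norm_p == 0.0 or norm_t == 0.0:
        return 0.0  # a flat patch carries no pattern to correlate with
    return num / (norm_p * norm_t)


def best_match(image, template):
    """Slide the template over the image; return ((row, col), score) of the best NCC."""
    t_rows, t_cols = len(template), len(template[0])
    i_rows, i_cols = len(image), len(image[0])
    flat_template = [v for row in template for v in row]
    best_score, best_pos = -2.0, (0, 0)  # NCC is always >= -1
    for r in range(i_rows - t_rows + 1):
        for c in range(i_cols - t_cols + 1):
            patch = [image[r + i][c + j]
                     for i in range(t_rows) for j in range(t_cols)]
            score = ncc(patch, flat_template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score


if __name__ == "__main__":
    image = [
        [0, 0, 0, 0],
        [0, 0, 1, 2],
        [0, 0, 3, 4],
        [0, 0, 0, 0],
    ]
    template = [[1, 2], [3, 4]]
    print(best_match(image, template))
```

Subtracting the means and dividing by the norms makes the score insensitive to uniform brightness and contrast changes, which is why this variant is often preferred over raw correlation. The exhaustive search is slow on large images.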

DC & CV Lab. CSIE NTU
