Computer Vision
Filename: eie 426-computer-vision-0809.ppt
2020/9/25, EIE 426-AICV

Contents
• Perception generally
• Image formation
• Color vision
• Edge detection
• Image segmentation
• Visual attention
• 2D → 3D
• Object recognition

Perception generally
Stimulus (percept) S, world W: $S = g(W)$
E.g., g = “graphics.” Can we do vision as inverse graphics? $W = g^{-1}(S)$
Problem: massive ambiguity! Missing depth information!

Better approaches
Bayesian inference of world configurations:
$P(W \mid S) = P(S \mid W)\,P(W) / P(S) = \alpha\,P(S \mid W)\,P(W)$
where $P(S \mid W)$ is the “graphics” term and $P(W)$ is the “prior knowledge” term.
Better still: no need to recover the exact scene! Just extract the information needed for
• navigation
• manipulation
• recognition/identification
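
As a rough illustration of the Bayes rule on this slide, the sketch below normalizes likelihood times prior over a small discrete set of world hypotheses. The hypothesis names and all probabilities are made up for the example; they are not from the slides.

```python
def posterior(prior, likelihood):
    """Return P(W | S) for each world hypothesis W via Bayes' rule.

    prior:      dict mapping world -> P(W)            ("prior knowledge")
    likelihood: dict mapping world -> P(S | W) for the observed S ("graphics")
    """
    # Unnormalized posterior: P(S | W) * P(W)
    unnormalized = {w: likelihood[w] * prior[w] for w in prior}
    # P(S) is the sum of the unnormalized terms; alpha = 1 / P(S)
    evidence = sum(unnormalized.values())
    if evidence == 0:
        raise ValueError("stimulus has zero probability under every hypothesis")
    alpha = 1.0 / evidence
    return {w: alpha * p for w, p in unnormalized.items()}


# Hypothetical case: a small nearby object and a large distant one
# produce the same image, so the likelihoods are equal.
prior = {"small_near": 0.7, "large_far": 0.3}
likelihood = {"small_near": 0.5, "large_far": 0.5}
result = posterior(prior, likelihood)
```

Because the two likelihoods are equal here, the image alone cannot separate the hypotheses and the posterior simply reproduces the prior, which is the sense in which prior knowledge resolves the ambiguity.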

Vision “subsystems”
Vision requires combining multiple cues

Image formation
P is a point in the scene, with coordinates (X; Y; Z)
P’ is its image on the image plane, with coordinates (x; y; z)
$x = -fX/Z$; $y = -fY/Z$ (by similar triangles)
Scale/distance is indeterminate!
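
The projection equations can be tried out with a few lines of Python. This is a minimal sketch; the function name `project` and the sample points are chosen for illustration only.

```python
def project(point, f):
    """Perspective projection of a scene point (X, Y, Z) onto the image plane.

    Implements x = -f*X/Z and y = -f*Y/Z, with f the focal length.
    """
    X, Y, Z = point
    if Z == 0:
        raise ValueError("a point with Z = 0 has no finite projection")
    return (-f * X / Z, -f * Y / Z)


# A point and a second point twice as far away and twice as large in X and Y.
near = project((1.0, 2.0, 4.0), f=1.0)
far = project((2.0, 4.0, 8.0), f=1.0)
```

Both points land on the same image coordinates, which is the scale/distance ambiguity the slide points out.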

Lens systems
f: the focal length of the lens

Images

Images (cont.)
I(x; y; t) is the intensity at (x; y) at time t
CCD camera: 4,000 pixels; human eyes: 240,000 pixels

Color vision
Intensity varies with frequency → infinite-dimensional signal
The human eye has three types of color-sensitive cells; each integrates the signal → 3-element vector intensity
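
One way to picture the reduction from a full spectrum to three numbers is to integrate the spectrum against three sensitivity curves. The sketch below uses Gaussian curves whose peak wavelengths and widths are assumptions made for illustration, not measured cone responses.

```python
import math


def gaussian(x, center, width):
    """Bell-shaped curve used here as a stand-in sensitivity function."""
    return math.exp(-((x - center) / width) ** 2)


# Illustrative curves only: (peak wavelength in nm, width in nm) are assumed.
CONES = {"short": (440.0, 30.0), "medium": (540.0, 40.0), "long": (570.0, 40.0)}


def cone_responses(spectrum, start=400, stop=700, step=5):
    """Integrate spectrum(wavelength) against each curve with a Riemann sum.

    Returns a 3-element tuple in the order short, medium, long.
    """
    wavelengths = range(start, stop + 1, step)
    return tuple(
        sum(spectrum(w) * gaussian(w, center, width) for w in wavelengths) * step
        for center, width in CONES.values()
    )


flat = cone_responses(lambda w: 1.0)                          # equal energy everywhere
reddish = cone_responses(lambda w: 1.0 if w >= 600 else 0.0)  # energy above 600 nm only
```

Whatever the shape of the input spectrum, the output is always a 3-element vector, so many different spectra map to the same response.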

Color vision (cont.)

Edge detection
Edges are straight lines or curves in the image plane across which there are “significant” changes in image brightness.
The goal of edge detection is to abstract away from the messy, multi-megabyte image and towards a more compact, abstract representation.

Edge detection (cont.)
Edges in the image arise from discontinuities in the scene:
1) depth discontinuities
2) surface orientation discontinuities
3) reflectance (surface markings) discontinuities
4) illumination discontinuities (shadows, etc.)

Edge detection (cont.)

Edge detection (cont.)
• Sobel operator
[Figure: operator masks, marking the location of the origin (the image pixel to be processed)]
Other operators: Roberts (2 x 2), Prewitt (3 x 3), Isotropic (3 x 3)
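
The masks themselves did not survive in this transcript, so the sketch below uses the commonly cited 3 x 3 Sobel kernels and combines the two responses into a gradient magnitude. Leaving border pixels at zero is a simplification made here, not something the slides specify.

```python
import math

# Commonly cited Sobel kernels; the centre entry is the pixel being processed.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to horizontal change
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to vertical change


def sobel_magnitude(image):
    """Return the gradient magnitude of a grayscale image (list of rows).

    Border pixels are left at 0.0 because the 3 x 3 window does not fit there.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            for i in range(3):
                for j in range(3):
                    pixel = image[r + i - 1][c + j - 1]
                    gx += SOBEL_X[i][j] * pixel
                    gy += SOBEL_Y[i][j] * pixel
            out[r][c] = math.hypot(gx, gy)
    return out


# Toy image: dark left half, bright right half (a vertical step edge).
image = [[0, 0, 10, 10] for _ in range(4)]
edges = sobel_magnitude(image)
flat = sobel_magnitude([[7] * 4 for _ in range(4)])
```

The interior pixels next to the step get a large response, while a uniform image gives zero everywhere.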

Edge detection (cont.)
A color picture of a steam engine.
The Sobel operator applied to that image.

Edge detection: application 1
An edge-extraction-based method to produce pen-and-ink-like drawings from photos

Edge detection: application 2
Leaf (vein pattern) characterization (EIE 557-CI&IA)

Image segmentation
• In computer vision, segmentation refers to the process of partitioning a digital image into multiple segments (sets of pixels). The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images. More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain visual characteristics.

Image segmentation (cont.)

Image segmentation: the quadtree-partition-based split-and-merge algorithm
(1) Split into four disjoint quadrants any region $R_i$ where $P(R_i)$ = FALSE;
(2) Merge any adjacent regions $R_i$ and $R_k$ for which $P(R_i \cup R_k)$ = TRUE; and
(3) Stop when no further merging or splitting is possible.
$P(R_i)$ = TRUE if all pixels in $R_i$ have the same intensity or are uniform in some measure.
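
The three steps can be sketched in Python as follows. This is one possible reading, under stated assumptions: the image is square with a power-of-two side, the uniformity predicate is "max minus min intensity is at most `tol`", and merging is a simple greedy loop. None of these choices come from the slides.

```python
def split(image, r, c, size, tol, leaves):
    """Step 1: recursively split a block into quadrants until it is uniform."""
    values = [image[i][j] for i in range(r, r + size) for j in range(c, c + size)]
    lo, hi = min(values), max(values)
    if size == 1 or hi - lo <= tol:          # predicate P(R) holds
        leaves.append((r, c, size, lo, hi))
        return
    half = size // 2
    for dr in (0, half):
        for dc in (0, half):
            split(image, r + dr, c + dc, half, tol, leaves)


def adjacent(a, b):
    """True if two square blocks share part of an edge (4-adjacency)."""
    ar, ac, asz = a[:3]
    br, bc, bsz = b[:3]
    rows_overlap = ar < br + bsz and br < ar + asz
    cols_overlap = ac < bc + bsz and bc < ac + asz
    stacked = (ar + asz == br or br + bsz == ar) and cols_overlap
    side_by_side = (ac + asz == bc or bc + bsz == ac) and rows_overlap
    return stacked or side_by_side


def split_and_merge(image, tol=0):
    """Return a list of regions; each region holds its blocks and intensity range."""
    leaves = []
    split(image, 0, 0, len(image), tol, leaves)
    regions = [{"blocks": [leaf], "lo": leaf[3], "hi": leaf[4]} for leaf in leaves]
    merged = True
    while merged:                            # Step 3: stop when nothing changes
        merged = False
        for i in range(len(regions)):
            for j in range(i + 1, len(regions)):
                a, b = regions[i], regions[j]
                lo, hi = min(a["lo"], b["lo"]), max(a["hi"], b["hi"])
                touching = any(adjacent(x, y) for x in a["blocks"] for y in b["blocks"])
                if hi - lo <= tol and touching:   # Step 2: P(Ri U Rk) holds
                    a["blocks"] += b["blocks"]
                    a["lo"], a["hi"] = lo, hi
                    del regions[j]
                    merged = True
                    break
            if merged:
                break
    return regions


image = [
    [1, 1, 5, 5],
    [1, 1, 5, 5],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
regions = split_and_merge(image)
sizes = sorted(sum(b[2] * b[2] for b in reg["blocks"]) for reg in regions)
```

On the toy image the split step yields four 2 x 2 quadrants, and the merge step joins the three quadrants of intensity 1 into a single L-shaped region, leaving two regions of 4 and 12 pixels.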

Image segmentation: the quadtree-partition-based split-and-merge algorithm (cont.)

Visual attention
• Attention is the cognitive process of selectively concentrating on one aspect of the environment while ignoring other things.
• The attention mechanism of the human visual system has been applied to machine vision systems so that they sample data nonuniformly and use their computational resources efficiently.

Visual attention (cont.)
• The visual attention mechanism may have at least the following basic components:
(1) the selection of a region of interest in the visual field;
(2) the selection of feature dimensions and values of interest;
(3) the control of information flow through the network of neurons that constitutes the visual system; and
(4) the shifting from one selected region to the next in time.

Attention-driven object extraction
The more attention an object/region attracts, the higher its priority

Attention-driven object extraction (cont.)
Objects 1, 2, …, background

Motion
• The rate of apparent motion can tell us something about distance. A nearer object has a larger apparent motion.
• Object tracking

Motion Estimation

Stereo
The nearest point of the pyramid is shifted to the left in the right image and to the right in the left image.
Disparity (x difference in two images) → depth

Disparity and depth

Disparity and depth (cont.)
Depth is inversely proportional to disparity.
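
For a pair of parallel (rectified) cameras, the usual textbook form of this relation is Z = f * B / d, with focal length f, camera separation (baseline) B, and disparity d. The slide states only the inverse proportionality, so treat the formula and the numbers below as an assumed, illustrative setup.

```python
def depth_from_disparity(disparity, focal_length, baseline):
    """Depth Z = f * B / d for a rectified stereo pair (assumed geometry).

    disparity and focal_length are in pixels; baseline and the result share
    the same length unit.
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length * baseline / disparity


# Hypothetical camera: focal length 100 pixels, baseline 0.5 m.
z_small_disparity = depth_from_disparity(25, focal_length=100, baseline=0.5)
z_large_disparity = depth_from_disparity(50, focal_length=100, baseline=0.5)
```

Doubling the disparity halves the computed depth, as the slide's statement implies.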

Example: Electronic eyes for the blind

Example: Electronic eyes for the blind (cont.)
[Figure: a farther object and a nearer object viewed by a left camera and a right camera, with the left and right captured images]
Pixel matching for calculating the disparities
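
The slides do not say how the pixels are matched. A common simple approach is to slide a small window along the same row of the other image and keep the shift with the smallest sum of absolute differences (SAD); the sketch below does this for a single row. The window size, search range, and toy data are assumptions.

```python
def match_disparity(left_row, right_row, x, half_window=1, max_disparity=4):
    """Find the disparity of pixel x in left_row by SAD matching in right_row.

    The same feature is assumed to appear at x - d in the right image,
    so the search runs over d = 0 .. max_disparity.
    """
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        if x - d - half_window < 0:          # window would leave the image
            break
        cost = sum(
            abs(left_row[x + k] - right_row[x - d + k])
            for k in range(-half_window, half_window + 1)
        )
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Toy scanlines: the pattern 9, 5, 9 is centred at x = 6 on the left
# and at x = 3 on the right.
left_row = [0, 0, 0, 0, 0, 9, 5, 9, 0, 0]
right_row = [0, 0, 9, 5, 9, 0, 0, 0, 0, 0]
d = match_disparity(left_row, right_row, x=6)
```

Real images need larger windows and some handling of textureless or occluded areas, where the minimum-cost shift is unreliable.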

Example: Electronic eyes for the blind (cont.)
Left: x=549, Right: x=476, ∆=73
Left: x=333, Right: x=273, ∆=60
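
The two disparities follow directly from the matched x coordinates, and because depth is inversely proportional to disparity, their ratio gives the relative depth of the two points. The labels "point A" and "point B" are introduced here only to tell the two matches apart.

```python
# Matched x coordinates (left image, right image) taken from the slide.
matches = {"point A": (549, 476), "point B": (333, 273)}

# Disparity = x in the left image minus x in the right image.
disparities = {name: xl - xr for name, (xl, xr) in matches.items()}

# Depth is inversely proportional to disparity, so Z_A / Z_B = d_B / d_A.
depth_ratio_a_to_b = disparities["point B"] / disparities["point A"]
```

The ratio is 60/73, roughly 0.82, so the point with disparity 73 is the nearer of the two.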

Texture: a spatially repeating pattern on a surface that can be sensed visually.
Examples: the pattern of windows on a building, the stitches on a sweater, the spots on a leopard’s skin, grass on a lawn, etc.

Edge and vertex types
• “+” and “-” labels represent convex and concave edges, respectively. These are associated with surface normal discontinuities wherein both surfaces that meet along the edge are visible.
• A “←” or a “→” represents an occluding convex edge. As one moves in the direction of the arrow, the (visible) surfaces are to the right.
• A “←←” or a “→→” represents a limb. Here, the surface curves smoothly around to occlude itself. As one moves in the direction of the twin arrow, the (visible) surfaces lie to the right.

Object recognition
Simple idea:
- extract 3-D shapes from image
- match against “shape library”
Problems:
- extracting curved surfaces from image
- representing shape of extracted object
- representing shape and variability of library object classes
- improper segmentation, occlusion
- unknown illumination, shadows, markings, noise, complexity, etc.
Approaches:
- index into library by measuring invariant properties of objects
- alignment of image feature with projected library object feature
- match image against multiple stored views (aspects) of library object
- machine learning methods based on image statistics

Biometric identification
Criminal investigations and access control for restricted facilities require the ability to identify unique individuals. (the blueish area)

Content-based image retrieval
• The application of computer vision to the image retrieval problem, that is, the problem of searching for digital images in large databases.
• “Content-based” means that the search will analyze the actual contents of the image. The term ‘content’ in this context might refer to colors, shapes, textures, or any other information that can be derived from the image itself. Without the ability to examine image content, searches must rely on metadata such as captions or keywords, which may be laborious or expensive to produce.

Content-based image retrieval (cont.)
• http://labs.systemone.at/retrievr/?sketchName=2009-03-26-01-22-37-828150.3#sketchName=2009-03-26-01-23-35-358087.4

Handwritten digit recognition
3-nearest-neighbor = 2.4% error
400-300-10 unit MLP (a neural network approach) = 1.6% error
LeNet: 768-192-30-10 unit MLP = 0.9% error
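
To make the first entry concrete, here is a minimal 3-nearest-neighbor classifier. It runs on a made-up two-dimensional toy set rather than digit images, and Euclidean distance is an assumed choice; the error rates on the slide come from real digit data and are not reproduced by this sketch.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Label a query point by majority vote among its k nearest training points.

    train is a list of (feature_tuple, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two clusters standing in for two digit classes.
train = [
    ((0.0, 0.0), "0"), ((0.1, 0.2), "0"), ((0.2, 0.1), "0"),
    ((1.0, 1.0), "1"), ((0.9, 1.1), "1"), ((1.1, 0.9), "1"),
]
label_a = knn_classify(train, (0.15, 0.15))
label_b = knn_classify(train, (0.95, 1.0))
```

For digit images the feature tuple would be the flattened pixel intensities, and the same voting rule applies unchanged.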

Summary
Vision is hard: noise, ambiguity, complexity
Prior knowledge is essential to constrain the problem
Need to combine multiple cues: motion, contour, shading, texture, stereo
“Library” object representation: shape vs. aspects
Image/object matching: features, lines, regions, etc.