Computer Vision: Segmentation
Marc Pollefeys, COMP 256
Some slides and illustrations from D. Forsyth, T. Darrell, ...

Sequential structure from motion (should have been last week ;-)
• Initialize motion from two images
• Initialize structure
• For each additional view
  – Determine pose of camera
  – Refine and extend structure
• Refine structure and motion

Initial projective camera motion
• Choose P and P´ compatible with F (reference plane; arbitrary)
• Reconstruction up to projective ambiguity
• Same for more views? Different projective basis
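
The slide's formula for P and P´ did not survive extraction. A minimal sketch of one standard choice, assuming the textbook canonical form P = [I | 0], P´ = [[e´]_x F + e´ aᵀ | σ e´], where e´ is the epipole in the second image. The names `skew`, `cameras_from_fundamental`, `a` and `sigma` are illustrative, and `a` and `sigma` are assumed to correspond to the "reference plane" and "arbitrary" annotations.

```python
import numpy as np

def skew(v):
    """Return the 3x3 cross-product matrix [v]_x, so skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def cameras_from_fundamental(F, a=None, sigma=1.0):
    """Return a camera pair (P, P') compatible with a rank-2 fundamental matrix F.

    a     : 3-vector selecting the reference plane (assumed default: zeros)
    sigma : arbitrary non-zero scale on the last column (assumed default: 1)
    """
    a = np.zeros(3) if a is None else np.asarray(a, dtype=float)
    # The epipole e' is the left null vector of F (e'^T F = 0):
    # the last left singular vector of a rank-2 matrix.
    U, _, _ = np.linalg.svd(F)
    e2 = U[:, -1]
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    M = skew(e2) @ F + np.outer(e2, a)
    P2 = np.hstack([M, sigma * e2.reshape(3, 1)])
    return P1, P2
```

Any point projected by the returned pair satisfies x´ᵀ F x = 0. Different choices of `a` and `sigma` give reconstructions that differ by a projective transformation.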

Initializing projective structure
• Reconstruct matches in projective frame by minimizing the reprojection error
• Non-iterative optimal solution

Projective pose estimation
• Infer 2D-3D matches from 2D-2D matches
• Compute pose from 2D-3D matches (RANSAC, 6 pts)
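
A minimal sketch of the pose computation, assuming the standard linear (DLT) resection from at least six 2D-3D matches. The slide does not show which solver is used inside RANSAC, so this is an illustration. The RANSAC loop and the coordinate normalization that is advisable on real data are omitted, and the function name `resect_camera_dlt` is hypothetical.

```python
import numpy as np

def resect_camera_dlt(X, x):
    """Estimate a 3x4 projection matrix P (up to scale) from n >= 6 matches x_i ~ P X_i.

    X : (n, 4) homogeneous 3D points
    x : (n, 3) homogeneous image points
    """
    rows = []
    zero = np.zeros(4)
    for Xi, xi in zip(X, x):
        u, v, w = xi
        # Two independent rows of the constraint x_i cross (P X_i) = 0,
        # written as a linear system in the 12 entries of P (row-major).
        rows.append(np.hstack([zero, -w * Xi, v * Xi]))
        rows.append(np.hstack([w * Xi, zero, -u * Xi]))
    A = np.array(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)
```

Inside RANSAC, this solver would be run on random samples of six matches, and each candidate P scored by counting matches with small reprojection error.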

Refining and extending structure
• Refining structure (iterative linear)
• Extending structure: 2-view triangulation
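
A minimal sketch of 2-view triangulation, assuming the plain linear (homogeneous least-squares) method. It minimizes an algebraic error, not the reprojection error, so it is best read as an initial estimate that the refinement step would improve. The function name `triangulate_linear` is illustrative.

```python
import numpy as np

def triangulate_linear(P1, P2, x1, x2):
    """Triangulate one homogeneous 3D point from two views.

    P1, P2 : 3x4 projection matrices
    x1, x2 : homogeneous image points (3-vectors) of the same 3D point
    """
    # Each view contributes two linear equations in X, derived from x ~ P X.
    A = np.array([
        x1[0] * P1[2] - x1[2] * P1[0],
        x1[1] * P1[2] - x1[2] * P1[1],
        x2[0] * P2[2] - x2[2] * P2[0],
        x2[1] * P2[2] - x2[2] * P2[1],
    ])
    # The null vector of A (smallest singular value) is the point estimate.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1]
```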

Refining structure and motion
• Use bundle adjustment
• Also model radial distortion to avoid bias!

Application: video augmentation

Tentative class schedule
Jan 16/18: Introduction
Jan 23/25: Cameras; Radiometry
Jan 30/Feb 1: Sources & Shadows; Color
Feb 6/8: Linear filters & edges; Texture
Feb 13/15: Multi-View Geometry; Stereo
Feb 20/22: Optical flow; Project proposals
Feb 27/Mar 1: Affine SfM; Projective SfM
Mar 6/8: Camera Calibration; Segmentation
Mar 13/15: Spring break
Mar 20/22: Fitting; Prob. Segmentation
Mar 27/29: Silhouettes and Photoconsistency; Linear tracking
Apr 3/5: Project Update; Non-linear Tracking
Apr 10/12: Object Recognition
Apr 17/19: Range data
Apr 24/26: Final project

Segmentation and Grouping
• Motivation: not all information is evidence
• Obtain a compact representation from an image/motion sequence/set of tokens
• Should support application
• Broad theory is absent at present
• Grouping (or clustering)
  – collect together tokens that “belong together”
• Fitting
  – associate a model with tokens
  – issues
    • which model?
    • which token goes to which element?
    • how many elements in the model?

General ideas
• Tokens
  – whatever we need to group (pixels, points, surface elements, etc.)
• Top-down segmentation
  – tokens belong together because they lie on the same object
• Bottom-up segmentation
  – tokens belong together because they are locally coherent
• These two are not mutually exclusive

Why do these tokens belong together?

Basic ideas of grouping in humans
• Figure-ground discrimination
  – grouping can be seen in terms of allocating some elements to a figure, some to ground
  – impoverished theory
• Gestalt properties
  – elements in a collection of elements can have properties that result from relationships (Müller-Lyer effect)
    • Gestaltqualität
  – a series of factors affect whether elements should be grouped together
    • Gestalt factors

Technique: Shot Boundary Detection
• Find the shots in a sequence of video
  – shot boundaries usually result in big differences between succeeding frames
• Strategy:
  – compute interframe distances
  – declare a boundary where these are big
• Possible distances
  – frame differences
  – histogram differences
  – block comparisons
  – edge differences
• Applications:
  – representation for movies, or video sequences
    • find shot boundaries
    • obtain “most representative” frame
  – supports search
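
The strategy above can be sketched with one of the listed distances, histogram differences. This is a minimal illustration assuming grayscale frames with values in 0..255. The bin count, the threshold and the function names are arbitrary choices and do not come from the slides.

```python
import numpy as np

def histogram_distance(frame_a, frame_b, bins=16):
    """L1 distance between intensity histograms, normalized by the pixel count."""
    ha, _ = np.histogram(frame_a, bins=bins, range=(0, 256))
    hb, _ = np.histogram(frame_b, bins=bins, range=(0, 256))
    return np.abs(ha - hb).sum() / frame_a.size

def shot_boundaries(frames, threshold=0.5):
    """Return every index i where a boundary is declared between frames i-1 and i."""
    return [i for i in range(1, len(frames))
            if histogram_distance(frames[i - 1], frames[i]) > threshold]
```

The distance lies between 0 (identical histograms) and 2 (disjoint histograms), so the threshold has to be tuned per sequence.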

Technique: Background Subtraction
• If we know what the background looks like, it is easy to identify “interesting bits”
• Applications
  – Person in an office
  – Tracking cars on a road
  – Surveillance
• Approach:
  – use a moving average to estimate background image
  – subtract from current frame
  – large absolute values are interesting pixels
    • trick: use morphological operations to clean up pixels
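
A minimal sketch of the approach above, assuming an exponential moving average as the "moving average" and grayscale frames stored as NumPy arrays. The update rate `alpha`, the `threshold` and the function name are illustrative values, and the morphological clean-up step is left out.

```python
import numpy as np

def subtract_background(frames, alpha=0.05, threshold=30.0):
    """Return one foreground mask per frame after the first, plus the final background."""
    background = frames[0].astype(float)   # initialize the estimate from the first frame
    masks = []
    for frame in frames[1:]:
        frame = frame.astype(float)
        # Pixels that differ strongly from the background estimate are "interesting".
        masks.append(np.abs(frame - background) > threshold)
        # Blend the current frame into the running background estimate.
        background = (1.0 - alpha) * background + alpha * frame
    return masks, background
```

A small `alpha` makes the background adapt slowly, so a briefly stationary object is not absorbed into it right away.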

Segmentation as clustering
• Cluster together (pixels, tokens, etc.) that belong together
• Agglomerative clustering
  – attach closest to cluster it is closest to
  – repeat
• Divisive clustering
  – split cluster along best boundary
  – repeat
• Point-Cluster distance
  – single-link clustering
  – complete-link clustering
  – group-average clustering
• Dendrograms
  – yield a picture of output as clustering process continues
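
A minimal sketch of agglomerative clustering with a pluggable inter-cluster distance. It is a naive implementation meant only to show the merge loop, and the function name and `linkage` parameter are illustrative. Passing `np.min` gives single-link, `np.max` complete-link and `np.mean` group-average clustering.

```python
import numpy as np

def agglomerative_clustering(points, n_clusters, linkage=np.min):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    Returns a list of clusters, each a list of point indices.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    clusters = [[i] for i in range(len(points))]   # start with one cluster per point
    while len(clusters) > n_clusters:
        best = (np.inf, 0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Inter-cluster distance: linkage over all cross-cluster point pairs.
                d = linkage(dist[np.ix_(clusters[a], clusters[b])])
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

Recording the merge distance at every step would give the data needed to draw a dendrogram.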

Simple clustering algorithms

K-Means
• Choose a fixed number of clusters
• Choose cluster centers and point-cluster allocations to minimize error
• Can’t do this by search, because there are too many possible allocations.
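
Since exhaustive search is infeasible, the usual workaround is the alternating scheme known as Lloyd's algorithm: fix the centers and allocate each point to the nearest one, then fix the allocation and recompute the centers. The extracted slide text does not spell the algorithm out, so the sketch below is an assumption about what is meant. The initialization (random data points) and the function signature are illustrative.

```python
import numpy as np

def kmeans(points, k, n_iter=50, seed=0):
    """Return (centers, labels) from alternating allocation and center updates."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # Assumed initialization: k distinct data points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Allocation step: each point goes to its nearest center.
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        # Update step: each center moves to the mean of its points
        # (a center with no points is left where it is).
        new_centers = centers.copy()
        for j in range(k):
            if np.any(labels == j):
                new_centers[j] = points[labels == j].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```

For image segmentation, each row of `points` would be a per-pixel feature vector, for example intensity, color, or color plus position. The result is only a local minimum of the error and depends on the initialization.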

Figure: K-means clustering using intensity alone and color alone (panels: image, clusters on intensity, clusters on color)

Figure: K-means using color alone, 11 segments (panels: image, clusters on color)

Figure: K-means using color alone, 11 segments.

Figure: K-means using colour and position, 20 segments

Graph theoretic clustering
• Represent tokens using a weighted graph
  – affinity matrix
• Cut up this graph to get subgraphs with strong interior links

Measuring affinity
• Intensity
• Distance
• Texture

Scale affects affinity
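
The affinity formulas on these slides were lost in extraction. A minimal sketch, assuming the commonly used Gaussian form exp(-||f_i - f_j||² / (2σ²)), where f is whatever feature is being compared (intensity, position, or a texture descriptor) and σ is the scale. The function name is illustrative.

```python
import numpy as np

def gaussian_affinity(features, sigma):
    """Return the n x n affinity matrix for n feature vectors at scale sigma."""
    features = np.asarray(features, dtype=float)
    if features.ndim == 1:
        features = features[:, None]       # treat scalars (e.g. intensity) as 1-D features
    # Squared Euclidean distance between every pair of feature vectors.
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))
```

With a small `sigma` only very similar tokens get a noticeable affinity. With a large `sigma` almost everything looks related, which is the scale effect the slide title refers to.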

Eigenvectors and cuts
• Simplest idea: we want a vector a giving the association between each element and a cluster
• We want elements within this cluster to, on the whole, have strong affinity with one another
• We could maximize $a^T A a$
• But need the constraint $a^T a = 1$
• This is an eigenvalue problem: choose the eigenvector of A with largest eigenvalue
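
A minimal sketch of this idea, assuming a symmetric affinity matrix A. Maximizing aᵀAa subject to aᵀa = 1 is a Rayleigh quotient problem, so the maximizer is the unit eigenvector with the largest eigenvalue. The function name and the sign convention are illustrative.

```python
import numpy as np

def leading_eigenvector(A):
    """Return (a, eigenvalue): the unit eigenvector of symmetric A with the largest eigenvalue."""
    eigenvalues, eigenvectors = np.linalg.eigh(A)   # eigenvalues in ascending order
    a = eigenvectors[:, -1]
    # The sign of an eigenvector is arbitrary: make the entries sum to a non-negative value.
    if a.sum() < 0:
        a = -a
    return a, eigenvalues[-1]
```

Elements with large entries in `a` are strongly associated with the dominant cluster, and entries near zero belong elsewhere.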

Example eigenvector
Figure panels: points, eigenvector, matrix

More than two segments
• Two options
  – Recursively split each side to get a tree, continuing till the eigenvalues are too small
  – Use the other eigenvectors

Normalized cuts
• Current criterion evaluates within cluster similarity, but not across cluster difference
• Instead, we’d like to maximize the within cluster similarity compared to the across cluster difference
• Write graph as V, one cluster as A and the other as B
• Maximize $\frac{\mathrm{assoc}(A, A)}{\mathrm{assoc}(A, V)} + \frac{\mathrm{assoc}(B, B)}{\mathrm{assoc}(B, V)}$, where $\mathrm{assoc}(X, Y)$ is the total affinity between elements of X and elements of Y
• i.e. construct A, B such that their within cluster similarity is high compared to their association with the rest of the graph

Normalized cuts
• Write a vector y whose elements are 1 if item is in A, -b if it’s in B
• Write the matrix of the graph as W, and the matrix which has the row sums of W on its diagonal as D; 1 is the vector with all ones
• Criterion becomes (as an equivalent minimization) $\min_y \frac{y^T (D - W) y}{y^T D y}$
• and we have a constraint $y^T D \mathbf{1} = 0$
• This is hard to do, because y’s values are quantized

Normalized cuts
• Instead, let y take real values and solve the generalized eigenvalue problem $\min_y y^T (D - W) y$ subject to $y^T D y = 1$
• which gives $(D - W) y = \lambda D y$
• Now look for a quantization threshold that gives the best value of the criterion, i.e. all components of y above that threshold go to 1, all below go to -b
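
A minimal sketch of one normalized-cut bipartition, following the formulation of Shi and Malik: solve (D - W) y = λ D y, take the eigenvector with the second smallest eigenvalue, then try each component of y as a threshold and keep the split with the lowest normalized cut. It assumes a symmetric W with positive row sums. The function names are illustrative, and the result is returned as a boolean mask, not as the values 1 and -b.

```python
import numpy as np

def ncut_value(W, mask):
    """Normalized cut cost of the bipartition (mask, ~mask): cut/assoc(A,V) + cut/assoc(B,V)."""
    cut = W[mask][:, ~mask].sum()
    return cut / W[mask].sum() + cut / W[~mask].sum()

def normalized_cut_bipartition(W):
    """Return (mask, y): the best thresholded split and the relaxed real-valued solution."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                       # diagonal of D (row sums of W)
    d_inv_sqrt = 1.0 / np.sqrt(d)           # assumes every row sum is positive
    # With z = D^{1/2} y, the generalized problem becomes the ordinary symmetric
    # problem (I - D^{-1/2} W D^{-1/2}) z = lambda z.
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, Z = np.linalg.eigh(L_sym)            # eigenvalues in ascending order
    y = d_inv_sqrt * Z[:, 1]                # eigenvector of the second smallest eigenvalue
    best_mask, best_cost = None, np.inf
    for threshold in np.sort(y)[:-1]:
        mask = y > threshold
        if not mask.any():                  # skip degenerate splits caused by ties
            continue
        cost = ncut_value(W, mask)
        if cost < best_cost:
            best_mask, best_cost = mask, cost
    return best_mask, y
```

The smallest eigenvalue is always 0, with y proportional to the all-ones vector, which is why the second one is used. The constraint yᵀD1 = 0 is exactly orthogonality to that trivial solution.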

Figure from “Image and video segmentation: the normalised cut framework”, by Shi and Malik, copyright IEEE, 1998

Figure from “Normalized cuts and image segmentation,” Shi and Malik, copyright IEEE, 2000

Image parsing: Unifying Segmentation, Detection and Recognition
Tu, Chen, Yuille & Zhu, ICCV’03

Image parsing
• Generative models for each class
  – some random samples for 5 and H
  – some random face samples (PCA model)

DDMCMC: Data Driven Markov Chain Monte Carlo

Next class: Fitting
Reading: Chapter 15