Binocular Stereo
Topics • Principle • basic equation • epipolar line • features and strategies for matching • Case study • Block matching • Relaxation • DP stereo
Basic principles
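Binocular stereo
• A single image is ambiguous: every point on the line of sight through image point a' projects to the same image point
• Another image taken from a different direction (image point a'') gives the unique 3D point A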
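Epipolar line constraints
• Epipolar plane: the plane containing the baseline and the line of sight of one image point
• Epipolar line: the intersection of the epipolar plane with an image plane
• Corresponding points lie on the epipolar lines
(Figure labels: one image point, possible line of sight, baseline, epipolar plane, epipolar line.)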
Epipoles
• intersections of the baseline with the image planes
• projection of the optical center of one camera into the other image
• the vanishing points of the camera motion direction
(Figure labels: C1, e2, C2.)
Examples of epipolar lines
Rectification
Terminology
(Figure labels: a physical point, left image point, right image plane, left image center, right image center, focal length, baseline length, z, world coordinate system.)
Pinhole Camera
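Perspective Projection
• A point (X, Y, Z) is projected through the view point (optical center) onto the image plane at (u, v)
• f: focal length
(Figure labels: axes X, Y and -Z; image axes u, v; image plane.)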
Basic binocular stereo equation
$z = -\dfrac{2df}{x'' - x'}$
• x'' - x': disparity
• 2d: baseline length
• f: focal length
(Figure labels: d + x, d - x, -z, x', x'', d, f.)
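A minimal sketch of the equation above, assuming the sign convention of the slide (the figure labels the distance to the point as -z, so z comes out negative for a positive disparity). The function and parameter names are illustrative and do not come from the slides.

```python
def depth_from_disparity(x_left, x_right, half_baseline, focal_length):
    """Evaluate z = -2 d f / (x'' - x') from the slide.

    x_left is x', x_right is x'', half_baseline is d (the baseline is 2d)
    and focal_length is f. All lengths must use consistent units.
    """
    disparity = x_right - x_left  # x'' - x'
    if disparity == 0:
        raise ValueError("zero disparity: the point is at infinity")
    return -2.0 * half_baseline * focal_length / disparity


# Example: d = 0.05, f = 500, x' = 2, x'' = 12 gives a disparity of 10.
z = depth_from_disparity(2.0, 12.0, 0.05, 500.0)
```

Halving the disparity doubles the magnitude of z, which is why distant points are measured less precisely than near ones.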
Features for matching
a. brightness
b. edges
c. edge intervals
d. interest points
(Figure: a row of sample brightness values 10 11 12 11 15 16.)
Strategies for matching
a. relaxation (Figure: a grid of values 10 with one outlier 5.)
b. coarse to fine
c. dynamic programming (Figure labels: global optimum, local optimum.)
Classification of stereo methods
• Features for matching
  – brightness value
  – point
  – edge
  – region
• Strategies for matching
  – brute-force
  – coarse-to-fine
  – relaxation
  – dynamic programming
• Constraints for matching
  – epipolar lines
  – disparity limit
  – continuity
  – uniqueness
Case study
Block-Matching Stereo
1. Method (Figure labels: b, c.)
2. Problems
   a. trade-off between window size and resolution
   b. dull peak (Figure labels: b, c.)
Cost Function
(a) SAD (sum of absolute differences)
(b) SSD (sum of squared differences)
(c) Correlation
(Figure: near object and background in the left and right images, with the disparity d between them.)
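A minimal sketch of block matching along one rectified scanline with the SAD and SSD costs listed above. It assumes a 1D window, integer disparities, and the convention that left pixel x matches right pixel x - d. All names are illustrative.

```python
def sad(a, b):
    """Sum of absolute differences between two windows."""
    return sum(abs(x - y) for x, y in zip(a, b))


def ssd(a, b):
    """Sum of squared differences between two windows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_scanline(left, right, half_window, max_disparity, cost=sad):
    """Return, per left pixel, the disparity d with the lowest window cost.

    Assumed convention: left pixel x corresponds to right pixel x - d.
    Pixels too close to the border to hold a full window keep disparity 0.
    """
    n = len(left)
    disparities = [0] * n
    for x in range(half_window, n - half_window):
        window_l = left[x - half_window:x + half_window + 1]
        best_d, best_cost = 0, float("inf")
        for d in range(max_disparity + 1):
            xr = x - d
            if xr - half_window < 0:
                break  # the right window would leave the image
            window_r = right[xr - half_window:xr + half_window + 1]
            c = cost(window_l, window_r)
            if c < best_cost:
                best_d, best_cost = d, c
        disparities[x] = best_d
    return disparities


# The left scanline is the right scanline shifted by two pixels.
right = [1, 5, 9, 3, 7, 2, 8, 4, 6, 0]
left = [0, 0] + right[:-2]
disparities = match_scanline(left, right, half_window=1, max_disparity=3)
```

A larger half_window makes the cost minimum more reliable in low-texture areas but blurs depth boundaries, which is the trade-off named on the block-matching slide.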
Moravec Stereo ('79)
Purpose: navigation
Moravec, "Visual mapping by a robot rover", Proc. 6th IJCAI, pp. 598-600 (1979)
Moravec's cart: slider stereo and motion stereo
Slider stereo (9-eye stereo)
• 9 camera positions give 9C2 = 36 stereo pairs
• Each stereo pair has an uncertainty measure: uncertainty = 1 / baseline length, so a short baseline means a large uncertainty
• Each stereo pair has a confidence measure
Coarse to fine matching expand matching
(Figure: for one interest point, 9C2 = 36 curves plotted over the estimated distance. The area of each curve is its confidence measure and its width σ is its uncertainty measure.)
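A hedged sketch of how 36 per-pair curves could be merged into one distance estimate. It assumes each curve is a Gaussian centered on the pair's estimated distance, with standard deviation σ and area equal to the confidence, and that the peak of the summed curves is taken as the answer. The slides do not state the curve shape or the combination rule, so both are assumptions, as are all names.

```python
import math
from itertools import combinations


def count_stereo_pairs(num_cameras):
    """Number of distinct camera pairs, e.g. 9 positions give 36 pairs."""
    return len(list(combinations(range(num_cameras), 2)))


def summed_curves(z, estimates):
    """Sum of Gaussians; each estimate is (distance, sigma, confidence)."""
    total = 0.0
    for distance, sigma, confidence in estimates:
        height = confidence / (sigma * math.sqrt(2.0 * math.pi))
        total += height * math.exp(-0.5 * ((z - distance) / sigma) ** 2)
    return total


def peak_distance(estimates, candidates):
    """Candidate distance at which the summed curves are highest."""
    return max(candidates, key=lambda z: summed_curves(z, estimates))


# Three illustrative estimates: two agree near 10, one outlier near 15.
estimates = [(10.0, 0.5, 1.0), (10.2, 1.0, 0.8), (15.0, 2.0, 0.3)]
candidates = [i * 0.1 for i in range(50, 201)]
best = peak_distance(estimates, candidates)
```

A narrow, confident curve dominates the sum, so an uncertain outlier shifts the peak only slightly.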
Moravec Stereo ('81)
1. Features for matching: a. brightness value b. point c. edge d. region (slide annotation: interest point)
2. Strategies for matching: a. brute-force (not a strategy?) b. coarse-to-fine c. relaxation d. dynamic programming
3. Constraints for matching: a. epipolar lines b. disparity limit c. continuity d. uniqueness
Purpose: navigation (Stanford)
Recent Progress
• How to estimate the disparities? Minimize some cost function along the epipolar line
(Figure: near object and background in the left and right images, with the disparity between them.)
H. Hirschmüller, "Improvements in Real-Time Correlation-Based Stereo Vision", IEEE Workshop on Stereo and Multi-Baseline Vision, 2001
Fattening Effect on Object Boundary
• No single window fits at the discontinuity ⇒ fattening effect of the object
(Figure: foreground and background correspondences between the left and right images around a near object.)
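Accurate Estimation on Object Boundary
• Shiftable window: a central window with cost c0 and four surrounding windows with costs c1, c2, c3, c4
• The costs c0, c' = min{c1, c2, c3, c4} and min({c1, c2, c3, c4} - c') are used, i.e. the central window plus the two best surrounding windows
(Figure: window placement near the boundary between a near object and the background in the left and right images, and the resulting cost over disparity d.)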
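A small sketch of the window selection, assuming the three selected costs (c0 and the two smallest of c1 to c4) are simply added. The slide only says these costs "are used", so the summation is an assumption, and the function name is illustrative.

```python
def shiftable_window_cost(c0, surrounding_costs):
    """Combine the central cost with the two best surrounding costs.

    surrounding_costs holds c1..c4. Taking the two smallest values is the
    same as taking min{c1..c4} and then the minimum of the remaining three.
    """
    if len(surrounding_costs) != 4:
        raise ValueError("expected the four surrounding window costs c1..c4")
    first, second = sorted(surrounding_costs)[:2]
    return c0 + first + second


# Near a depth edge, windows on the wrong side of the edge have high cost
# (9 and 7 here) and are left out of the combined cost.
combined = shiftable_window_cost(3, [9, 1, 7, 2])
```

Dropping the two worst surrounding windows keeps the support region on one side of the discontinuity, which is what reduces the effect at object boundaries.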
Consistency Checking
• Check whether two independent disparity estimates coincide
  – Left ⇒ Right search (1st search along the epipolar line)
  – Right ⇒ Left search (2nd search along the epipolar line)
• Inconsistent disparities are considered false matches
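A minimal sketch of the left-right check on one scanline. It assumes the left disparity map sends left pixel x to right pixel x - d, and the right disparity map sends right pixel x back to left pixel x + d. Rejected pixels are marked with None. Names and the tolerance parameter are illustrative.

```python
def left_right_check(disp_left, disp_right, tolerance=0):
    """Keep a left disparity only if the right-to-left search agrees.

    Returns a list in which inconsistent pixels are replaced by None.
    """
    n = len(disp_right)
    checked = []
    for x, d in enumerate(disp_left):
        xr = x - d  # matching pixel in the right image
        if 0 <= xr < n and abs(disp_right[xr] - d) <= tolerance:
            checked.append(d)
        else:
            checked.append(None)  # treated as a false match
    return checked


disp_left = [0, 0, 2, 2, 2, 1]
disp_right = [2, 2, 2, 0, 0, 0]
result = left_right_check(disp_left, disp_right)
```

Pixels rejected by this test are often occluded in one view, so the check also acts as a rough occlusion detector.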
Result
• SAD with 11 x 11 window
• Shiftable window + consistency checking
(Figure: left image and disparity maps.)
Cooperative stereo: Marr-Poggio Stereo ('76)
Purpose: simulating the human visual system (random-dot stereogram)
Marr, Poggio, "Cooperative computation of stereo disparity", Science 194, pp. 283-287 (1976)
Input: random-dot stereo pair
• Left image: random dots
• Right image: the same random dots with the central patch shifted
• When the pair is fused, we can see the height difference between the central and peripheral areas
Constraints
– Epipolar line constraint
– Uniqueness constraint
  » each point in an image has only one depth value
– Continuity constraint
  » each point is almost sure to have a depth value near the values of its neighbors
(Figure: an allowed ("OK") and a forbidden ("No") configuration for each constraint.)
(Figure: left-image points D, E, F matched against right-image points A, B, C.)
• The uniqueness constraint prohibits two or more matching points on one horizontal or vertical line: (E-A), (E-B), (E-C) inhibit each other
• The continuity constraint attracts more matches on a diagonal line: (D-A), (E-B), (F-C) support each other because they have the same depth
Relaxation
(Figure: a neighborhood of values 10 with one outlier 5 at iteration n, updated at iteration n+1.)
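One simple reading of the relaxation figure is that each value is repeatedly pulled toward its neighbors, so an isolated outlier is corrected over the iterations. The sketch below assumes plain neighborhood averaging on a 1D row. It is not the Marr-Poggio update rule, which uses excitatory and inhibitory neighborhoods with a threshold. The function name is illustrative.

```python
def relax(values, iterations=1):
    """Replace every value by the mean of itself and its direct neighbors.

    Each iteration computes step n+1 from step n. Border elements use
    only the neighbors that exist.
    """
    current = list(values)
    for _ in range(iterations):
        updated = []
        for i in range(len(current)):
            neighborhood = current[max(0, i - 1):i + 2]
            updated.append(sum(neighborhood) / len(neighborhood))
        current = updated
    return current


# The outlier 5 moves toward the surrounding value 10 after one step.
step_n = [10, 10, 5, 10, 10, 10, 10, 10]
step_n_plus_1 = relax(step_n)
```

Averaging also spreads the outlier into its neighbors, which is why practical relaxation schemes combine neighbor support with the uniqueness constraint.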
Marr-Poggio Stereo ('76)
1. Features for matching: a. brightness value b. point c. edge d. region
2. Strategies for matching: a. brute-force b. coarse-to-fine c. relaxation d. dynamic programming
3. Constraints for matching: a. epipolar lines b. disparity limit c. continuity d. uniqueness
Purpose: simulate the human visual system (MIT)
Recent progress: Graph-cut
• Solve the graph partition problem in a globally optimal way
1. Formulate the problem in an energy minimization framework
2. Design a graph such that the sum of the cut edges equals the total energy
3. Find a "cut" that minimizes the energy
(Figure: a cut through the edge Vij between nodes i and j.)
Example of Graph-cut
• Image segmentation as graph partition
  – Graph = {N, e}; Ni: graph node; eij: edge connecting nodes Ni and Nj; C: cut
  – Image = {pixel}; Vij: similarity between neighboring pixels
  – The cut corresponds to the segmentation, i.e. the foreground/background boundary
Solution to a Graph-cut Problem
• Min-Cut/Max-Flow algorithm
1. Given source (s) and sink (t) nodes
2. Define a capacity on each edge
3. Find the maximum flow from s ⇒ t, satisfying the capacity constraints, and cut the bottleneck
• Min-Cut = Max-Flow
Yuri Boykov, Vladimir Kolmogorov, "An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision", PAMI, 2004
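The three steps can be sketched with a small augmenting-path max-flow. The version below uses breadth-first search (the Edmonds-Karp variant), chosen here for brevity. It is not the specialized algorithm evaluated by Boykov and Kolmogorov. The graph, node names and capacities are made up for illustration.

```python
from collections import deque


def max_flow_min_cut(capacity, source, sink):
    """Return (max flow value, set of nodes on the source side of the cut).

    capacity is a dict of dicts: capacity[u][v] is the capacity of u -> v.
    """
    residual = {source: {}, sink: {}}
    for u, edges in capacity.items():
        for v, c in edges.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)  # reverse edge for undoing flow

    flow = 0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            # No augmenting path left: reachable nodes form the source side.
            return flow, set(parent)
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


capacity = {"s": {"i": 3, "j": 3}, "i": {"j": 1, "t": 2}, "j": {"t": 2}}
value, source_side = max_flow_min_cut(capacity, "s", "t")
```

In this example the edges into t are the bottleneck: they are saturated, so the cut separates {s, i, j} from {t} and its capacity equals the flow value.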
Multi-label Problem
• Find the labeling f that minimizes the energy
  $E(f) = E_{smooth}(f) + E_{data}(f)$
  – $E_{smooth}(f)$ measures the extent to which f is not piecewise smooth
  – $E_{data}(f)$ measures the disagreement between f and the observed data
• $E_{data}(f) = \sum_{p} D_p(f_p)$, where $D_p(f_p)$ measures how well label $f_p$ fits pixel p given the observed data
• $E_{smooth}(f) = \sum_{\{p,q\} \in N} V_{p,q}(f_p, f_q)$, the smoothness penalty between adjacent (N) pixels
Yuri Boykov, Olga Veksler and Ramin Zabih, "Fast Approximate Energy Minimization via Graph Cuts", IEEE Trans. PAMI, 2001
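A minimal sketch of such an energy on a 1D row of pixels. It assumes an absolute-difference data term between the label and the observed value, and a Potts smoothness term (a constant penalty whenever two neighboring labels differ). Both choices and all names are illustrative and are not taken from the slides.

```python
def data_term(labels, observed):
    """Disagreement between the labeling and the observed data."""
    return sum(abs(f_p - o_p) for f_p, o_p in zip(labels, observed))


def smoothness_term(labels, penalty):
    """Potts penalty for every pair of adjacent pixels with different labels."""
    return sum(penalty for a, b in zip(labels, labels[1:]) if a != b)


def energy(labels, observed, penalty):
    """Total energy: smoothness term plus data term."""
    return smoothness_term(labels, penalty) + data_term(labels, observed)


observed = [0, 1, 2, 2]
e_smooth_labeling = energy([0, 0, 2, 2], observed, penalty=1.0)
e_exact_labeling = energy([0, 1, 2, 2], observed, penalty=1.0)
```

The two labelings trade data fit against smoothness: the first pays for one mismatch and one label change, the second fits the data exactly but pays for two label changes.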
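Multi-label Solution via Graph-cut
• Iterative graph-cut approach
• 2 types of move algorithm are proposed
  – αβ-swap: minimize E by graph-cut under the condition that pixels with other labels (γ) are preserved; only α and β pixels may exchange labels
  – α-expansion: minimize E by graph-cut under the condition that pixels with other labels (β, γ) can be changed to α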
αβ-swap Algorithm
1. Start with an arbitrary labeling f
2. success := 0
3. For each pair of labels {α, β}:
   a. Find f' = arg min E(f') among f' within one αβ-swap of f
   b. If E(f') < E(f) then f := f' and success := 1
4. If success = 1 goto 2
5. Return f
(Figure: regions labeled α, β, γ.)
αβ-swap Graph Structure
(Figure: terminals α and β connected to pixel nodes p, q, r, s, with a table of edge weights.)
• The smoothness term V should be a semi-metric
αβ-swap Cut
• 3 possible cases
(Figure: the three ways a cut can separate neighboring pixels p and q from terminals α and β.)
α-expansion Algorithm
1. Start with an arbitrary labeling f
2. success := 0
3. For each label α:
   a. Find f' = arg min E(f') among f' within one α-expansion of f
   b. If E(f') < E(f) then f := f' and success := 1
4. If success = 1 goto 2
5. Return f
(Figure: regions labeled α, β, γ.)
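The outer loop of the algorithm can be sketched on a tiny 1D problem. To stay short, the sketch finds the best α-expansion by exhaustive search over all 2^n candidate moves instead of a graph cut, so it only works for a handful of pixels. The energy (absolute-difference data term plus Potts smoothness) and all names are assumptions for illustration.

```python
from itertools import product


def energy(labels, observed, penalty):
    """Absolute-difference data term plus Potts smoothness on a 1D row."""
    data = sum(abs(f - o) for f, o in zip(labels, observed))
    smooth = sum(penalty for a, b in zip(labels, labels[1:]) if a != b)
    return data + smooth


def best_alpha_expansion(f, alpha, observed, penalty):
    """Best labeling within one alpha-expansion of f (brute force).

    In an alpha-expansion each pixel either keeps its label or takes alpha.
    A real implementation would solve this step with one graph cut.
    """
    best, best_energy = list(f), energy(f, observed, penalty)
    for switch in product([False, True], repeat=len(f)):
        g = [alpha if s else f_p for s, f_p in zip(switch, f)]
        e = energy(g, observed, penalty)
        if e < best_energy:
            best, best_energy = g, e
    return best


def alpha_expansion(f, label_set, observed, penalty):
    """Repeat expansion moves over all labels until none lowers the energy."""
    f = list(f)
    success = True
    while success:
        success = False
        for alpha in label_set:
            g = best_alpha_expansion(f, alpha, observed, penalty)
            if energy(g, observed, penalty) < energy(f, observed, penalty):
                f, success = g, True
    return f


observed = [0, 0, 3, 0, 0, 2, 2, 2]
result = alpha_expansion([1] * 8, [0, 1, 2, 3], observed, penalty=2)
```

With these values the isolated observation 3 is smoothed away because keeping it would cost two label changes, while the run of 2s is long enough to survive as its own region.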
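α-expansion Graph Structure
(Figure: terminal α connected to pixel nodes p, q, r, s and auxiliary nodes a, b, with a table of edge weights.)
• The smoothness term V should be a metric
• Auxiliary nodes are added at the boundaries between the label sets P, i.e. between neighboring pixels whose current labels differ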
α-expansion Cut
• 3 possible cases
• Because V is a metric, V(a, b) ≤ V(a, c) + V(c, b), so the remaining cut configuration never happens
(Figure: cuts through terminal α, pixel nodes p and q, and the auxiliary node a.)
Pixel-based Stereo Matching via Graph-cut
• Each pixel p of the left image takes a label 1, 2, ..., L
• A labeling f means a disparity assignment
• The cost function is minimized by iterative graph cut (α-expansion)
Graph-cut with Occlusions
• Occluded pixels are handled explicitly in the graph
• Find a subset of A, the set of candidate pixel pairs (p, q) with p in the left image and q in the right image
• Find a configuration f whose entry for (p, q) is 1 if the pixels (p, q) correspond and 0 otherwise
• A pixel p with no corresponding pixel is an occluded pixel
V. Kolmogorov and R. Zabih, "Computing visual correspondence with occlusions via graph cuts", ICCV, 2001
Energy Function to Minimize
(Equation annotations:)
• the number of pixels paired with p
• the occlusion penalty
• T(a) is 1 if a is true, otherwise 0
Minimization Flow
1. Choose α
2. Construct a graph
3. Assign a weight to each edge
4. Partition via graph-cut (α-expansion)
(Figure: left pixels p, q, r and right pixels w, x, y, z. Current assignments with disparity 0 and disparity 1: <p, w>, <q, y>, <r, z>. Possible assignments after α-expansion with disparity α (= 2): <p, y>, <q, z>.)
Result
(Figure panels: left image, ground truth, L1 correlation, graph-cut without occlusions, graph-cut with occlusions.)
DP stereo: Ohta-Kanade Stereo ('85)
Purpose: map making
Ohta, Kanade, "Stereo by intra- and inter-scanline search using dynamic programming", IEEE Trans., Vol. PAMI-7, No. 2, pp. 139-14
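• Matching now becomes 1D to 1D: each left scanline (L1 ... L6) is matched against the corresponding right scanline (R1 ... R6), and the disparity is searched along the scanline
• Yet the cost is still N lines × ML × MR (512 × 100 × 100 × 10 msec ≈ 14 hours)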
Path Search
• The matching problem can be considered as a path search problem
• Define a cost for each candidate path segment based on some ad hoc function
(Figure: example segment costs 10 and 100.)
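Dynamic programming
• We can formalize the path-finding problem as the following iterative formula:
  optimum cost to K = min over predecessors M of (optimum cost to M + cost between M and K)
• The optimum costs to the predecessors M are already known when K is processed
(Figure: node K and its candidate predecessors 0, 1, 2, 3.)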
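A minimal sketch of the recurrence for one pair of scanlines, using three moves: match a left pixel with a right pixel, skip a left pixel, or skip a right pixel. The absolute brightness difference and the constant occlusion cost are assumptions for illustration. Ohta and Kanade match edge-delimited intervals rather than single pixels, so this is not their cost function. All names are illustrative.

```python
def dp_scanline(left, right, occlusion_cost):
    """Return (optimum cost, list of matched (left index, right index))."""
    n, m = len(left), len(right)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            # Optimum cost to (i, j) from already known predecessor costs.
            if i > 0 and j > 0:
                c = cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1])
                if c < cost[i][j]:
                    cost[i][j], move[i][j] = c, "match"
            if i > 0:
                c = cost[i - 1][j] + occlusion_cost
                if c < cost[i][j]:
                    cost[i][j], move[i][j] = c, "skip_left"
            if j > 0:
                c = cost[i][j - 1] + occlusion_cost
                if c < cost[i][j]:
                    cost[i][j], move[i][j] = c, "skip_right"
    # Trace the optimum path back from the end of both scanlines.
    matches = []
    i, j = n, m
    while i > 0 or j > 0:
        if move[i][j] == "match":
            matches.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move[i][j] == "skip_left":
            i -= 1
        else:
            j -= 1
    matches.reverse()
    return cost[n][m], matches


left = [2, 9, 1, 7]
right = [9, 1, 7, 6]
total, matches = dp_scanline(left, right, occlusion_cost=1)
```

Each matched pair (i, j) has disparity i - j. The table has (n + 1)(m + 1) cells, so the search is polynomial rather than exponential in the scanline length.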
stereo pair edges
path disparity depth
stereo pair edges depth
Ohta-Kanade Stereo ('85)
1. Features for matching: a. brightness value b. point c. edge d. region (slide annotation: brightness of interval)
2. Strategies for matching: a. brute-force b. coarse-to-fine c. relaxation d. dynamic programming
3. Constraints for matching: a. epipolar lines b. disparity limit c. continuity d. uniqueness
Purpose: aerial image analysis (CMU)
Recent progress: 4-move, 4-plane DP
• Occluded pixels are handled explicitly in the 4-move, 4-plane representation
• The disparity map is calculated by DP matching (global energy minimization)
A. Criminisi, J. Shotton, A. Blake, C. Rother, P. H. S. Torr, "Efficient Dense-Stereo and Novel-view Synthesis for Gaze Manipulation in One-to-one Teleconferencing", MSR-TR-2003-59, 2003
Conventional 3-move DP
• Moves: matched move, left occluded move, right occluded move
• The true matching path is approximated by a path built from these moves
• Problem: occluded paths and visible paths cannot be distinguished in this representation
(Figure: left scanline against right scanline, with regions visible from both left and right and regions visible only from the right.)
4-move DP
• Moves: occluded move (l), occluded move (r), matched move (l), matched move (r)
• Occluded paths and visible paths are handled separately
(Figure: left scanline against right scanline, with the true matching path and the approximated matching path.)
Design of Move Transition
• Lo: left occluded move
• Lm: left matched move
• Ro: right occluded move
• Rm: right matched move
• Matching cost: normalized sum of squared differences
(Figure: allowed transitions between Lo, Lm, Ro and Rm.)
4-move, 4-plane DP
• Each node should hold 4 accumulated costs, one for each move ⇒ 4-plane model
Inter-scanline Consistency
• Propagate information across scanlines ⇒ a Gaussian filter is applied to the matching cost array
(Figure panels: without Gaussian filter, with Gaussian filter; right occluded pixels and left occluded pixels are marked.)
Result
(Figure panels: input images, 3-move DP, 4-move 4-plane DP; right occluded pixels and left occluded pixels are marked.)
Comparison in Severe Situation
(Figure panels: left and right input images with a texture-less region; results of block matching, graph-cut with occlusions, and 4-move 4-plane DP; occluded pixels are marked.)
Summary
1. Two images from two different positions give depth information
2. Epipolar line and plane
3. Basic equation: $z = -\dfrac{2df}{x'' - x'}$, where x'' - x' is the disparity and 2d is the baseline length
4. Case study
   • Block-matching-based stereo
   • Cooperative stereo
   • DP-based stereo
Recent directions
1. Block Matching (local optimization)
   – Fast enough for real-time applications (parallel processing, SIMD)
   – Not accurate in texture-less regions
2. Graph-cut variants (global optimization)
   – A globally consistent disparity map is obtained
   – Texture-less regions are interpolated nicely
3. 4-move, 4-plane DP Matching (global optimization)
   – A globally consistent disparity map is obtained
   – Texture-less regions are interpolated nicely
   – Inter-scanline inconsistency is reduced, though how far remains to be seen
4. Belief Propagation