The Brightness Constraint
Brightness Constancy Equation, linearized assuming small (u, v), where I_t = I(x, y) - J(x, y).
Each pixel provides 1 equation in 2 unknowns (u, v): insufficient information.
Another constraint is needed: the Global Motion Model Constraint.
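A sketch of the linearization step. It assumes the convention J(x, y) = I(x + u, y + v), which is chosen here because it agrees with the definition of I_t on the slide; I_x and I_y denote the spatial derivatives of I.

```latex
\begin{align}
J(x,y) &= I\bigl(x+u,\; y+v\bigr) \\
       &\approx I(x,y) + I_x\,u + I_y\,v \\
\Rightarrow\quad I_x\,u + I_y\,v + I_t &\approx 0,
\qquad I_t = I(x,y) - J(x,y)
\end{align}
```

In the (u, v) plane this is a single line, so one pixel constrains only the displacement component along the gradient.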
The 2D/3D Dichotomy
Requires prior model selection.
Image motion = Camera-induced motion + Independent motions
Camera-induced motion = Camera motion + Scene structure
• 2D techniques: do not model "3D scenes".
• 3D techniques: singularities in "2D scenes".
Global Motion Models
2D Models:
• Affine
• Quadratic
• Homography (planar projective transform)
3D Models:
• Rotation, Translation, 1/Depth
• Instantaneous camera motion models
• Essential/Fundamental Matrix
• Plane+Parallax
Example: Affine Motion
Substituting the affine model into the brightness constancy (B.C.) equation gives, at each pixel, 1 linear constraint in 6 global unknowns (minimum 6 pixels necessary).
Least-squares minimization is performed over all pixels: every pixel contributes (confidence-weighted regression).
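For reference, one common way to write the affine model and the resulting error measure. The ordering of the parameters a_1, …, a_6 is an assumption made here for illustration, and the constraint uses the linearized form I_x u + I_y v + I_t ≈ 0 with I_t = I - J.

```latex
\begin{align}
u(x,y) &= a_1 + a_2\,x + a_3\,y, \qquad
v(x,y) = a_4 + a_5\,x + a_6\,y \\
I_x\,(a_1 + a_2 x + a_3 y) + I_y\,(a_4 + a_5 x + a_6 y) + I_t &\approx 0 \\
\mathrm{Err}(a) &= \sum_{(x,y)} \bigl[\, I_x\,(a_1 + a_2 x + a_3 y)
   + I_y\,(a_4 + a_5 x + a_6 y) + I_t \,\bigr]^2
\end{align}
```

Pixels with a small gradient contribute little to the sum, which is one way to read "confidence-weighted".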
Example: Affine Motion
Differentiating the least-squares error with respect to a_1, …, a_6 and equating to zero yields 6 linear equations in 6 unknowns.
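The 6 x 6 system can be sketched in a few lines of NumPy. This is a minimal single-step illustration, not the lecture's implementation: the function name `estimate_affine`, the parameter ordering (u = a1 + a2 x + a3 y, v = a4 + a5 x + a6 y), and the conventions J(x, y) = I(x + u, y + v) and I_t = I - J are assumptions.

```python
import numpy as np

def estimate_affine(I, J):
    """One linearized least-squares step for a = (a1, ..., a6).

    Assumed model: u = a1 + a2*x + a3*y, v = a4 + a5*x + a6*y,
    with the constraint Ix*u + Iy*v + It = 0 and It = I - J.
    """
    I = np.asarray(I, dtype=float)
    J = np.asarray(J, dtype=float)
    # np.gradient returns derivatives along rows (y) first, then columns (x).
    Iy, Ix = np.gradient(I)
    It = I - J
    h, w = I.shape
    y, x = np.mgrid[0:h, 0:w].astype(float)
    # One row per pixel: coefficients of (a1, ..., a6) in the constraint.
    A = np.stack([Ix, Ix * x, Ix * y, Iy, Iy * x, Iy * y], axis=-1).reshape(-1, 6)
    b = -It.ravel()
    # Normal equations (A^T A) a = A^T b: 6 linear equations in 6 unknowns.
    return np.linalg.solve(A.T @ A, A.T @ b)
```

Because the step relies on the small-motion linearization, it is only expected to be accurate for sub-pixel to roughly one-pixel displacements; larger motions call for iteration and the coarse-to-fine scheme.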
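Coarse-to-Fine Estimation
Parameter propagation: estimate at the coarsest level, then warp (Jw = warped J) and refine at each finer level.
Pyramid of image I and pyramid of image J:
• u = 10 pixels (full-resolution images I and J)
• u = 5 pixels
• u = 2.5 pixels
• u = 1.25 pixels (coarsest level) ==> small u and v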
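The numbers on the pyramid follow from halving the resolution at each level. The short sketch below reproduces them, assuming a downsampling factor of 2 per level; both helper names are hypothetical.

```python
def displacement_per_level(u_full, levels):
    """Displacement in pixels seen at each pyramid level, finest level first.

    Assumes each level halves the image resolution.
    """
    return [u_full / 2 ** k for k in range(levels)]

def propagate(u_coarse):
    """Carry a translation estimate to the next finer level (2x resolution)."""
    return 2.0 * u_coarse

print(displacement_per_level(10.0, 4))  # [10.0, 5.0, 2.5, 1.25]
```

A 10-pixel motion at full resolution is therefore only 1.25 pixels at the fourth level, where the small-(u, v) linearization is reasonable; each finer level then only needs to refine the residual left after warping with the propagated estimate.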
Other 2D Motion Models
• Quadratic: instantaneous approximation to planar motion
• Projective: exact planar motion (homography H)
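For reference, the two models are commonly written as follows. The parameter numbering a_1, …, a_8 and the entries h_{ij} of H are naming assumptions made here for illustration.

```latex
\begin{align}
\text{Quadratic:}\quad
u &= a_1 + a_2\,x + a_3\,y + a_7\,x^2 + a_8\,xy, \\
v &= a_4 + a_5\,x + a_6\,y + a_7\,xy + a_8\,y^2 \\[4pt]
\text{Projective:}\quad
x' &= \frac{h_{11}x + h_{12}y + h_{13}}{h_{31}x + h_{32}y + h_{33}}, \qquad
y' = \frac{h_{21}x + h_{22}y + h_{23}}{h_{31}x + h_{32}y + h_{33}}, \\
u &= x' - x, \qquad v = y' - y
\end{align}
```

The quadratic model stays linear in its 8 unknowns, so it fits the same least-squares machinery as the affine case; the homography is nonlinear in the parameters and is usually estimated iteratively.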
Panoramic Mosaic Image
Figure panels: original video clip; generated mosaic image.
Alignment accuracy (between a pair of frames): error < 0.1 pixel.
Video Removal
Figure panels: original; outliers; synthesized.
Video Enhancement
Figure panels: original; enhanced.
Direct Methods
Methods for motion and/or shape estimation that recover the unknown parameters directly from measurable image quantities at each pixel in the image.
Minimization step:
• Direct methods: error measure based on dense measurable image quantities (confidence-weighted regression; exploits all available information).
• Feature-based methods: error measure based on distances of a sparse set of distinct feature matches.
Example: The SIFT Descriptor
• Compute gradient orientation histograms of several small windows (128 values for each point). Figure: image gradients and the resulting descriptor (4 x 4 array of 8-bin histograms).
• Normalize the descriptor to make it invariant to intensity change.
• To add scale and rotation invariance: determine the local scale (by maximizing the DoG, Difference of Gaussians, in scale and in space) and the local orientation as the dominant gradient direction.
Using the descriptors:
• Compute descriptors in each image.
• Find descriptor matches across images.
• Estimate the transformation between the pair of images. In case of multiple motions: use RANSAC (Random Sample Consensus) to compute an affine transformation / homography / essential matrix / etc.
D. Lowe. "Distinctive Image Features from Scale-Invariant Keypoints". IJCV 2004.
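The last step can be illustrated with a small RANSAC loop for the affine case. This is a sketch under assumptions: it starts from already-matched point coordinates (descriptor extraction and matching are not shown), and the function names, the 1-pixel inlier threshold, and the iteration count are illustrative choices rather than values from the lecture.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2x3 affine M such that dst ≈ M @ [x, y, 1]."""
    X = np.hstack([src, np.ones((len(src), 1))])   # (n, 3) homogeneous points
    M, *_ = np.linalg.lstsq(X, dst, rcond=None)    # solution has shape (3, 2)
    return M.T

def ransac_affine(src, dst, n_iters=200, thresh=1.0, seed=0):
    """Return (M, inlier_mask) for the affine model with the largest consensus."""
    rng = np.random.default_rng(seed)
    X = np.hstack([src, np.ones((len(src), 1))])
    best = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        # Minimal sample: 3 matches determine an affine transformation.
        idx = rng.choice(len(src), size=3, replace=False)
        M = fit_affine(src[idx], dst[idx])
        err = np.linalg.norm(X @ M.T - dst, axis=1)
        inliers = err < thresh
        if inliers.sum() > best.sum():
            best = inliers
    # Refit on the whole consensus set for the final estimate.
    return fit_affine(src[best], dst[best]), best
```

Matches that belong to a second motion or to wrong correspondences end up outside the consensus set, which is why the slide recommends RANSAC when multiple motions are present.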
Benefits of Direct Methods
• High subpixel accuracy.
• Simultaneously estimate matches + transformation; do not need distinct features.
• Strong locking property.
Limitations
• Limited search range (up to ~10% of the image size).
• Brightness constancy assumption.
Video Indexing and Editing
The 2D/3D Dichotomy
A camera-centric coordinate system (R, T, Z).
Image motion = Camera-induced motion + Independent motions
Camera-induced motion = Camera motion + Scene structure
• 2D techniques: do not model "3D scenes".
• 3D techniques: singularities in "2D scenes".
The Plane+Parallax Decomposition
Figure panels: original sequence; plane-stabilized sequence.
The residual parallax lies on a radial (epipolar) field centered at the epipole.
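The radial structure can be stated compactly. In the sketch below, μ(p) is the residual parallax displacement at pixel p after plane alignment, e is the epipole, and κ(p) is a hypothetical per-pixel scalar that absorbs the dependence on scene shape and camera translation; its exact form and sign convention are not specified here.

```latex
\mu(p) \;=\; \kappa(p)\,\bigl(p - e\bigr)
```

Only the scalar κ(p) remains unknown at each pixel once the epipole is known, which is what turns the displacement search into a 1-D problem.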
Benefits of the P+P Decomposition
1. Reduces the search space:
• Eliminates effects of rotation.
• Eliminates changes in camera parameters / zoom.
• Camera parameters: need to estimate only the epipole (gauge ambiguity: unknown scale of epipole).
• Image displacements: constrained to lie on radial lines (1-D search problem).
A result of aligning an existing structure in the image.
Benefits of the P+P Decomposition
2. Scene-Centered Representation:
Translation or pure rotation?
• Focus on the relevant portion of the information.
• Remove the global component, which dilutes information!
Benefits of the P+P Decomposition
2. Scene-Centered Representation:
Shape = fluctuations relative to a planar surface in the scene.
Figure: STAB_RUG sequence.
Benefits of the P+P Decomposition
2. Scene-Centered Representation:
Shape = fluctuations relative to a planar surface in the scene.
• Height vs. depth (e.g., obstacle avoidance).
• Appropriate units for shape: fewer bits, progressive encoding.
• A compact representation.
Figure: total distance from the camera center to the scene [97..103] = global component (100) + local component [-3..+3].
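The "fewer bits" remark can be checked with the numbers from the figure. The sketch assumes a plain fixed-width unsigned integer encoding, which is an illustrative choice and not something the slide specifies.

```python
total = list(range(97, 104))           # total distances 97..103
plane = 100                            # global component (the reference plane)
local = [d - plane for d in total]     # local component: -3..+3

def bits_unsigned(max_value):
    """Bits needed to store integers in 0..max_value with a fixed width."""
    return max(1, int(max_value).bit_length())

bits_total = bits_unsigned(max(total))                # values up to 103
bits_local = bits_unsigned(max(local) - min(local))   # offsets 0..6
print(bits_total, bits_local)  # 7 3
```

Storing the global component once and only the local fluctuation per point more than halves the per-point cost in this toy example.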
Benefits of the P+P Decomposition
3. Stratified 2D-3D Representation:
• Start with 2D estimation (homography).
• 3D info builds on top of 2D info.
Avoids a-priori model selection.
Dense 3D Reconstruction (Plane+Parallax)
Figure panels: original sequence; plane-aligned sequence; recovered shape.
P+P Correspondence Estimation
1. Eliminating Aperture Problem
Figure: at point p, the brightness constancy constraint line and the epipolar line (through the epipole).
The intersection of the two line constraints uniquely defines the displacement.
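The intersection can be computed in closed form. The sketch below assumes that the images are already plane-aligned, that the residual displacement at p points along (p - epipole), and that the brightness constraint is I_x u + I_y v + I_t = 0; the function name and the tolerance `eps` are hypothetical.

```python
import numpy as np

def parallax_displacement(p, epipole, Ix, Iy, It, eps=1e-8):
    """Displacement at p satisfying both line constraints.

    Returns None when the two constraint lines are parallel, i.e. when the
    image gradient is perpendicular to the epipolar direction.
    """
    d = np.asarray(p, dtype=float) - np.asarray(epipole, dtype=float)
    # Gradient projected onto the radial (epipolar) direction.
    denom = Ix * d[0] + Iy * d[1]
    if abs(denom) < eps:
        return None
    # Solve Ix*(k*dx) + Iy*(k*dy) + It = 0 for the scalar k.
    k = -It / denom
    return k * d

print(parallax_displacement((10, 5), (0, 0), 1.0, 0.0, -2.0))  # [2. 1.]
```

The `None` branch is the degenerate two-frame case: the constraint lines never meet, so a single epipole cannot resolve the displacement at that pixel.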
Multi-Frame vs. 2-Frame Estimation
1. Eliminating Aperture Problem
Figure: at point p, the brightness constancy constraint line, the epipolar line (through the epipole), and the other epipolar line (through another epipole).
Two line constraints are parallel ==> do NOT intersect.
The other epipole resolves the ambiguity!