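The Brightness Constraint
Brightness Constancy Equation: $J(x, y) = I(x + u(x, y),\ y + v(x, y))$
Linearizing (assuming small (u, v)): $I_x \cdot u + I_y \cdot v + I_t \approx 0$
Where: $I_t = I(x, y) - J(x, y)$, and $I_x$, $I_y$ are the spatial derivatives of $I$.
Each pixel provides 1 equation in 2 unknowns (u, v): insufficient information.
Another constraint is needed: the Global Motion Model Constraint.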
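As a rough numerical check of the linearized constraint, the sketch below builds a smooth synthetic image and a slightly displaced copy, then evaluates $I_x u + I_y v + I_t$ at the true displacement. The function names (`brightness_constraint`, `image`), the test pattern, and the displacement values are assumptions made for illustration; they are not from the slides.

```python
import numpy as np

def brightness_constraint(I, J):
    """Return per-pixel coefficients (Ix, Iy, It) of Ix*u + Iy*v + It ~ 0."""
    I = np.asarray(I, dtype=float)
    J = np.asarray(J, dtype=float)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    Iy, Ix = np.gradient(I)
    # Sign convention used on the slide: I_t = I(x, y) - J(x, y).
    It = I - J
    return Ix, Iy, It

def image(x, y):
    """Hypothetical smooth test pattern (an assumption, not real data)."""
    return np.sin(0.1 * x) + np.cos(0.08 * y)

ys, xs = np.mgrid[0:64, 0:64].astype(float)
u_true, v_true = 0.3, -0.2              # small displacement, as the linearization assumes
I = image(xs, ys)
J = image(xs + u_true, ys + v_true)     # J(x, y) = I(x + u, y + v)

Ix, Iy, It = brightness_constraint(I, J)
residual = Ix * u_true + Iy * v_true + It
# Skip the border, where np.gradient falls back to one-sided differences.
max_residual = float(np.abs(residual[1:-1, 1:-1]).max())
```

Each pixel contributes only the single triple (Ix, Iy, It), so (u, v) cannot be solved for at one pixel in isolation; that is the gap the global motion model fills.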

The 2D/3D Dichotomy
Image motion = Camera-induced motion + Independent motions
             = (3D camera motion + 3D scene structure) + Independent motions
Requires prior model selection:
• 2D techniques: do not model “3D scenes”
• 3D techniques: singularities in “2D scenes”

The 2D/3D Dichotomy
[Slide figure: the image-motion equation with rotated annotations. Legible labels: “The only part with 3D depth information” and “When cannot recover any 3D info?”, followed by three cases, the third being “Planar scene”.]

Global Motion Models
2D Models:
• 2D Similarity
• 2D Affine
• Homography (2D projective transformation)
Relevant for:
* Airborne video (distant scene)
* Remote surveillance (distant scene)
* Camera on tripod (pure zoom/rotation)
* 2D models are easier to estimate than 3D models (much fewer unknowns, numerically more stable).
* 2D models provide dense correspondences.
3D Models:
• 3D Rotation + 3D Translation + Depth
• Essential/Fundamental Matrix
• Plane+Parallax
Relevant when the camera is translating and the scene is near and non-planar.

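Example: Affine Motion
$u(x, y) = a_1 + a_2 x + a_3 y$
$v(x, y) = a_4 + a_5 x + a_6 y$
Substituting into the B.C. Equation:
$I_x (a_1 + a_2 x + a_3 y) + I_y (a_4 + a_5 x + a_6 y) + I_t \approx 0$
Each pixel provides 1 linear constraint in 6 global unknowns (minimum 6 pixels necessary).
Least-Squares Minimization (over all pixels):
$Err(\vec{a}) = \sum \left[ I_x (a_1 + a_2 x + a_3 y) + I_y (a_4 + a_5 x + a_6 y) + I_t \right]^2$
Every pixel contributes: confidence-weighted regression.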

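Example: Affine Motion
Differentiating w.r.t. $a_1, \dots, a_6$ and equating to zero gives 6 linear equations in 6 unknowns:
$\left( \sum \vec{g}\,\vec{g}^{\,T} \right) \vec{a} = -\sum I_t\,\vec{g}$, with $\vec{g} = [I_x,\ x I_x,\ y I_x,\ I_y,\ x I_y,\ y I_y]^T$ and $\vec{a} = [a_1, \dots, a_6]^T$
Summation is over all the pixels in the image!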
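The least-squares step can be sketched in a few lines of NumPy. This is a minimal single-iteration sketch, assuming the parameter ordering u = a1 + a2·x + a3·y, v = a4 + a5·x + a6·y and the sign convention I_t = I − J; the function name `estimate_affine`, the optional `weights` argument, and the synthetic test pattern are assumptions for illustration only.

```python
import numpy as np

def estimate_affine(I, J, weights=None):
    """One linearized least-squares estimate of the 6 affine parameters.

    Assumed ordering: u = a1 + a2*x + a3*y, v = a4 + a5*x + a6*y.
    """
    I = np.asarray(I, dtype=float)
    J = np.asarray(J, dtype=float)
    Iy, Ix = np.gradient(I)                  # derivatives along rows (y), columns (x)
    It = I - J                               # I_t = I(x, y) - J(x, y)
    ys, xs = np.mgrid[0:I.shape[0], 0:I.shape[1]].astype(float)
    # One row per pixel: g = [Ix, x*Ix, y*Ix, Iy, x*Iy, y*Iy]
    G = np.stack([Ix, xs * Ix, ys * Ix, Iy, xs * Iy, ys * Iy], axis=-1).reshape(-1, 6)
    b = -It.reshape(-1)
    # Optional per-pixel confidence weights (all ones if not given).
    w = np.ones_like(b) if weights is None else np.asarray(weights, dtype=float).reshape(-1)
    A = G.T @ (G * w[:, None])               # 6x6 matrix: sum of w * g g^T over all pixels
    rhs = G.T @ (w * b)                      # 6-vector:   -sum of w * I_t * g
    return np.linalg.solve(A, rhs)

def image(x, y):
    """Hypothetical smooth test pattern (an assumption, not real data)."""
    return np.sin(0.11 * x + 0.05 * y) + np.cos(0.07 * y - 0.03 * x)

ys, xs = np.mgrid[0:48, 0:48].astype(float)
a_true = np.array([0.2, 0.002, -0.001, -0.1, 0.001, 0.003])
u = a_true[0] + a_true[1] * xs + a_true[2] * ys
v = a_true[3] + a_true[4] * xs + a_true[5] * ys
a_est = estimate_affine(image(xs, ys), image(xs + u, ys + v))
```

Because the constraint is only a first-order approximation, `a_est` is expected to be close to `a_true` rather than equal to it; in practice the estimate would be refined iteratively and within a coarse-to-fine scheme.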

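Coarse-to-Fine Estimation
Parameter propagation between pyramid levels: warp J toward I (giving Jw), then refine.
Pyramid of image I and pyramid of image J, from fine to coarse:
• u = 10 pixels (full-resolution images I and J)
• u = 5 pixels
• u = 2.5 pixels
• u = 1.25 pixels ==> small u and v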
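The parameter propagation can be illustrated with a toy calculation, assuming each pyramid level halves the resolution of the previous one and the motion is a pure translation. The clamp in `coarse_to_fine` is only a stand-in for the limited range of a linearized estimator; the function names and the `max_step` value are assumptions.

```python
def displacement_per_level(u_full_res, num_levels):
    """Displacement in pixels at each pyramid level, finest level first."""
    return [u_full_res / (2 ** level) for level in range(num_levels)]

def coarse_to_fine(u_true_full, num_levels, max_step=1.5):
    """Toy simulation: estimate at the coarsest level, then propagate and refine."""
    estimate = 0.0
    for level in reversed(range(num_levels)):      # coarsest level first
        u_true_level = u_true_full / (2 ** level)
        residual = u_true_level - estimate         # motion left after warping J
        # Stand-in for the linearized estimator: it only recovers small residuals.
        step = max(-max_step, min(max_step, residual))
        estimate += step
        if level > 0:
            estimate *= 2.0                        # a translation doubles at the next finer level
    return estimate

levels = displacement_per_level(10.0, 4)           # the four values shown on the slide
with_pyramid = coarse_to_fine(10.0, 4)
without_pyramid = coarse_to_fine(10.0, 1)
```

With four levels the residual at every level stays small, so the full 10-pixel displacement is recovered; with a single level the clamped estimator falls short.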

Other 2D Motion Models
2D Projective: planar motion (Homography H)
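For reference, a homography maps a point (x, y) by multiplying its homogeneous coordinates by a 3×3 matrix H and dividing by the third coordinate. The sketch below shows that mapping; the function name `apply_homography` and the example matrix are illustrative assumptions.

```python
import numpy as np

def apply_homography(H, points):
    """Map an (N, 2) array of points through the 3x3 homography H."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((pts.shape[0], 1))])   # (x, y) -> (x, y, 1)
    mapped = homog @ np.asarray(H, dtype=float).T
    return mapped[:, :2] / mapped[:, 2:3]                  # divide by the third coordinate

# Hypothetical homography with a projective term in the last row.
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.5, 0.0, 1.0]])
mapped = apply_homography(H, [[2.0, 4.0]])
displacement = mapped - np.array([[2.0, 4.0]])             # induced (u, v) at that point
```

Unlike the affine model, the induced displacement is not linear in the 8 free parameters of H, which is why direct estimation of a homography is usually linearized around the current estimate.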

Panoramic Mosaic Image
[Figures: original video clip; generated mosaic image]
Alignment accuracy (between a pair of frames): error < 0.1 pixel

Video Removal
[Figures: Original; Outliers; Synthesized]

Video Enhancement
[Figures: Original; Enhanced]

Direct Methods: methods for motion and/or shape estimation that recover the unknown parameters directly from image intensities. The error measure is based on dense image quantities (confidence-weighted regression; exploits all available information).
Feature-based Methods: methods for motion and/or shape estimation based on feature matches (e.g., SIFT, HOG). The error measure is based on sparse distinct features (feature matches + RANSAC + parameter estimation).

Benefits of Direct Methods
• High subpixel accuracy.
• Simultaneously estimate matches + transformation: no distinct features are needed for image alignment.
• Strong locking property.

Limitations of Direct Methods
• Limited search range (up to ~10% of the image size).
• Brightness constancy assumption.

Video Indexing and Editing

The 2D/3D Dichotomy
Source of dichotomy: camera-centric models (R, T, Z)
Image motion = Camera-induced motion + Independent motions
             = (Camera motion + Scene structure) + Independent motions
• 2D techniques: do not model “3D scenes”
• 3D techniques: singularities in “2D scenes”

The Plane+Parallax Decomposition
Move from a CAMERA-centric to a SCENE-centric model.
[Figures: Original Sequence; Plane-Stabilized Sequence, with the epipole marked]
The residual parallax lies on a radial (epipolar) field.

Benefits of the P+P Decomposition
1. Reduces the search space:
• Eliminates effects of rotation.
• Eliminates changes in camera calibration parameters / zoom.
• Camera parameters: need to estimate only the epipole (i.e., 2 unknowns).
• Image displacements: constrained to lie on radial lines (i.e., reduces to a 1D search problem).
A result of aligning an existing structure in the image.
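The reduction to a 1D search can be sketched as follows: after plane alignment, the residual displacement at a pixel is taken to lie along the line joining the pixel to the epipole, so a single scalar per pixel is left to estimate. The scalar name `k` and both function names are assumptions for illustration; the sign and scale of `k` depend on conventions not given on the slide.

```python
import numpy as np

def parallax_displacement(point, epipole, k):
    """Displacement constrained to the radial line through the epipole."""
    return k * (np.asarray(point, dtype=float) - np.asarray(epipole, dtype=float))

def project_onto_radial_line(flow, point, epipole):
    """Project an unconstrained 2D flow vector onto the radial direction.

    Returns (k, constrained_flow). Undefined when the point coincides
    with the epipole, since the radial direction then has zero length.
    """
    d = np.asarray(point, dtype=float) - np.asarray(epipole, dtype=float)
    k = float(np.dot(flow, d) / np.dot(d, d))
    return k, k * d

k, constrained = project_onto_radial_line([2.0, 1.0], point=[10.0, 0.0], epipole=[0.0, 0.0])
```

Here the component of the flow perpendicular to the radial line is discarded, leaving one unknown per pixel instead of two.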

Benefits of the P+P Decomposition
2. Scene-Centered Representation:
[Figure caption: Translation or pure rotation???]
Focus on the relevant portion of the information: remove the global component, which dilutes information!

Benefits of the P+P Decomposition
2. Scene-Centered Representation:
Shape = fluctuations relative to a planar surface in the scene
[Figure: STAB_RUG SEQ]

Benefits of the P+P Decomposition
2. Scene-Centered Representation:
Shape = fluctuations relative to a planar surface in the scene
• Height vs. Depth (e.g., obstacle avoidance)
• Appropriate units for shape
• A compact representation: fewer bits, progressive encoding
[Figure: distance from the camera center to the scene; total distance [97..103] = global component (100) + local component [-3..+3]]
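The "fewer bits" claim can be made concrete with the numbers in the figure. The sketch assumes integer-valued distances and a plain fixed-width binary code; both are simplifications chosen for illustration, and the function names are hypothetical.

```python
def bits_for_magnitude(max_value):
    """Fixed-width bits needed to store any integer in [0, max_value]."""
    return max_value.bit_length()

def bits_for_offsets(lo, hi):
    """Fixed-width bits needed to index every integer in [lo, hi]."""
    return (hi - lo).bit_length()

global_component = 100                           # stored once for the whole scene
camera_centric_bits = bits_for_magnitude(103)    # total distance in [97..103]
scene_centric_bits = bits_for_offsets(-3, 3)     # local component in [-3..+3]

# The total distance is recovered as global + local.
local_values = list(range(-3, 4))
totals = [global_component + v for v in local_values]
```

Under these assumptions the per-pixel cost drops from 7 bits to 3 bits, with the global component paid for only once.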

Benefits of the P+P Decomposition
3. Stratified 2D-3D Representation:
• Start with 2D estimation (homography).
• 3D info builds on top of 2D info.
Avoids a-priori model selection.

Dense 3D Reconstruction (Plane+Parallax)
[Figures: Original sequence; Plane-aligned sequence; Recovered shape]

Dense 3D Reconstruction (Plane+Parallax)
[Figures: Original sequence; Recovered shape; Plane-aligned sequence]

P+P Correspondence Estimation
1. Eliminating the Aperture Problem
[Figure: a point p, the brightness constancy constraint line, and the epipolar line through the epipole]
The intersection of the two line constraints uniquely defines the displacement.
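The intersection can be written as a 2×2 linear system: one row is the brightness constancy line Ix·u + Iy·v = −It, the other states that (u, v) is parallel to the direction from the epipole to the point. This is a minimal sketch under that reading of the figure; the function name and the tolerance `eps` are assumptions.

```python
import numpy as np

def displacement_from_two_lines(Ix, Iy, It, point, epipole, eps=1e-9):
    """Intersect the brightness constancy line with the epipolar line.

    Returns the displacement (u, v), or None when the two lines are parallel.
    """
    d = np.asarray(point, dtype=float) - np.asarray(epipole, dtype=float)
    A = np.array([[Ix, Iy],          # brightness constancy: Ix*u + Iy*v = -It
                  [-d[1], d[0]]])    # epipolar line: (u, v) is parallel to d
    b = np.array([-It, 0.0])
    if abs(np.linalg.det(A)) < eps:
        return None                  # parallel constraints: no unique intersection
    return np.linalg.solve(A, b)

uv = displacement_from_two_lines(1.0, 0.0, -2.0, point=[5.0, 5.0], epipole=[0.0, 0.0])
parallel = displacement_from_two_lines(1.0, -1.0, 0.5, point=[5.0, 5.0], epipole=[0.0, 0.0])
```

The second call shows the degenerate case, where the image gradient is perpendicular to the epipolar direction and the two lines never meet.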

Multi-Frame vs. 2-Frame Estimation
1. Eliminating the Aperture Problem
[Figure: a point p, the brightness constancy constraint line, the epipolar line through the epipole, and another epipolar line through another epipole]
When the two line constraints are parallel, they do NOT intersect.
The other epipole resolves the ambiguity!
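One way to see why a second epipole helps is a deliberately simplified sketch: assume the same scalar k scales the radial direction (p − e_j) in every frame j, and that the gradient of the reference image is used for all frames. Both are assumptions made here for illustration (per-frame scale factors are ignored), and the function name is hypothetical. Each frame then contributes one scalar equation, whose coefficient vanishes exactly when that frame's two line constraints are parallel.

```python
import numpy as np

def shared_structure_from_frames(grad, point, epipoles, Its, eps=1e-9):
    """Least-squares estimate of a scalar k shared across frames.

    Per frame j (assumed model): k * dot(grad, p - e_j) + It_j = 0.
    Returns None if every frame is degenerate.
    """
    g = np.asarray(grad, dtype=float)
    p = np.asarray(point, dtype=float)
    num, den = 0.0, 0.0
    for e, It in zip(epipoles, Its):
        # c is zero when the brightness line is parallel to this epipolar line.
        c = float(np.dot(g, p - np.asarray(e, dtype=float)))
        num += c * (-It)
        den += c * c
    if den < eps:
        return None
    return num / den

grad, p = [1.0, -1.0], [5.0, 5.0]
two_frame = shared_structure_from_frames(grad, p, [[0.0, 0.0]], [0.0])
multi_frame = shared_structure_from_frames(grad, p, [[0.0, 0.0], [10.0, 5.0]], [0.0, 1.0])
```

With only the first epipole the constraint is degenerate at this pixel; adding a frame with a different epipole makes the scalar recoverable.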