Shape (Structure) From X: 2D to 2.5D


Shape (Structure) From X
Addresses the problem of going from 2D images to 2.5D surface shape (scene depth).
• Shape from motion
• Shape from stereo
• Shape from monocular cues (shading, vanishing point, defocus, texture, ...)

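(1) Estimating motion from the fundamental matrix
1. Compute the fundamental matrix
In reality, instead of solving $A\mathbf{e} = 0$, we seek $\mathbf{e}$ to minimize $\|A\mathbf{e}\|$ subject to $\|\mathbf{e}\| = 1$, where $\mathbf{e}$ stacks the nine entries of $E$. The solution is the least eigenvector of $A^\top A$, i.e. the eigenvector with the smallest eigenvalue.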

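8-point algorithm
To enforce that $E$ is of rank 2, $E$ is replaced by $E'$ that minimizes $\|E - E'\|$ subject to $\det E' = 0$.
• It is achieved by SVD. Let $E = U \Sigma V^\top$, where $\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \sigma_3)$, and let $\Sigma' = \mathrm{diag}(\sigma_1, \sigma_2, 0)$; then $E' = U \Sigma' V^\top$ is the solution.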

8-point algorithm

    % x1, x2: 3 x npts homogeneous coordinates of matched points
    npts = size(x1, 2);

    % Build the constraint matrix
    A = [x2(1,:)'.*x1(1,:)'   x2(1,:)'.*x1(2,:)'   x2(1,:)' ...
         x2(2,:)'.*x1(1,:)'   x2(2,:)'.*x1(2,:)'   x2(2,:)' ...
         x1(1,:)'             x1(2,:)'             ones(npts,1)];

    [U, D, V] = svd(A);

    % Extract fundamental matrix from the column of V
    % corresponding to the smallest singular value.
    E = reshape(V(:,9), 3, 3)';

    % Enforce rank 2 constraint
    [U, D, V] = svd(E);
    E = U*diag([D(1,1) D(2,2) 0])*V';

Problem with 8-point algorithm
With raw pixel coordinates, the columns of the data matrix differ by orders of magnitude (entries range from about 10000 down to 1), so least squares yields poor results.

Normalized 8-point algorithm
Normalized least squares yields good results.
Transform the image to approximately [-1, 1] x [-1, 1]. For example, for a 700 x 500 image: (0, 0) → (-1, -1), (700, 0) → (1, -1), (0, 500) → (-1, 1), (700, 500) → (1, 1).

Normalized 8-point algorithm

    % Normalise both point sets (3 x npts homogeneous coordinates)
    [x1, T1] = normalise2dpts(x1);
    [x2, T2] = normalise2dpts(x2);
    npts = size(x1, 2);

    A = [x2(1,:)'.*x1(1,:)'   x2(1,:)'.*x1(2,:)'   x2(1,:)' ...
         x2(2,:)'.*x1(1,:)'   x2(2,:)'.*x1(2,:)'   x2(2,:)' ...
         x1(1,:)'             x1(2,:)'             ones(npts,1)];

    [U, D, V] = svd(A);
    E = reshape(V(:,9), 3, 3)';

    % Enforce rank 2 constraint
    [U, D, V] = svd(E);
    E = U*diag([D(1,1) D(2,2) 0])*V';

    % Denormalise
    E = T2'*E*T1;

Normalization

    function [newpts, T] = normalise2dpts(pts)
        c = mean(pts(1:2,:)')';              % Centroid
        newp(1,:) = pts(1,:) - c(1);         % Shift origin to centroid.
        newp(2,:) = pts(2,:) - c(2);
        meandist = mean(sqrt(newp(1,:).^2 + newp(2,:).^2));
        scale = sqrt(2)/meandist;
        T = [scale   0   -scale*c(1)
             0     scale -scale*c(2)
             0       0      1      ];
        newpts = T*pts;
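A quick way to see what the normalization does is to apply the same steps to a handful of points and check the result. The sketch below repeats the steps of normalise2dpts inline so that it runs on its own; the sample points are hypothetical values chosen only for illustration.

```matlab
% Hypothetical example points in homogeneous pixel coordinates (3 x N).
pts = [100 400 650 300  50;
        50 450 120 260 480;
         1   1   1   1   1];

% Same steps as normalise2dpts, written inline so the snippet is self-contained.
c = mean(pts(1:2, :), 2);                        % centroid
d = pts(1:2, :) - repmat(c, 1, size(pts, 2));    % shift origin to centroid
scale = sqrt(2) / mean(sqrt(sum(d.^2, 1)));      % mean distance becomes sqrt(2)
T = [scale 0     -scale*c(1);
     0     scale -scale*c(2);
     0     0      1];
newpts = T * pts;

centroid_after = mean(newpts(1:2, :), 2);                 % close to [0; 0]
meandist_after = mean(sqrt(sum(newpts(1:2, :).^2, 1)));   % close to sqrt(2)
```

After the transform the points are centred on the origin with a mean distance of sqrt(2), so all columns of the constraint matrix have comparable magnitude.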

RANSAC

    repeat
        select minimal sample (8 matches)
        compute solution(s) for F
        determine inliers
    until (#inliers, #samples) < 95% || too many times
    compute E based on all inliers
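The stopping test on the slide has lost its symbol, so the following is an assumption rather than the slide's own rule. One common choice is to keep sampling until the probability of having drawn at least one outlier-free sample reaches a target such as 95%. With inlier ratio w and sample size s = 8, that gives the sample count below; the values of p and w are hypothetical.

```matlab
p = 0.95;   % desired probability of at least one all-inlier sample (assumed)
w = 0.5;    % estimated inlier ratio, #inliers / #matches (hypothetical value)
s = 8;      % minimal sample size for the 8-point algorithm

% Probability that a single sample is all inliers is w^s, so after N samples
% the probability of never drawing a clean sample is (1 - w^s)^N.
N = ceil(log(1 - p) / log(1 - w^s));
```

In practice w is re-estimated from the best inlier count found so far, which lets N shrink as better models are found.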

Structure from motion
Unknown camera viewpoints.
Structure from motion: automatic recovery of camera motion and scene structure from two or more images. It is a self-calibration technique, also called automatic camera tracking or matchmoving.

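Coordinate Transformation
Model-view transformation: World Coordinate System → Camera Coordinate System

World coordinate system to camera coordinate system
Camera parameters: the camera projection matrix combines the intrinsic and extrinsic parameters.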
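As a minimal sketch of how the intrinsic and extrinsic parameters combine, the snippet below builds a projection matrix P = K [R | t] and projects one world point. All numeric values (focal length, principal point, pose, point) are hypothetical and chosen only to make the arithmetic easy to follow.

```matlab
% Assumed intrinsics: focal length 800 px, principal point (320, 240), no skew.
K = [800 0 320; 0 800 240; 0 0 1];

% Assumed extrinsics: no rotation, world origin 5 units in front of the camera.
R = eye(3);
t = [0; 0; 5];

P = K * [R t];            % 3 x 4 camera projection matrix

Xw = [1; 0.5; 0; 1];      % world point in homogeneous coordinates
Xc = [R t] * Xw;          % the same point in the camera coordinate system
x  = P * Xw;              % homogeneous image point
uv = x(1:2) / x(3);       % pixel coordinates
```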

[Figure: Images 1, 2, and 3 with camera poses (R1, t1), (R2, t2), (R3, t3)]

[Figure: Points 1, 2, and 3 observed in Images 1, 2, and 3. Same Camera, Same Setting = Same]

Triangulation
[Figure: Images 1, 2, and 3 with camera poses (R1, t1), (R2, t2), (R3, t3)]

Camera intrinsic parameter matrix
• Principal point offset, especially when images are cropped (Internet)
• Skew
• Radial distortion (due to optics of the lens)
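One common way to write these parameters is shown here as a sketch. The symbols $f_x, f_y$ (focal lengths in pixels), $s$ (skew), $(c_x, c_y)$ (principal point), and $k_1, k_2$ (radial distortion coefficients) are notation assumed for illustration and do not come from the slides. Radial distortion is not part of the matrix itself; it is usually modeled as a separate nonlinear step applied to the normalized image coordinates $(x, y)$.

```latex
K = \begin{bmatrix} f_x & s & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix},
\qquad
\begin{pmatrix} x_d \\ y_d \end{pmatrix}
= \left(1 + k_1 r^2 + k_2 r^4\right) \begin{pmatrix} x \\ y \end{pmatrix},
\quad r^2 = x^2 + y^2
```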

Steps
• Images → Points: Structure from Motion
• Points → More points: Multiple View Stereo
• Points → Meshes: Model Fitting
• Meshes → Models: Texture Mapping
• Images → Models: Image-based Modeling

Pipeline: Structure from Motion (SFM) → Multi-view Stereo (MVS)

Two-view Reconstruction
keypoints → match keypoints → fundamental matrix → essential matrix → [R|t] → triangulation

Keypoints Detection

Descriptor for each point: SIFT descriptor

Same for the other images

Point Match for correspondences
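The slides do not say how descriptors are compared, so the following is only one plausible sketch: nearest-neighbour matching on Euclidean distance with a ratio test (accept a match only if the best distance is clearly smaller than the second best). The 0.8 ratio and the tiny 4-dimensional descriptors are hypothetical; real SIFT descriptors have 128 dimensions.

```matlab
% Columns are descriptors (toy 4-D values standing in for SIFT descriptors).
D1 = [1 0 0 0; 0 1 0 0; 0 0 1 0]';                 % image 1: 3 keypoints
D2 = [0 1 0 0; 0 0 1 0.1; 1 0.1 0 0; 0 0 0 1]';    % image 2: 4 keypoints

ratio = 0.8;                 % assumed ratio-test threshold
matches = zeros(0, 2);       % rows: [index in D1, index in D2]
for i = 1:size(D1, 2)
    % Euclidean distance from descriptor i to every descriptor in image 2
    d = sqrt(sum((D2 - repmat(D1(:, i), 1, size(D2, 2))).^2, 1));
    [ds, idx] = sort(d);
    if ds(1) < ratio * ds(2)           % best match is clearly better than runner-up
        matches(end+1, :) = [i idx(1)];
    end
end
```

The resulting index pairs are the correspondences that feed the fundamental matrix estimation.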

Fundamental Matrix
[Figure: Image 1 with pose (R1, t1) and Image 2 with pose (R2, t2)]

Estimating Fundamental Matrix
• Given a correspondence $\mathbf{x}_1 \leftrightarrow \mathbf{x}_2$
• The basic incidence relation is $\mathbf{x}_2^\top F \mathbf{x}_1 = 0$
Need 8 points

Estimating Fundamental Matrix
For 8 point correspondences, each correspondence gives one linear equation in the entries of F; stack the equations and solve: Direct Linear Transformation (DLT)
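Written out for one correspondence, the incidence relation becomes a single row of the linear system. The notation $\mathbf{x}_1 = (x_1, y_1, 1)^\top$, $\mathbf{x}_2 = (x_2, y_2, 1)^\top$ and the row-major vector $\mathbf{f}$ are assumptions chosen to match the column order of the constraint matrix in the MATLAB code earlier in the deck.

```latex
\begin{bmatrix}
x_2 x_1 & x_2 y_1 & x_2 & y_2 x_1 & y_2 y_1 & y_2 & x_1 & y_1 & 1
\end{bmatrix} \mathbf{f} = 0,
\qquad
\mathbf{f} = (F_{11}, F_{12}, F_{13}, F_{21}, F_{22}, F_{23}, F_{31}, F_{32}, F_{33})^\top
```

Stacking one such row per correspondence gives $A\mathbf{f} = 0$; since $F$ is defined only up to scale, 8 correspondences are enough to determine $\mathbf{f}$.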

RANSAC to Estimate Fundamental Matrix
• For many times
– Pick 8 points
– Compute a solution for F using these 8 points
– Count the number of inliers
• Pick the one with the largest number of inliers
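The loop above can be sketched as follows on synthetic data. Everything in the setup is an assumption made for illustration: the scene, the two camera poses, the number of corrupted matches, the iteration count, and the inlier test (algebraic residual below a small threshold, which only makes sense here because the data is noise-free). Coordinates are already of order 1, so the normalization step is left out for brevity.

```matlab
% Synthetic scene: N points seen by two cameras with identity intrinsics.
N = 40;  n_out = 10;
X  = [rand(2, N)*4 - 2; rand(1, N)*4 + 4; ones(1, N)];
a  = 0.1;
R  = [cos(a) 0 sin(a); 0 1 0; -sin(a) 0 cos(a)];
P1 = [eye(3) zeros(3, 1)];
P2 = [R [-1; 0; 0.2]];
x1 = P1*X;  x1 = x1 ./ repmat(x1(3, :), 3, 1);
x2 = P2*X;  x2 = x2 ./ repmat(x2(3, :), 3, 1);
x2(1:2, 1:n_out) = rand(2, n_out) - 0.5;      % corrupt the first n_out matches

% Constraint matrix of the 8-point algorithm (one row per correspondence).
build_A = @(p1, p2) [p2(1,:)'.*p1(1,:)', p2(1,:)'.*p1(2,:)', p2(1,:)', ...
                     p2(2,:)'.*p1(1,:)', p2(2,:)'.*p1(2,:)', p2(2,:)', ...
                     p1(1,:)', p1(2,:)', ones(size(p1, 2), 1)];

thresh = 1e-6;  n_iter = 200;
best_inl = false(1, N);
for it = 1:n_iter
    s = randperm(N);  s = s(1:8);                     % pick 8 points
    [~, ~, V] = svd(build_A(x1(:, s), x2(:, s)));     % solve for F
    F = reshape(V(:, 9), 3, 3)';
    [U, D, V] = svd(F);
    F = U*diag([D(1,1) D(2,2) 0])*V';                 % enforce rank 2
    inl = abs(sum(x2 .* (F*x1), 1)) < thresh;         % residual |x2' F x1|
    if sum(inl) > sum(best_inl)
        best_inl = inl;                               % keep the largest inlier set
    end
end

% Refit on all inliers of the best sample.
[~, ~, V] = svd(build_A(x1(:, best_inl), x2(:, best_inl)));
F = reshape(V(:, 9), 3, 3)';
[U, D, V] = svd(F);
F = U*diag([D(1,1) D(2,2) 0])*V';
```

With real, noisy matches a geometric error such as the distance to the epipolar line, with a threshold of a pixel or so, is a more usual inlier test than the raw algebraic residual.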

Fundamental Matrix → Essential Matrix
[Figure: Image 1 with pose (R1, t1) and Image 2 with pose (R2, t2)]
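With known intrinsics, the essential matrix follows from the fundamental matrix as E = K2' * F * K1. The sketch below checks this on a synthetic pose; the intrinsics K (assumed identical for both images), the rotation, and the translation are hypothetical values.

```matlab
K = [800 0 320; 0 800 240; 0 0 1];     % assumed intrinsics, shared by both images

% Hypothetical relative pose: rotation about the y-axis, unit translation along x.
a = 0.2;
R = [cos(a) 0 sin(a); 0 1 0; -sin(a) 0 cos(a)];
t = [1; 0; 0];
tx = [0 -t(3) t(2); t(3) 0 -t(1); -t(2) t(1) 0];   % cross-product matrix [t]x
E_true = tx * R;

F = inv(K)' * E_true * inv(K);   % fundamental matrix implied by E_true and K
E = K' * F * K;                  % back to the essential matrix

s = svd(E);                      % two equal singular values and one zero
```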

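Essential Matrix
For a given essential matrix $E = U\,\mathrm{diag}(1, 1, 0)\,V^\top$ and the first camera matrix $P_1 = [I \mid 0]$, there are four possible choices for the second camera matrix $P_2$:
$[UWV^\top \mid +\mathbf{u}_3]$, $[UWV^\top \mid -\mathbf{u}_3]$, $[UW^\top V^\top \mid +\mathbf{u}_3]$, $[UW^\top V^\top \mid -\mathbf{u}_3]$,
where $W = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ and $\mathbf{u}_3$ is the last column of $U$.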

Four Possible Solutions
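The four candidates can be enumerated directly from the SVD of E. The sketch below builds E from a hypothetical pose and checks that the true rotation and translation direction are among the candidates. Picking the single correct candidate additionally needs a test that triangulated points lie in front of both cameras, which is not shown here.

```matlab
% Hypothetical ground-truth pose used to build an essential matrix.
a = 0.3;
R_true = [cos(a) 0 sin(a); 0 1 0; -sin(a) 0 cos(a)];
t_true = [0; 0.6; 0.8];                               % unit length
tx = [0 -t_true(3) t_true(2); t_true(3) 0 -t_true(1); -t_true(2) t_true(1) 0];
E = tx * R_true;

[U, ~, V] = svd(E);
if det(U) < 0, U = -U; end      % E is only defined up to sign, so this is allowed
if det(V) < 0, V = -V; end      % and it makes both candidate rotations proper

W  = [0 -1 0; 1 0 0; 0 0 1];
Ra = U * W  * V';
Rb = U * W' * V';
u3 = U(:, 3);

% The four candidate second cameras [R | t].
candidates = {[Ra u3], [Ra -u3], [Rb u3], [Rb -u3]};
```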

Triangulation
[Figure: Image 1 with pose (R1, t1) and Image 2 with pose (R2, t2)]
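A minimal sketch of linear triangulation, assuming two known camera matrices and one noise-free correspondence. The slides do not state which triangulation method is used, so the linear (DLT) method here is an assumption, and the cameras and the 3D point are hypothetical values.

```matlab
% Two hypothetical cameras with identity intrinsics.
P1 = [eye(3) [0; 0; 0]];
P2 = [eye(3) [-1; 0; 0]];

% Ground-truth point, used only to synthesize the two observations.
X_true = [0.5; 0.2; 4; 1];
x1 = P1*X_true;  x1 = x1 / x1(3);
x2 = P2*X_true;  x2 = x2 / x2(3);

% Each view contributes two linear equations in the homogeneous point X.
A = [x1(1)*P1(3, :) - P1(1, :);
     x1(2)*P1(3, :) - P1(2, :);
     x2(1)*P2(3, :) - P2(1, :);
     x2(2)*P2(3, :) - P2(2, :)];

[~, ~, V] = svd(A);
X = V(:, end);        % right singular vector of the smallest singular value
X = X / X(4);         % back to inhomogeneous scale
```

With more than two views, each additional image simply appends two more rows to A.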

Pipeline: Structure from Motion (SFM), taught; Multi-view Stereo (MVS), next

Merge Two Point Clouds
There can be only one

Oops: See From a Different Angle

Bundle Adjustment
[Figure: Points 1, 2, and 3 and their observations in the images]

A valid solution for the cameras and the points must make each re-projection equal to its observation.
In practice, a valid solution must bring the re-projection close to the observation, i.e. minimize the reprojection error over both the cameras and the points.
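The reprojection error being minimized can be written as follows. The notation is assumed for illustration: $R_i, \mathbf{t}_i$ are the pose of camera $i$, $\mathbf{X}_j$ is 3D point $j$, $\mathbf{x}_{ij}$ is the observed 2D position of point $j$ in image $i$, $w_{ij}$ is 1 if the point is visible in that image and 0 otherwise, and $\pi(a, b, c) = (a/c,\, b/c)$ is the perspective division.

```latex
\min_{\{R_i,\, \mathbf{t}_i\},\ \{\mathbf{X}_j\}}
\ \sum_{i} \sum_{j} w_{ij}
\left\| \pi\!\left( K \left( R_i \mathbf{X}_j + \mathbf{t}_i \right) \right) - \mathbf{x}_{ij} \right\|^2
```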

Initialization Matters
• Input: observed 2D image positions
• Output:
– Unknown camera parameters (with some guess)
– Unknown 3D point coordinates (with some guess)
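To make the role of the initial guess concrete, the sketch below evaluates the total squared reprojection error for one camera at the true parameters and at a perturbed guess. The camera, the points, and the size of the perturbation are hypothetical; an actual bundle adjustment would hand this error to a nonlinear least-squares solver, which is not shown.

```matlab
K = [800 0 320; 0 800 240; 0 0 1];       % assumed intrinsics
R = eye(3);  t = [0; 0; 5];              % assumed true camera pose
X = [0  1  -1  0.5;                      % four hypothetical 3D points (3 x 4)
     0  0.5 1 -1;
     0  1   2  0.5];

proj  = @(R, t, X) K * (R*X + repmat(t, 1, size(X, 2)));   % homogeneous projection
dehom = @(x) x(1:2, :) ./ repmat(x(3, :), 2, 1);           % perspective division

obs = dehom(proj(R, t, X));              % observed 2D positions (noise-free)

% Total squared reprojection error for a given camera and point estimate.
reproj_err = @(R, t, X) sum(sum((dehom(proj(R, t, X)) - obs).^2));

e_true  = reproj_err(R, t, X);                      % zero at the true solution
e_guess = reproj_err(R, t + [0.1; 0; 0], X + 0.05); % positive for a rough guess
```

Because the error is nonlinear in the unknowns, an optimizer started from a poor guess can settle in a local minimum, which is why the initialization from the two-view reconstruction matters.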

Descriptor: ZNCC (Zero-mean Normalized Cross-Correlation)
• Invariant to linear radiometric changes
• More conservative than others, such as sum of absolute or square differences, in uniform regions
• More tolerant in textured areas where noise might be important
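A minimal sketch of the ZNCC score between two equally sized patches, with hypothetical patch values. It illustrates the first bullet: a gain and offset change of the intensities leaves the score at 1.

```matlab
% Zero-mean, unit-norm version of a patch, flattened to a column vector.
zn   = @(a) (a(:) - mean(a(:))) / norm(a(:) - mean(a(:)));
% ZNCC is the dot product of the two normalized patches, in [-1, 1].
zncc = @(a, b) zn(a)' * zn(b);

p = [1 2 3; 4 5 6; 7 8 10];     % hypothetical 3 x 3 patch
q = 2*p + 30;                   % same patch after a linear radiometric change

s_linear   = zncc(p, q);        % close to  1: invariant to gain and offset
s_inverted = zncc(p, -p);       % close to -1: contrast-inverted patch
```

For a perfectly uniform patch the norm in the denominator is zero and the score is undefined (NaN), which is one reason propagation is restricted to matchable areas.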

Seed for propagation

Matching Propagation (propagate.m)
• Maintain a priority queue Q
• Initialize: put all seeds into Q with their ZNCC values as scores
• For each iteration:
– Pop the match with the best ZNCC score from Q
– Add new potential matches in its immediate spatial neighborhood into Q
• Safety: handle uniqueness, and propagate only on matchable area
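The steps above can be sketched on a toy example. This is not the code in propagate.m: it is a simplified illustration in which the queue is a plain matrix searched with max, each neighbour is tried only with the same offset in both images, and the matchable-area check is omitted. The images, the seed, the 3 x 3 window, and the 0.8 acceptance threshold are all assumptions.

```matlab
% Toy images: I2 is I1 shifted right by 2 pixels (hypothetical data).
[cx, cy] = meshgrid(1:12, 1:10);
I1 = sin(0.9*cx) + cos(1.3*cy) + 0.1*cx.*cy;
I2 = circshift(I1, [0 2]);

w = 1;                                            % half window size (3 x 3 patches)
patch = @(I, r, c) I(r-w:r+w, c-w:c+w);
zn    = @(a) (a(:) - mean(a(:))) / norm(a(:) - mean(a(:)));
zncc  = @(a, b) zn(a)' * zn(b);

% Queue rows: [r1 c1 r2 c2 score]. One seed match, assumed to be given.
Q = [5 5 5 7 zncc(patch(I1, 5, 5), patch(I2, 5, 7))];
used1 = false(size(I1));  used1(5, 5) = true;     % uniqueness bookkeeping
used2 = false(size(I2));  used2(5, 7) = true;
matches = zeros(0, 4);

while ~isempty(Q)
    [~, k] = max(Q(:, 5));                        % pop the best-scoring match
    m = Q(k, :);  Q(k, :) = [];
    matches(end+1, :) = m(1:4);
    for dr = -1:1
        for dc = -1:1
            a = m(1:2) + [dr dc];  b = m(3:4) + [dr dc];
            % Skip candidates whose patch would leave either image.
            if any(a <= w) || any(b <= w) ...
                    || a(1) > size(I1, 1) - w || a(2) > size(I1, 2) - w ...
                    || b(1) > size(I2, 1) - w || b(2) > size(I2, 2) - w
                continue;
            end
            if used1(a(1), a(2)) || used2(b(1), b(2)), continue; end
            s = zncc(patch(I1, a(1), a(2)), patch(I2, b(1), b(2)));
            if s > 0.8                            % accept and enqueue the new match
                Q(end+1, :) = [a b s];
                used1(a(1), a(2)) = true;  used2(b(1), b(2)) = true;
            end
        end
    end
end
```

Starting from the single seed, the matches grow outward until every pixel whose window fits inside both images has been matched.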

Matchable Area: the area with maximal gradient > threshold
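The slide does not define "maximal gradient" precisely. The sketch below assumes one reasonable reading: for each pixel, take the largest absolute intensity difference to its four direct neighbours and compare it with a threshold. The test image and the threshold value are hypothetical.

```matlab
% Hypothetical image: flat on the left, flat on the right, one vertical step edge.
I = zeros(6, 8);
I(:, 5:8) = 10;

dh = abs(I(:, 2:end) - I(:, 1:end-1));     % differences between horizontal neighbours
dv = abs(I(2:end, :) - I(1:end-1, :));     % differences between vertical neighbours

% Largest absolute difference to any of the four direct neighbours.
g = zeros(size(I));
g(:, 1:end-1) = max(g(:, 1:end-1), dh);    % right neighbour
g(:, 2:end)   = max(g(:, 2:end),   dh);    % left neighbour
g(1:end-1, :) = max(g(1:end-1, :), dv);    % neighbour below
g(2:end, :)   = max(g(2:end, :),   dv);    % neighbour above

threshold = 1;                             % assumed threshold
matchable = g > threshold;                 % true only next to the edge
```

Uniform regions are excluded because ZNCC carries no reliable information there.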

Result (dense.Math/run.m)

Triangulation
[Figure: Images 1, 2, and 3 with camera poses (R1, t1), (R2, t2), (R3, t3)]

Final Result

Colorize the Point Cloud