
3D Computer Vision and Video Computing
3D Vision
CSC I6716, Fall 2009
Topic 1 of Part II: Camera Models
Zhigang Zhu, City College of New York
zhu@cs.ccny.cuny.edu

3D Vision
• Closely Related Disciplines
  - Image Processing – images to images
  - Computer Graphics – models to images
  - Computer Vision – images to models
  - Photogrammetry – obtaining accurate measurements from images
• What is 3-D (three-dimensional) Vision?
  - Motivation: making computers see (the 3D world as humans do)
  - Computer Vision: 2D images to 3D structure
  - Applications: robotics / VR / image-based rendering / 3D video
• Lectures on 3-D Vision Fundamentals
  - Camera Geometric Models (3 lectures)
  - Camera Calibration (3 lectures)
  - Stereo (4 lectures)
  - Motion (4 lectures)

Lecture Outline
• Geometric Projection of a Camera
  - Pinhole camera model
  - Perspective projection
  - Weak-Perspective Projection
• Camera Parameters
  - Intrinsic Parameters: define mapping from 3D to 2D
  - Extrinsic parameters: define viewpoint and viewing direction
  - Basic Vector and Matrix Operations, Rotation
• Camera Models Revisited
  - Linear Version of the Projection Transformation Equation
  - Perspective Camera Model
  - Weak-Perspective Camera Model
  - Affine Camera Model
  - Camera Model for Planes
• Summary

Lecture Assumptions
• Camera Geometric Models
  - Knowledge about 2D and 3D geometric transformations
  - Linear algebra (vector, matrix)
  - This lecture is only about geometry
• Goal
  - Build up the relation between 2D images and 3D scenes
  - 3D Graphics (rendering): from 3D to 2D
  - 3D Vision (stereo and motion): from 2D to 3D
  - Calibration: determining the parameters for the mapping

Image Formation
[Figure labels: light (energy) source; surface; pinhole lens; imaging plane; camera: spec & pose; 3D scene; 2D image. Stages: World, Optics, Sensor, Signal]

Pinhole Camera Model
• The pinhole model is the basis for most graphics and vision
  - Derived from the physical construction of early cameras
  - Mathematics is very straightforward
• 3D world projected to 2D image
  - Image inverted, size reduced
  - Image is a 2D plane: no direct depth information
• Perspective projection
  - f is called the focal length of the lens
  - For a given image size, changing f changes the FOV and the size of imaged figures

Focal Length, FOV
• Consider case with object on the optical axis
  [Figure: viewpoint, image plane at distance f, object at depth z; part of the scene falls out of view]
• Optical axis: the direction of imaging
• Image plane: a plane perpendicular to the optical axis
• Center of Projection (pinhole), focal point, viewpoint, nodal point
• Focal length: distance from focal point to the image plane
• FOV: Field of View – viewing angles in horizontal and vertical directions
• Increasing f will enlarge figures, but decrease FOV
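
The trade-off between focal length and field of view can be checked numerically. The sketch below assumes an ideal pinhole and a sensor of a given width centred on the optical axis, so that the full viewing angle is 2·atan(w / (2f)); the function name and the 36 mm sensor width are illustrative assumptions, not values from the slides.

```python
import math

def field_of_view_deg(sensor_size_mm, focal_length_mm):
    """Full viewing angle (degrees) of an ideal pinhole camera along one image axis."""
    return math.degrees(2.0 * math.atan(sensor_size_mm / (2.0 * focal_length_mm)))

# Hypothetical 36 mm wide sensor: each doubling of f narrows the FOV.
for f in (18.0, 36.0, 72.0):
    print(f"f = {f:5.1f} mm  ->  horizontal FOV = {field_of_view_deg(36.0, f):6.2f} deg")
```

The same function applies to the vertical FOV by passing the sensor height instead of its width.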

Equivalent Geometry
• Consider case with object on the optical axis
  [Figure: image plane behind the pinhole at distance f, object at depth z]
• More convenient with upright image
  [Figure: projection plane in front of the pinhole at z = f]
• Equivalent mathematically

Perspective Projection
• Compute the image coordinates of p in terms of the world (camera) coordinates of P.
  [Figure: camera frame with origin O and axes X, Y, Z; scene point P(X, Y, Z); image point p(x, y) on the plane Z = f]
• Origin of camera at center of projection
• Z axis along optical axis
• Image plane at Z = f; x // X and y // Y
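
With the setup on this slide (origin at the center of projection, image plane at Z = f, image axes parallel to X and Y), similar triangles give x = f·X/Z and y = f·Y/Z. A minimal sketch of that mapping follows; the function name and the sample point are assumptions made for illustration.

```python
def project_perspective(P, f):
    """Map a camera-frame point P = (X, Y, Z) to image coordinates (x, y) on the plane Z = f."""
    X, Y, Z = P
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    return (f * X / Z, f * Y / Z)

# Hypothetical scene point 10 units in front of a camera with f = 0.5.
print(project_perspective((1.0, 2.0, 10.0), f=0.5))
```

The division by Z is what makes the mapping non-linear in the camera coordinates.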

Reverse Projection
• Given a center of projection and the image coordinates of a point, it is not possible to recover the 3D depth of the point from a single image.
• In general, at least two images of the same point taken from two different locations are required to recover depth.
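
The depth ambiguity can be made concrete: every scene point on the viewing ray through an image point projects to that same image point. The sketch below assumes the projection x = f·X/Z, y = f·Y/Z with f = 1; the helper names and sample values are illustrative assumptions.

```python
def project(P, f=1.0):
    """Perspective projection of a camera-frame point P = (X, Y, Z)."""
    X, Y, Z = P
    return (f * X / Z, f * Y / Z)

def point_on_ray(p, depth, f=1.0):
    """Scene point at the given depth Z on the viewing ray through image point p = (x, y)."""
    x, y = p
    return (x * depth / f, y * depth / f, depth)

p = (0.2, -0.1)
candidates = [point_on_ray(p, Z) for Z in (1.0, 5.0, 50.0)]
for P in candidates:
    # Three different 3D points, one and the same image point.
    print(P, "->", project(P))
```

A second view from a different location yields a second ray, and the intersection of the two rays fixes the depth.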

Pinhole camera image
Amsterdam: what do you see in this picture?
  ✓ straight line
  ✗ size
  ✗ parallelism/angle
  ✗ shape
  ✓ shape of planes parallel to image
  ? depth: stereo, motion, size, structure …
- We see spatial shapes rather than individual pixels
- Knowledge: top-down vision belongs to humans
- Stereo & Motion: most successful in 3D CV & applications
- You can see it but you don't know how…
Photo by Robert Kosara, robert@kosara.net
http://www.kosara.net/gallery/pinholeamsterdam/pic01.html

Yet other pinhole camera images
• Rabbit or Man?
• 2D projections are not the "same" as the real objects we usually see every day!
Markus Raetz, Metamorphose II, 1991-92, cast iron, 15 1/4 x 12 inches
Fine Art Center University Gallery, Sep 15 – Oct 26

It's real!

Weak Perspective Projection
• Average depth Z is much larger than the relative distance between any two scene points measured along the optical axis
  [Figure: camera frame with origin O and axes X, Y, Z; scene point P(X, Y, Z); image point p(x, y) on the plane Z = f]
• A sequence of two transformations
  - Orthographic projection: parallel rays
  - Isotropic scaling: f/Z
• Linear Model
  - Preserves angles and shapes
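
The two-step recipe above (orthographic projection, then one common scale factor f divided by the average depth) can be compared against full perspective projection. This is a sketch under the slide's assumption that the depth range is small relative to the average depth; the point set and function names are hypothetical.

```python
def project_perspective(P, f):
    X, Y, Z = P
    return (f * X / Z, f * Y / Z)

def project_weak_perspective(points, f):
    """Orthographic projection followed by isotropic scaling by f / Z_avg."""
    z_avg = sum(P[2] for P in points) / len(points)
    scale = f / z_avg
    return [(scale * X, scale * Y) for X, Y, _ in points]

# Hypothetical object: depth range 0.2 around an average depth of 100.
points = [(1.0, 1.0, 99.9), (-1.0, 1.0, 100.1), (1.0, -1.0, 100.1), (-1.0, -1.0, 99.9)]
f = 1.0
weak = project_weak_perspective(points, f)
full = [project_perspective(P, f) for P in points]
max_err = max(abs(w[0] - p[0]) + abs(w[1] - p[1]) for w, p in zip(weak, full))
print("largest difference between the two models:", max_err)
```

Because every point shares the same scale factor, the weak-perspective map is linear in X and Y, which is why it preserves angles and shapes within the object.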

Camera Parameters
• Coordinate Systems
  - Frame coordinates (xim, yim) in pixels
  - Image coordinates (x, y) in mm
  - Camera coordinates (X, Y, Z)
  - World coordinates (Xw, Yw, Zw)
• Camera Parameters
  - Intrinsic parameters (of the camera and the frame grabber): link the frame coordinates of an image point with its corresponding camera coordinates
  - Extrinsic parameters: define the location and orientation of the camera coordinate system with respect to the world coordinate system
[Figure: frame grabber image frame with axes xim, yim; camera frame O-XYZ with image point p(x, y) (pose / camera); world frame Xw, Yw, Zw with scene point P, Pw (object / world)]

Intrinsic Parameters (I)
• From image to frame
  - Image center
  - Directions of axes
  - Pixel size
• From 3D to 2D
  - Perspective projection
• Intrinsic Parameters
  - (ox, oy): image center (in pixels)
  - (sx, sy): effective size of the pixel (in mm)
  - f: focal length
[Figure: camera frame O-XYZ with image point p(x, y, f); frame axes xim, yim with origin (0, 0), image center (ox, oy), a pixel (xim, yim) and pixel size (sx, sy)]
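
The image-to-frame step can be sketched as follows. The slide's own equation did not survive in this transcript, so the sign convention used here, x = -(xim - ox)·sx and y = -(yim - oy)·sy (image and frame axes pointing in opposite directions), is an assumption borrowed from a common textbook treatment; the function names and sensor values are hypothetical.

```python
def frame_to_image(xim, yim, ox, oy, sx, sy):
    """Pixel (frame) coordinates -> image coordinates in mm (assumed sign convention)."""
    return (-(xim - ox) * sx, -(yim - oy) * sy)

def image_to_frame(x, y, ox, oy, sx, sy):
    """Inverse mapping: image coordinates in mm -> pixel (frame) coordinates."""
    return (ox - x / sx, oy - y / sy)

# Hypothetical 640 x 480 sensor with 0.01 mm square pixels.
ox, oy, sx, sy = 320.0, 240.0, 0.01, 0.01
x, y = frame_to_image(420.0, 200.0, ox, oy, sx, sy)
print((x, y), image_to_frame(x, y, ox, oy, sx, sy))
```

If the frame axes point the same way as the image axes in a given setup, the minus signs simply disappear.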

Intrinsic Parameters (II)
• Lens Distortions
• Modeled as simple radial distortions
  - $r^2 = x_d^2 + y_d^2$
  - (xd, yd): distorted points
  - k1, k2: distortion coefficients
  - A model with k2 = 0 is still accurate for a CCD sensor of 500 x 500 with ~5 pixels distortion on the outer boundary
[Figure: undistorted point (x, y), distorted point (xd, yd), coefficients k1, k2]
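
A simple radial model of this kind is often written as x = xd·(1 + k1·r² + k2·r⁴), y = yd·(1 + k1·r² + k2·r⁴). The exact form on the slide is not recoverable from this transcript, so the sketch below uses that common form as an assumption; the coefficient values are hypothetical.

```python
def undistort(xd, yd, k1, k2=0.0):
    """Map distorted image coordinates (xd, yd) to undistorted (x, y), assumed radial model."""
    r2 = xd * xd + yd * yd
    factor = 1.0 + k1 * r2 + k2 * r2 * r2
    return (xd * factor, yd * factor)

print(undistort(0.0, 0.0, k1=0.05))   # the image centre is unaffected
print(undistort(2.0, 1.0, k1=0.05))   # displacement grows with distance from the centre
```

Leaving k2 at its default of 0 corresponds to the one-coefficient model mentioned on the slide.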

Extrinsic Parameters
• From World to Camera
• Extrinsic Parameters
  - A 3-D translation vector, T, describing the relative locations of the origins of the two coordinate systems (what is it?)
  - A 3x3 rotation matrix, R, an orthogonal matrix that brings the corresponding axes of the two systems onto each other
[Figure: frame axes xim, yim; camera frame O-XYZ with image point p; world frame Xw, Yw, Zw with scene point P, Pw; translation T between the two origins]
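
A minimal sketch of the world-to-camera step, assuming the form P = R·Pw + T that appears later in the lecture (some texts instead write R·(Pw - T), which changes what T means). The helper names and the sample pose are hypothetical.

```python
import math

def mat_vec(R, v):
    """Product of a 3x3 matrix with a 3-vector."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

def world_to_camera(Pw, R, T):
    """Camera coordinates P = R Pw + T (one common convention; assumed here)."""
    RP = mat_vec(R, Pw)
    return [RP[i] + T[i] for i in range(3)]

# Hypothetical pose: 90-degree rotation about the Z axis, then a shift of 5 along Z.
c, s = math.cos(math.pi / 2), math.sin(math.pi / 2)
R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
T = [0.0, 0.0, 5.0]
print(world_to_camera([1.0, 0.0, 0.0], R, T))
```

Under this convention T is the position of the world origin expressed in camera coordinates.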

Linear Algebra: Vector and Matrix
• A point as a 2D/3D vector
  - Image point: 2D vector
  - Scene point: 3D vector
  - Translation: 3D vector
  - T: transpose
• Vector Operations
  - Addition: translation of a 3D vector
  - Dot product (a scalar): $a \cdot b = |a||b|\cos\theta$
  - Cross product (a vector): generates a new vector that is orthogonal to both of them
    $a \times b = (a_2 b_3 - a_3 b_2)\,i + (a_3 b_1 - a_1 b_3)\,j + (a_1 b_2 - a_2 b_1)\,k$
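
The two products can be checked directly from their component formulas. This is a small standard-library sketch; the function names and sample vectors are chosen for illustration.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def angle_between(a, b):
    """Angle theta recovered from a . b = |a||b| cos(theta), in radians."""
    cos_theta = dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))
    return math.acos(max(-1.0, min(1.0, cos_theta)))  # clamp against rounding

a, b = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
n = cross(a, b)
# The cross product is orthogonal to both inputs, so both dot products vanish.
print(n, dot(n, a), dot(n, b))
```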

Linear Algebra: Vector and Matrix
• Rotation: 3x3 matrix
  - Orthogonal: $R R^T = R^T R = I$
  - 9 elements => 3+3 constraints (orthogonal/cross) => 2+2 constraints (unit vectors) => 3 DOF? (degrees of freedom, orthogonal/dot)
  - How to generate R from three angles? (next few slides)
• Matrix Operations
  - R Pw + T = ?
  - Points in the world are projected on three new axes (of the camera system) and translated to a new origin

Rotation: from Angles to Matrix
• Rotation around the Axes
  - Result of three consecutive rotations around the coordinate axes
• Notes:
  - Only three rotations
  - Every time around one axis
  - Bring corresponding axes to each other: Xw = X, Yw = Y, Zw = Z
  - First step (e.g.): bring Xw to X
[Figure: camera axes X, Y, Z and world axes Xw, Yw, Zw at origin O, with rotation angles α about Xw, β about Yw, γ about Zw]

Rotation: from Angles to Matrix
• Rotation γ around the Zw Axis
  - Rotate in the XwOYw plane
  - Goal: bring Xw to X
  - But X is not in XwOYw
  - Rotate so that Yw ⊥ X: then X is in XwOZw (Yw ⊥ XwOZw) and Yw is in YOZ (X ⊥ YOZ)
  - Zw does not change
  - Next: rotation around Yw
[Figure: camera axes X, Y, Z and world axes Xw, Yw, Zw at origin O, with angle γ about Zw]

Rotation: from Angles to Matrix
• Rotation β around the Yw Axis
  - Rotate in the XwOZw plane so that Xw = X
  - Zw in YOZ (& Yw in YOZ)
  - Yw does not change
[Figure: camera axes X, Y, Z and world axes Xw, Yw, Zw at origin O, with angle β about Yw]

Rotation: from Angles to Matrix
• Rotation α around the Xw (X) Axis
  - Rotate in the YwOZw plane so that Yw = Y, Zw = Z (& Xw = X)
  - Xw does not change
[Figure: camera axes X, Y, Z and world axes Xw, Yw, Zw at origin O, with angle α about Xw]

Rotation: from Angles to Matrix (Appendix A.9 of the textbook)
• Rotation around the Axes
  - Result of three consecutive rotations around the coordinate axes
• Notes:
  - Rotation directions
  - The order of multiplications matters: γ, β, α
  - Same R, 6 different sets of α, β, γ
  - R: non-linear function of α, β, γ
  - R is orthogonal
  - It's easy to compute angles from R
[Figure: camera axes X, Y, Z and world axes Xw, Yw, Zw at origin O, with angles α about Xw, β about Yw, γ about Zw]
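
The notes above can be illustrated by composing three elementary rotations. The matrix on the slide is not recoverable from this transcript, so this sketch assumes right-handed elementary rotations applied in the order γ (about Z), then β (about Y), then α (about X), i.e. R = Rx(α)·Ry(β)·Rz(γ) acting on column vectors; the function names are hypothetical.

```python
import math

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(b):
    c, s = math.cos(b), math.sin(b)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(g):
    c, s = math.cos(g), math.sin(g)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_from_angles(alpha, beta, gamma):
    """Assumed order: rotate about Z by gamma first, then Y by beta, then X by alpha."""
    return mat_mul(rot_x(alpha), mat_mul(rot_y(beta), rot_z(gamma)))

def is_orthogonal(R, tol=1e-12):
    RRt = mat_mul(R, transpose(R))
    return all(abs(RRt[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(3) for j in range(3))

R = rotation_from_angles(0.1, 0.2, 0.3)
R_other_order = mat_mul(rot_z(0.3), mat_mul(rot_y(0.2), rot_x(0.1)))
print(is_orthogonal(R), R != R_other_order)
print("beta recovered from R:", math.asin(R[0][2]))
```

With this particular convention the entry R[0][2] equals sin β, which is one example of how angles can be read back from R; a different multiplication order gives a different matrix for the same three angles.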

Rotation: Axis and Angle (Appendix A.9 of the textbook)
• According to Euler's Theorem, any 3D rotation can be described by a rotation angle, θ, around an axis defined by a unit vector n = [n1, n2, n3]^T.
• Three degrees of freedom – why?
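
One standard way to turn an axis and angle into a matrix is Rodrigues' formula, R = cos θ·I + (1 - cos θ)·n nᵀ + sin θ·[n]×, where [n]× is the skew-symmetric cross-product matrix of n. The slides do not show a formula, so this sketch is offered as an illustration rather than as the lecture's own derivation; the function names are hypothetical.

```python
import math

def rotation_from_axis_angle(n, theta):
    """Rodrigues' formula: R = cos(t) I + (1 - cos(t)) n n^T + sin(t) [n]_x for unit n."""
    norm = math.sqrt(sum(c * c for c in n))
    n1, n2, n3 = (c / norm for c in n)   # normalise so only the direction matters
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [c + v * n1 * n1,      v * n1 * n2 - s * n3, v * n1 * n3 + s * n2],
        [v * n2 * n1 + s * n3, c + v * n2 * n2,      v * n2 * n3 - s * n1],
        [v * n3 * n1 - s * n2, v * n3 * n2 + s * n1, c + v * n3 * n3],
    ]

def mat_vec(R, p):
    return [sum(R[i][j] * p[j] for j in range(3)) for i in range(3)]

R = rotation_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
print(mat_vec(R, [1.0, 0.0, 0.0]))   # approximately [0, 1, 0]
```

The unit-length constraint leaves the axis two free parameters, and the angle adds one, which accounts for the three degrees of freedom.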

Linear Version of Perspective Projection
• World to Camera
  - Camera: P = (X, Y, Z)^T
  - World: Pw = (Xw, Yw, Zw)^T
  - Transform: R, T
• Camera to Image
  - Camera: P = (X, Y, Z)^T
  - Image: p = (x, y)^T
  - Not linear equations
• Image to Frame
  - Neglecting distortion
  - Frame: (xim, yim)^T
• World to Frame
  - (Xw, Yw, Zw)^T -> (xim, yim)^T
  - Effective focal lengths: fx = f/sx, fy = f/sy
  - The three (f, sx, sy) are not independent

Linear Matrix Equation of Perspective Projection
• Projective Space
  - Add fourth coordinate: Pw = (Xw, Yw, Zw, 1)^T
  - Define (x1, x2, x3)^T such that x1/x3 = xim, x2/x3 = yim
• 3x3 Matrix Mint
  - Only intrinsic parameters
  - Camera to frame
• 3x4 Matrix Mext
  - Only extrinsic parameters
  - World to camera
• Simple matrix product! Projective matrix M = Mint Mext
  - (Xw, Yw, Zw)^T -> (xim, yim)^T
  - Linear transform from projective space to projective plane
  - M defined up to a scale factor – 11 independent entries
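
The matrix product can be sketched numerically. The entries of Mint and Mext are not recoverable from this transcript, so the layout below is an assumption: Mint = [[-fx, 0, ox], [0, -fy, oy], [0, 0, 1]] (minus signs from one common frame-axis convention) and Mext = [R | T] with camera coordinates P = R·Pw + T. All names and numbers are hypothetical.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def intrinsic_matrix(fx, fy, ox, oy):
    """3x3 M_int; the minus signs follow the assumed frame-axis convention."""
    return [[-fx, 0.0, ox], [0.0, -fy, oy], [0.0, 0.0, 1.0]]

def extrinsic_matrix(R, T):
    """3x4 M_ext = [R | T], assuming camera coordinates P = R Pw + T."""
    return [list(R[i]) + [T[i]] for i in range(3)]

def project(M, Pw):
    """Apply the 3x4 matrix M to (Xw, Yw, Zw, 1)^T and divide by the third coordinate."""
    Ph = list(Pw) + [1.0]
    x1, x2, x3 = (sum(m * c for m, c in zip(row, Ph)) for row in M)
    return (x1 / x3, x2 / x3)

# Hypothetical camera: identity rotation, world origin 10 units in front of the camera.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
T = [0.0, 0.0, 10.0]
M = mat_mul(intrinsic_matrix(800.0, 800.0, 320.0, 240.0), extrinsic_matrix(R, T))
M_scaled = [[3.0 * m for m in row] for row in M]
# Scaling M leaves the pixel unchanged: M is defined only up to a scale factor.
print(project(M, (1.0, 2.0, 0.0)), project(M_scaled, (1.0, 2.0, 0.0)))
```

A 3x4 matrix has 12 entries; removing the free overall scale leaves the 11 independent entries mentioned on the slide.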

Three Camera Models
• Perspective Camera Model
  - Making some assumptions
    - Known center: ox = oy = 0
    - Square pixel: sx = sy = 1
  - 11 independent entries <-> 7 parameters
• Weak-Perspective Camera Model
  - Average distance Z >> range δZ
  - Define centroid vector Pw
  - 8 independent entries
• Affine Camera Model
  - Mathematical generalization of weak perspective
  - Doesn't correspond to a physical camera
  - But simple equation and appealing geometry
    - Doesn't preserve angles BUT preserves parallelism
  - 8 independent entries

Camera Models for a Plane
• A Plane in the World
  - Planes are very common in the man-made world
  - One more constraint for all points: Zw is a function of Xw and Yw
• Special case: Ground Plane
  - Zw = 0
  - Pw = (Xw, Yw, 0, 1)^T
  - 3D point -> 2D point
• Projective Model of Zw = 0
  - 8 independent entries
• General Form?
  - nz = 1
  - 8 independent entries
• 2D (xim, yim) -> 3D (Xw, Yw, Zw)?
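
For the ground-plane case, setting Zw = 0 means the third column of the 3x4 projection matrix never contributes, leaving a 3x3 matrix (8 independent entries up to scale) between plane coordinates and pixels. The sketch below illustrates this and, as one way to read the slide's closing question, inverts the 3x3 matrix to go from a pixel back to the plane. The matrix values and function names are hypothetical.

```python
def plane_homography(M):
    """Drop the third column of the 3x4 matrix M: for Zw = 0 it never contributes."""
    return [[row[0], row[1], row[3]] for row in M]

def apply_h(H, u, v):
    """Apply a 3x3 matrix to (u, v, 1)^T and divide by the third coordinate."""
    x1, x2, x3 = (row[0] * u + row[1] * v + row[2] for row in H)
    return (x1 / x3, x2 / x3)

def inverse_3x3(A):
    """Inverse via the adjugate; raises if A is singular."""
    (a, b, c), (d, e, f), (g, h, i) = A
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("singular matrix")
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]

# Hypothetical 3x4 projection matrix (any full camera matrix would do).
M = [[-800.0, 0.0, 320.0, 3200.0],
     [0.0, -692.8, 520.0, 2400.0],
     [0.0, 0.5, 0.866, 10.0]]
H = plane_homography(M)
xim, yim = apply_h(H, 2.0, 3.0)              # ground point (Xw, Yw, 0) -> pixel
Xw, Yw = apply_h(inverse_3x3(H), xim, yim)   # pixel -> ground point
print((xim, yim), (Xw, Yw))
```

The inversion works only because the point is known to lie on the plane; without that constraint a single pixel still corresponds to a whole viewing ray.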

• Homework #3 online, due October 26 before class
  - Projection equation of a plane

Applications and Issues
• Graphics / Rendering
  - From 3D world to 2D image
    - Changing viewpoints and directions
    - Changing focal length
  - Fast rendering algorithms
• Vision / Reconstruction
  - From 2D image to 3D model
    - Inverse problem
    - Much harder / unsolved
  - Robust algorithms for matching and parameter estimation
  - Need to estimate camera parameters first
• Calibration
  - Find intrinsic & extrinsic parameters
  - Given image-world point pairs
  - Probably a partially solved problem?
  - 11 independent entries <-> 10 parameters: fx, fy, ox, oy, α, β, γ, Tx, Ty, Tz

Camera Model Summary
• Geometric Projection of a Camera
  - Pinhole camera model
  - Perspective projection
  - Weak-Perspective Projection
• Camera Parameters (10 or 11)
  - Intrinsic parameters: f, ox, oy, sx, sy, k1: 4 or 5 independent parameters
  - Extrinsic parameters: R, T – 6 DOF (degrees of freedom)
• Linear Equations of Camera Models (without distortion)
  - General Projection Transformation Equation: 11 parameters
  - Perspective Camera Model: 11 parameters
  - Weak-Perspective Camera Model: 8 parameters
  - Affine Camera Model (generalization of weak-perspective): 8 parameters
  - Projective transformation of planes: 8 parameters

Next
• Determining the values of the extrinsic and intrinsic parameters of a camera
Calibration (Ch. 6)