Computational Vision CSCI 363 Spring 2021 Lecture 28
















- Slides: 16

Structure from motion

Announcements
• Read Ch. 10.2.1 – 10.2.2
• Project proposals due now
• Exam 4 will be on Monday, April 26
• Exam 3: high score 48 (96%), median score 42 (84%)

Evidence that MT processes motion
1. Cells in MT prefer moving stimuli to static stimuli.
2. Lesions of MT cause loss of the ability to discriminate motion direction. Newsome et al. performed an experiment to test this in monkeys.
Stimulus: Moving dots. Some percentage move in a coherent direction (correlated dots); the rest move in random directions (noise).
Task: Judge the direction of motion (e.g. up vs. down).
Measure: Percent correlation needed to discriminate the directions of motion.
Result: After a lesion of MT, monkeys require a greater percentage of correlated dots to make the discrimination (i.e. they were worse at the motion task).
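The correlated-dot stimulus described above can be sketched in a few lines. This is only an illustration, not the stimulus code used in the experiment: the function names, dot count, speed, and the 20% coherence value are all assumptions.

```python
import math
import random

def make_dot_directions(n_dots, coherence, signal_direction, rng):
    """Return one motion direction (radians) per dot.

    A fraction `coherence` of the dots moves in `signal_direction`
    (the correlated dots); the rest get independent random directions (noise).
    """
    n_signal = round(n_dots * coherence)
    directions = [signal_direction] * n_signal
    directions += [rng.uniform(0.0, 2.0 * math.pi)
                   for _ in range(n_dots - n_signal)]
    rng.shuffle(directions)  # do not group the signal dots by index
    return directions

def step_dots(positions, directions, speed):
    """Advance each (x, y) dot by `speed` along its own direction."""
    return [(x + speed * math.cos(a), y + speed * math.sin(a))
            for (x, y), a in zip(positions, directions)]

rng = random.Random(0)          # fixed seed so the example is repeatable
UP = math.pi / 2                # assumed signal direction: upward
directions = make_dot_directions(n_dots=100, coherence=0.2,
                                 signal_direction=UP, rng=rng)
positions = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in directions]
moved = step_dots(positions, directions, speed=0.05)
```

Lowering `coherence` makes the task harder; the measure in the experiment is the smallest coherence at which the direction can still be judged reliably.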

Microstimulation experiments
Another piece of evidence that MT is involved in motion perception comes from experiments in which MT cells are electrically stimulated with a micro-electrode (microstimulation). Salzman & Newsome (1994) showed that they could influence a monkey's perception of motion by stimulating cells in MT.

2D Motion is just the Beginning
2D image motion contains information about:
• Relative depth of surfaces
• 3D motion of objects
• 3D structure of objects
• Direction of observer motion
Among other things.

Structure From Motion
Structure from motion was originally studied rigorously by Wallach and O'Connell (1953). They studied wire-frame objects and examined people's ability to judge the structure of the objects when moving. The ability to see a 3D structure from a moving 2D image is known as the Kinetic Depth Effect.
Demo: https://www.michaelbach.de/ot/mot_ske/index.html

Inherent Ambiguity
How do we compute a 3D motion from a 2D image motion? Given the 2D image motion, there are multiple possible 3D motions that could have generated it.
[Figure: image plane with 2D velocity v; the 3D motion that produced it is unknown (?).]
To solve for 3D motion from 2D image information, we must use a constraint. The rigidity constraint assumes that the object is rigid.
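A small numerical illustration of the ambiguity, assuming orthographic projection (the image simply drops the depth coordinate). The function names and the example numbers are made up for this sketch.

```python
def project_orthographic(point):
    """Orthographic projection onto the image plane: drop the depth z."""
    x, y, z = point
    return (x, y)

def image_motion(point, displacement_3d):
    """2D image displacement caused by a 3D displacement of `point`."""
    moved = tuple(p + d for p, d in zip(point, displacement_3d))
    x0, y0 = project_orthographic(point)
    x1, y1 = project_orthographic(moved)
    return (x1 - x0, y1 - y0)

start = (1.0, 2.0, 5.0)
# Three different 3D motions that differ only in their depth component.
candidates = [(0.5, 0.0, 0.0), (0.5, 0.0, 2.0), (0.5, 0.0, -3.0)]
motions = [image_motion(start, d) for d in candidates]
```

All three entries of `motions` are the same 2D vector, so the image motion of a single point cannot tell these 3D motions apart without an additional constraint.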

The Rigidity Constraint
To solve for 3D motion from 2D image information, we must use a constraint. The rigidity constraint assumes that the object is rigid.
Ullman showed that, for an orthographic projection system, one can compute the 3D structure of an object given 3 distinct views of 4 non-coplanar points in motion. If there exists a rigid 3D structure consistent with these views, then it is unique. The 2D positions of the points generate a set of equations that can be solved for the 3D structure.

The problem with noise
For perspective projection, one needs 2 views and 7 points. If working with velocity fields, one needs 5 points and 1 view (for perspective projection).
Problems:
• This approach is very sensitive to noise in the velocity estimates.
• This approach does not allow interpretation of non-rigid motions.

Information about Human ability
• The human visual system takes some time to recover the 3D structure: it is not instantaneous.
• Humans can cope with significant deviations from rigidity.
• The visual system integrates multiple sources of information (e.g. static depth cues) to determine 3D structure.

Incremental Rigidity Scheme
The basic idea:
1) At each instant, generate an estimate of the 3D structure.
2) The recovery process prefers rigid transformations, but rigidity is not required.
3) The scheme can tolerate deviations from rigidity.
4) It should be able to integrate information over an extended viewing period.
5) It should eventually recover the correct 3D structure (or a close approximation).
(x1, y1, z1) -> (x1', y1', ?)
(x2, y2, z2) -> (x2', y2', ?)
(x3, y3, z3) -> (x3', y3', ?)
Initially set z to zero.

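Maximizing Rigidity
We want to find the 3D model, S'(t), that maximizes rigidity. Therefore, we want to find new z values that minimize the change in structure.
L_ij: distance between points i and j in the current model.
l_ij: distance between points i and j in the updated model, whose x and y coordinates come from the image.
Find z_i to minimize (the form used in Ullman's scheme):
$$\sum_{i,j} \frac{(L_{ij} - l_{ij})^2}{L_{ij}^3}$$
The denominator makes sure that distant points count less, as they are less likely to be rigidly connected to each other.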
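The update step can be sketched as follows. This is a minimal illustration, not Ullman's implementation: the cost is the sum of (L_ij - l_ij)^2 / L_ij^3 over point pairs, and the minimizer is a simple coordinate search chosen here for brevity (an assumption). The names `rigidity_cost`, `update_model`, and `rotate_y`, the test object, and the 10-degree frame step are all hypothetical.

```python
import math
from itertools import combinations

def pair_distances(points):
    """Distance between every pair of 3D points, keyed by index pair."""
    return {(i, j): math.dist(points[i], points[j])
            for i, j in combinations(range(len(points)), 2)}

def rigidity_cost(model, image_xy, z):
    """Sum over pairs of (L_ij - l_ij)^2 / L_ij^3.

    L_ij comes from the current model, l_ij from the candidate model that
    combines the new image coordinates with the depths `z`.
    Assumes no two model points coincide (L_ij > 0).
    """
    candidate = [(x, y, zi) for (x, y), zi in zip(image_xy, z)]
    L = pair_distances(model)
    l = pair_distances(candidate)
    return sum((L[k] - l[k]) ** 2 / L[k] ** 3 for k in L)

def update_model(model, image_xy, step=0.5, min_step=1e-6, max_sweeps=1000):
    """Coordinate search over depths, starting from the model's current z."""
    z = [p[2] for p in model]
    best = rigidity_cost(model, image_xy, z)
    for _ in range(max_sweeps):
        if step < min_step:
            break
        improved = False
        for i in range(len(z)):
            for delta in (step, -step):
                trial = z[:]
                trial[i] += delta
                cost = rigidity_cost(model, image_xy, trial)
                if cost < best:          # accept only strict improvements
                    z, best, improved = trial, cost, True
        if not improved:
            step /= 2                    # refine the search
    return [(x, y, zi) for (x, y), zi in zip(image_xy, z)], best

def rotate_y(point, angle):
    """Rotate a 3D point about the vertical (y) axis."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

# Hypothetical rigid object; only its orthographic images are given to the scheme.
true_points = [(1.0, 0.0, 0.5), (-1.0, 0.5, -0.5),
               (0.0, 1.0, 1.0), (0.5, -1.0, -1.0)]
model = [(x, y, 0.0) for x, y, _ in true_points]   # initially set z to zero
costs = []
for frame in range(1, 37):                         # one full rotation in 10-degree steps
    view = [rotate_y(p, math.radians(10 * frame)) for p in true_points]
    model, cost = update_model(model, [(x, y) for x, y, _ in view])
    costs.append(cost)
```

Because each update starts from the previous depths and only accepts improvements, the new model is never less rigid (by this measure) than simply keeping the old depths. How closely `model` approaches the true shape depends on the object, the frame step, and the minimizer, so no particular error figure is claimed for this sketch.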

Performance
The incremental rigidity scheme works fairly well for rigid motions. For a rotating object, it can find the 3D shape to within about 2-5% error after a few rotations. Mirror reversals may occur sometimes during the computations. (This may arise from orthographic projection.) The performance is similar to human structure from motion.
[Figure: results of the scheme after rotating the object by 0°, 90°, 180°, 360°, 720°, and 1440°; axes x and z.]
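One way to see why orthographic projection permits mirror reversals: an object and its depth-mirrored copy, rotating in opposite directions, produce the same orthographic images. The check below is a sketch with a made-up shape and made-up function names.

```python
import math

def rotate_y(point, angle):
    """Rotate a 3D point about the vertical (y) axis."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

def orthographic_image(points, angle):
    """Image (x, y) coordinates of the points after rotating by `angle`."""
    return [rotate_y(p, angle)[:2] for p in points]

shape = [(1.0, 0.0, 0.5), (-1.0, 0.5, -0.5), (0.0, 1.0, 1.0)]
mirrored = [(x, y, -z) for x, y, z in shape]       # reflect in depth

angles = [math.radians(a) for a in range(0, 360, 15)]
# Largest coordinate difference between the two image sequences:
# the original rotating by +a versus the mirrored copy rotating by -a.
max_difference = max(
    abs(u - v)
    for a in angles
    for p, q in zip(orthographic_image(shape, a),
                    orthographic_image(mirrored, -a))
    for u, v in zip(p, q)
)
```

`max_difference` is zero up to floating-point rounding, so the images alone cannot distinguish the two interpretations.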

Properties of Incremental Rigidity Scheme
1. Veridicality: Usually get a reasonable approximation of the true structure.
2. Temporal extension: The time required is longer than for the 4 points, 3 views method.
3. There is some residual non-rigidity in the computed 3D structure.
4. The improvement over time is non-monotonic. There may be some increase in error at times, before it decreases again.
5. Depth reversals: Sometimes this method exhibits spontaneous depth reversals in the estimated 3D structure.
6. May converge to a local minimum that is not the most rigid interpretation.
7. It can be influenced by static depth cues.
8. It can track moderate amounts of non-rigid motion over time.
In many ways, this scheme behaves similarly to the human system.

Psychophysics of Structure from Motion
Psychophysical results show that people can judge structure-from-motion from the motion of points with limited lifetimes.
Demo of cylinder.

Psychophysics of Structure from Motion
Stimulus: Rotating cylinder composed of dots. Dots have limited lifetimes (e.g. 50, 100, 150 msec).
Result: People can see the 3D structure of the cylinder for point lifetimes as low as 125 msec. This is faster than the incremental rigidity scheme can handle.
Hildreth and colleagues developed an extension of Ullman's incremental rigidity scheme that uses velocity estimates instead of measured point positions.
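The limited-lifetime cylinder stimulus can be sketched as below. The 10 msec frame duration, dot count, rotation rate, and all names are assumptions for illustration; only the idea of dots that expire and reappear elsewhere comes from the description above.

```python
import math
import random

FRAME_MS = 10                               # assumed display frame duration
LIFETIME_MS = 100                           # one of the lifetimes mentioned above
LIFETIME_FRAMES = LIFETIME_MS // FRAME_MS

def new_dot(rng, frames_left):
    """A dot at a random position on the surface of a unit-radius cylinder."""
    return {"theta": rng.uniform(0.0, 2.0 * math.pi),   # angle around the axis
            "y": rng.uniform(-1.0, 1.0),                # height along the axis
            "frames_left": frames_left}

def step(dots, d_theta, rng):
    """Advance one frame: rotate surviving dots, replace expired ones."""
    next_dots = []
    for dot in dots:
        if dot["frames_left"] <= 1:
            # Lifetime over: the dot reappears at a new random location.
            next_dots.append(new_dot(rng, LIFETIME_FRAMES))
        else:
            next_dots.append({"theta": dot["theta"] + d_theta,
                              "y": dot["y"],
                              "frames_left": dot["frames_left"] - 1})
    return next_dots

def project(dots):
    """Orthographic image of the dots: x = sin(theta), y unchanged."""
    return [(math.sin(d["theta"]), d["y"]) for d in dots]

rng = random.Random(0)
# Random initial ages so that the dots do not all expire on the same frame.
dots = [new_dot(rng, rng.randint(1, LIFETIME_FRAMES)) for _ in range(200)]
for _ in range(50):
    dots = step(dots, d_theta=math.radians(2), rng=rng)
image = project(dots)
```

Because no dot survives longer than `LIFETIME_FRAMES` frames, a method that must track the same points over an extended period gets only short trajectories to work with, which is the difficulty noted above for the incremental rigidity scheme.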