GENERALIZED DISTANCE TRANSFORM
A linear time algorithm and its application in fitting articulated body models

OUTLINE
• Distance Transform
  ◦ Generalized Distance Transform
  ◦ Linear time algorithm for Euclidean distance
  ◦ Other distances
• Application of GDT
  ◦ Efficient matching of articulated body models

DISTANCE TRANSFORM
Defined for a set of points P on a grid G, with P a subset of G
[Figure: grid G with a grid point p and a point q of P]
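The formula on this slide did not survive extraction. A plausible reconstruction, consistent with the binary indicator function 1(q) mentioned on the generalized distance transform slide, is the standard definition below. The symbol D_P is an assumption, taken from the usual notation in the distance transform literature rather than from the slide itself.

```latex
D_P(p) \;=\; \min_{q \in P} d(p, q)
       \;=\; \min_{q \in G} \bigl( d(p, q) + 1(q) \bigr),
\qquad
1(q) =
\begin{cases}
0      & \text{if } q \in P \\
\infty & \text{otherwise}
\end{cases}
```

The second form ranges over the whole grid and is the one that generalizes when 1(q) is replaced by an arbitrary function f(q).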

EXAMPLE
[Figure: example distance transform on grid G, with points p and q marked]

EXAMPLES
• Often used in binary (edge) image matching
  ◦ Chamfer
  ◦ Hausdorff
  ◦ Hough

GENERALIZED DISTANCE TRANSFORM
• Instead of the binary indicator function 1(q), we can assign a "soft" membership of all grid elements to P
• f(q) is sampled on the grid G
• f(q) does not have to be a 2D image; it can represent any D-dimensional discrete space that encodes spatial relationships through d(p, q)
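The defining equation of this slide appears to have been an image and is missing from the text. Assuming the notation of the Distance Transforms of Sampled Functions report listed in the references (the symbol D_f is taken from there, not from the slide), the generalized transform would read:

```latex
D_f(p) \;=\; \min_{q \in G} \bigl( d(p, q) + f(q) \bigr)
```

Setting f(q) = 1(q), that is 0 on P and infinity elsewhere, recovers the ordinary distance transform.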

APPLICATIONS OF GDT
• Feature matching / tracking
  ◦ f(q) can represent a D-dimensional feature vector at location q, and d(p, q) is a displacement in the image space
• Dynamic programming / stereo matching
  ◦ f(p) can represent the accumulated cost of reaching state p, and d(p, q) is the transition cost of moving from state p to state q
  ◦ $f'(q) = b(q) + \min_p \bigl( f(p) + d(p, q) \bigr)$
• Belief propagation / MRFs
  ◦ Max-product (negative log):
    $m'_{j \to i}(x_i) = \min_{x_j} \bigl( \psi'_j(x_j) + \psi'_{ji}(x_j - x_i) + \sum_{k \in N(j) \setminus i} m'_{k \to j}(x_j) \bigr)$

WHY SO SLOW?
• The generalized DT computes, for each grid point p, the distance to all other grid points q
• Computed directly, its complexity is O(n^2) in the number n of grid locations
• Intractable for problems with a large number of discrete locations
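To make the quadratic cost concrete, here is a minimal brute-force sketch in Python for a one-dimensional grid. The function name `gdt_brute_force` and the default squared-difference distance are illustrative assumptions, not something the slides specify.

```python
import math

def gdt_brute_force(f, dist=lambda p, q: (p - q) ** 2):
    """Direct evaluation of D_f(p) = min_q (dist(p, q) + f(q)) on a 1D grid.

    Every output location p scans every input location q, so the running
    time is O(n^2) for n grid locations.
    """
    n = len(f)
    return [min(dist(p, q) + f[q] for q in range(n)) for p in range(n)]

# Two "soft" points: cost 0 at position 0 and cost 5 at position 3.
f = [0, math.inf, math.inf, 5]
print(gdt_brute_force(f))                                   # squared distance
print(gdt_brute_force(f, dist=lambda p, q: abs(p - q)))     # linear distance
```

The nested scan is exactly what the linear-time algorithm in the following slides avoids.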

MIN CONVOLUTION
Speed-up by viewing the DT as a min-convolution
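The slide's equation is missing from the extracted text. Under the assumption that the distance depends only on the offset between grid points, d(p, q) = g(p - q), the generalized distance transform can be written as a min-convolution. The operator symbol and the name g are notational assumptions.

```latex
(f \otimes g)(p) \;=\; \min_{q} \bigl( f(q) + g(p - q) \bigr),
\qquad\text{so that}\qquad
D_f \;=\; f \otimes g \quad\text{with}\quad g(p - q) = d(p, q)
```

This mirrors ordinary convolution with the sum replaced by a minimum and the product replaced by a sum.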

LOWER ENVELOPE
• Min-convolution is the lower envelope of cones placed at each p
• Example 1
  ◦ One dimension
  ◦ Euclidean distance
[Figure: cones rooted at heights f(0), f(1), f(2), f(3) over grid positions q = 0, 1, 2, 3; vertical axis f(q)]
Remember: in the case of standard distance transforms, all cones would be rooted either at zero (where there is a pixel) or at infinity (where there is no pixel)

LOWER ENVELOPE
• Example 2
  ◦ One dimension
  ◦ Squared Euclidean distance
• Once computed, the distance transform on the grid can be sampled from the lower envelope in linear time

COMPUTING THE LOWER ENVELOPE
Add a parabola at the first grid point q

COMPUTING THE LOWER ENVELOPE
Add a second parabola at the second grid point q, and compute its intersection s with the previous parabola
[Figure labels: v[1], q, s]

COMPUTING THE LOWER ENVELOPE
Insert height and intersection point in arrays v and z
[Figure labels: v[1], v[2], z[2]]

COMPUTING THE LOWER ENVELOPE
Add a third parabola at the third grid point q, and compute its intersection s with the previous parabola
[Figure labels: v[1], v[2], q, z[2], s]

COMPUTING THE LOWER ENVELOPE
Since the new intersection is to the right of the previous intersection, insert height and intersection point in arrays v and z
[Figure labels: v[1], v[2], v[3], z[2], z[3]]

COMPUTING THE LOWER ENVELOPE
Now consider the case when the new intersection is to the left of the previous intersection
[Figure labels: v[1], v[2], q, s, z[2]]

COMPUTING THE LOWER ENVELOPE
Delete the previous parabola and its intersection from arrays v and z, and compute the intersection with the last parabola remaining in array v
[Figure labels: v[1], s, q]

COMPUTING THE LOWER ENVELOPE
Now insert height and intersection point in arrays v and z
[Figure labels: v[1], z[2], v[2]]

COMPUTATIONAL COMPLEXITY
• The algorithm has two steps
  ◦ 1) Compute the lower envelope
       For each grid location:
         one insertion of a parabola and intersection point
         each inserted parabola and intersection point is deleted at most once
       Hence O(n) for n grid locations
  ◦ 2) Sample from the lower envelope: O(n)
• So the total complexity is O(n)
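The two steps can be sketched in Python roughly as follows. This is a minimal sketch based on the published description in the Distance Transforms of Sampled Functions report listed in the references, not the presenters' own code. The name `dt1d_squared` is made up for illustration. In this sketch `v` stores the grid positions of the envelope parabolas (their heights are read from `f`), and it is assumed that "no point here" is encoded as a large finite number such as 1e20 rather than infinity, so the intersection formula never produces NaN.

```python
import math

def dt1d_squared(f):
    """1D generalized distance transform under squared Euclidean distance.

    Returns d with d[q] = min over p of ((q - p)**2 + f[p]).
    Assumes all values in f are finite (use e.g. 1e20 for "no point").
    """
    n = len(f)
    if n == 0:
        return []
    v = [0] * n            # grid positions of parabolas in the lower envelope
    z = [0.0] * (n + 1)    # z[k]..z[k+1] is the range where parabola v[k] is lowest
    k = 0                  # index of the rightmost parabola in the envelope
    z[0], z[1] = -math.inf, math.inf

    # Step 1: build the lower envelope, one parabola per grid location.
    for q in range(1, n):
        while True:
            p = v[k]
            # Intersection of the parabola rooted at q with the one rooted at p.
            s = ((f[q] + q * q) - (f[p] + p * p)) / (2 * q - 2 * p)
            if s > z[k]:
                break
            k -= 1         # previous parabola is hidden by the new one: delete it
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = math.inf

    # Step 2: sample the envelope at every grid location.
    d = [0.0] * n
    k = 0
    for q in range(n):
        while z[k + 1] < q:
            k += 1
        d[q] = (q - v[k]) ** 2 + f[v[k]]
    return d

print(dt1d_squared([0, 1e20, 1e20, 5]))
```

Each grid position is pushed onto `v` once and popped at most once, which is where the O(n) bound for the first step comes from. The sampling loop only ever moves `k` forward, so it is O(n) as well.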

ARBITRARY DIMENSIONS
• Consider a 2D grid: the transform splits into an outer minimization over x' and an inner term that is the one-dimensional DT along the column indexed by x' [equation not recovered from the slide]
• Any d-dimensional DT can be performed as d passes of one-dimensional distance transforms, in O(dn) time for n grid locations
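For squared Euclidean distance the decomposition relies on (x - x')^2 + (y - y')^2 separating into one term per axis, so the 2D transform is a 1D transform along every column followed by a 1D transform along every row of the result. The sketch below illustrates this under the same assumptions as the 1D algorithm (finite values, 1e20 standing in for "no point"). The names `dt1d` and `dt2d` are hypothetical, and the 1D routine is repeated so the block runs on its own.

```python
import math

def dt1d(f):
    """1D squared-Euclidean distance transform via the lower envelope of parabolas."""
    n = len(f)
    v, z, k = [0] * n, [0.0] * (n + 1), 0
    z[0], z[1] = -math.inf, math.inf
    for q in range(1, n):
        while True:
            p = v[k]
            s = ((f[q] + q * q) - (f[p] + p * p)) / (2 * (q - p))
            if s > z[k]:
                break
            k -= 1
        k += 1
        v[k], z[k], z[k + 1] = q, s, math.inf
    out, k = [], 0
    for q in range(n):
        while z[k + 1] < q:
            k += 1
        out.append((q - v[k]) ** 2 + f[v[k]])
    return out

def dt2d(grid):
    """2D transform of a non-empty rows x cols grid: columns first, then rows."""
    columns = [dt1d(list(col)) for col in zip(*grid)]   # pass 1: along each column
    rows = [list(row) for row in zip(*columns)]         # transpose back
    return [dt1d(row) for row in rows]                  # pass 2: along each row

INF = 1e20  # assumed stand-in for "no point at this pixel"
image = [[INF, INF, INF],
         [INF, 0,   INF],
         [INF, INF, INF]]
for row in dt2d(image):
    print(row)
```

Each pass touches every grid location a constant number of times, so d passes over an n-location grid cost O(dn) in total.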

2D EXAMPLE

OTHER DISTANCES
• So far only Euclidean distances shown
• Other distances realized as a combination of linear, quadratic and box distances
  ◦ Min of any constant number of linear and quadratic functions, with or without truncation
    E.g., multiple "segments"
  ◦ Gaussian approximation with four min-convolutions using box distances
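As an illustration of the linear case, the sketch below uses the well-known two-pass scheme for d(p, q) = slope * |p - q| and then applies a truncation. It is a sketch under stated assumptions, not a transcription of the slides: the function names and the `slope` and `tau` parameters are hypothetical, and the truncation step relies on the identity min_q(min(d(p, q), tau) + f(q)) = min(D_f(p), tau + min_q f(q)).

```python
import math

def dt1d_linear(f, slope=1.0):
    """1D distance transform for d(p, q) = slope * |p - q| in two linear passes."""
    d = list(f)
    for q in range(1, len(d)):              # forward pass: best source to the left
        d[q] = min(d[q], d[q - 1] + slope)
    for q in range(len(d) - 2, -1, -1):     # backward pass: best source to the right
        d[q] = min(d[q], d[q + 1] + slope)
    return d

def truncate(d_untruncated, f, tau):
    """Transform for the truncated distance min(d(p, q), tau), given the untruncated one."""
    cap = min(f) + tau                      # cost of "jumping" from the cheapest location
    return [min(x, cap) for x in d_untruncated]

f = [3, math.inf, math.inf, 0]
d = dt1d_linear(f)
print(d)
print(truncate(d, f, tau=1.5))
```

Both steps are linear in the number of grid locations, so truncated linear costs keep the O(n) bound.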

ILLUSTRATIVE RESULTS
Borrowed from Dan Huttenlocher
• Image restoration using MRF formulation with truncated quadratic clique potentials
  ◦ Simply not practical with conventional techniques: message updates cost 256^2
• Fast quadratic min-convolution technique makes it feasible
  ◦ A multi-grid technique can speed it up further
• Powerful formulation, largely abandoned for such problems

ILLUSTRATIVE RESULTS
Borrowed from Dan Huttenlocher
• Pose detection and object recognition
  ◦ Sites are parts of an articulated object, such as limbs of a person
  ◦ Labels are locations of each part in the image
    Millions of labels; conventional quadratic time methods do not apply
  ◦ Compatibilities are spring-like

FITTING OF HUMAN BODY MODELS

THE GENERAL APPROACH
• Body parts model appearance
• Graph models deformation of linked limbs: G = (V, E), with V the set of part vertices and E the set of edges connecting vertices
• The best fit minimizes the sum of the match cost of each limb and the deformation cost of the body structure
[Equation not recovered; its terms were labeled: best configuration, match cost, deformation cost]
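The equation itself is missing from the extracted slide. Assuming the notation of the Pictorial Structures paper listed in the references (the symbols L, l_i, m_i and d_ij are taken from that paper, not from the slide text), the objective with its three labeled terms would be:

```latex
\underbrace{L^{*}}_{\text{best configuration}}
\;=\;
\arg\min_{L = (l_1, \dots, l_n)}
\Bigl(
  \underbrace{\sum_{i=1}^{n} m_i(l_i)}_{\text{match cost}}
  \;+\;
  \underbrace{\sum_{(v_i, v_j) \in E} d_{ij}(l_i, l_j)}_{\text{deformation cost}}
\Bigr)
```

Here l_i is the location of part v_i, m_i(l_i) measures how poorly the part matches the image at that location, and d_ij(l_i, l_j) penalizes deformation of the connection between parts v_i and v_j.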

DYNAMIC PROGRAMMING
• If the graph is tree-structured, we can reformulate the problem in recursive form -> Dynamic Programming (DP)
• DP is appealing because it gives a global solution (on a discretized search space)
• However, DP runs in polynomial time O(h^2 n), with n the number of parts and h the number of possible locations for each part
• h is usually huge, often hundreds of thousands, because a location is (x, y, s, θ)
  ◦ If each of x, y, s, θ has 20 discrete states, then h = 20^4 = 160,000
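A quick worked calculation shows why the quadratic factor is the problem. The part count n = 10 is a hypothetical value chosen only for illustration; the slide gives no specific n.

```latex
h = 20^{4} = 160{,}000
\qquad
h^{2} = 2.56 \times 10^{10}
\qquad
h^{2} n = 2.56 \times 10^{11} \quad (n = 10)
\qquad\text{versus}\qquad
h\,n = 1.6 \times 10^{6}
```

Replacing the per-edge O(h^2) minimization with an O(h) distance transform therefore reduces the work by a factor of h, here 160,000.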

DP FOR TREE-STRUCTURED MODELS
• Match quality for leaf nodes
• Match quality for other nodes
• Best location for root node
[Equations not recovered from the slide]
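The three quantities can be sketched as a small recursive procedure. This is a minimal brute-force sketch, assuming 1D part locations 0..h-1 and the usual formulation in which a leaf passes its parent B_j(l_i) = min over l_j of (m_j(l_j) + d_ij(l_i, l_j)), an internal node adds its children's messages to m_j before doing the same, and the root picks the location with the lowest accumulated cost. The function name `fit_tree` and all argument names are hypothetical.

```python
def fit_tree(match, children, root, deform):
    """Brute-force DP over a tree of parts.

    match[j][l]  : match cost m_j(l) of part j at location l
    children[j]  : list of child parts of part j (missing key means leaf)
    deform(a, b) : deformation cost between parent location a and child location b
    Returns (best location per part, total cost). Runs in O(h^2 n).
    """
    h = len(match[root])
    argmin = {}  # argmin[c][l_parent] = best location of child c for that parent location

    def subtree_cost(j):
        # cost[l_j] = m_j(l_j) + sum of messages B_c(l_j) from the children of j
        cost = list(match[j])
        for c in children.get(j, []):
            child_cost = subtree_cost(c)
            message, arg = [], []
            for li in range(h):
                # Inner minimization over all child locations: the O(h^2) step
                # that a generalized distance transform replaces.
                lj = min(range(h), key=lambda l: child_cost[l] + deform(li, l))
                message.append(child_cost[lj] + deform(li, lj))
                arg.append(lj)
            argmin[c] = arg
            cost = [x + y for x, y in zip(cost, message)]
        return cost

    root_cost = subtree_cost(root)
    best = {root: min(range(h), key=root_cost.__getitem__)}
    stack = [root]
    while stack:  # trace the stored argmins back down the tree
        j = stack.pop()
        for c in children.get(j, []):
            best[c] = argmin[c][best[j]]
            stack.append(c)
    return best, root_cost[best[root]]

# Toy model: part 0 is the root with children 1 and 3; part 2 hangs off part 1.
match = {0: [5, 0, 5, 9], 1: [0, 5, 5, 2], 2: [5, 5, 0, 1], 3: [1, 4, 0, 7]}
children = {0: [1, 3], 1: [2]}
print(fit_tree(match, children, 0, lambda a, b: (a - b) ** 2))
```

The result is globally optimal over the discretized locations, which is the property the slides emphasize. Only the inner minimization needs to change to obtain the faster algorithm.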

MATCH COST AS DISTANCE TRANSFORM
• Recall the generalized distance transform
• Compare to the match cost function
• Need to transform l_j into a regular grid for which d_ij serves as the distance measure
[Equations not recovered from the slide]
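The two equations being compared did not survive extraction. Placed side by side, and assuming the notation of the Pictorial Structures paper listed in the references (D_f, B_j, m_j and d_ij are that paper's symbols, not the slide's), the correspondence would look like this:

```latex
D_f(p) \;=\; \min_{q} \bigl( d(p, q) + f(q) \bigr)
\qquad\longleftrightarrow\qquad
B_j(l_i) \;=\; \min_{l_j} \bigl( d_{ij}(l_i, l_j) + m_j(l_j) \bigr)
```

The parent location l_i plays the role of p, the child location l_j the role of q, and the child's match cost m_j (plus any messages from its own children) the role of f. The match works only if d_ij behaves like a grid distance, which motivates the transformation of l_j.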

ORIGINAL BODY CONFIGURATION
• Locations of two connected parts
• Joint probability of both parts given deformation constraints
[Equations not recovered from the slide]

TRANSFORMED BODY CONFIGURATION
• Project the distribution over angles onto a 2D unit vector representation
• Now all parameters lie on a grid and are modeled as a multivariate Gaussian with zero mean and variances specified by the diagonal covariance matrix D_ij
• Distance in the grid is given as the Mahalanobis distance, with covariance D_ij, between the transformed joint locations T_ij(l_i) and T_ji(l_j)
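Written out, and ignoring constant factors that do not affect the minimization, the Mahalanobis distance described here would take the following form. The per-dimension variances sigma_k^2 (the diagonal entries of D_ij) and the index k over grid dimensions are notational assumptions.

```latex
d_{ij}(l_i, l_j)
\;=\;
\bigl( T_{ij}(l_i) - T_{ji}(l_j) \bigr)^{\top}
D_{ij}^{-1}
\bigl( T_{ij}(l_i) - T_{ji}(l_j) \bigr)
\;=\;
\sum_{k} \frac{\bigl( T_{ij}(l_i)_k - T_{ji}(l_j)_k \bigr)^{2}}{\sigma_k^{2}}
```

Because D_ij is diagonal, the distance is a sum of scaled squared differences, one per grid dimension. That is the separable form the linear-time squared Euclidean transform handles with one 1D pass per dimension.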

SUMMARY
• Now linear instead of quadratic time to compute match costs between child and parent limbs
• Did not prune away search space (still a global solution!)
• Search space only got a little bigger (about four times) due to the unit vector representation of limb orientation
  ◦ 32 discrete angles represented in an 11 x 11 grid (121 cells, roughly four times 32)

REFERENCES
• Daniel Huttenlocher
  ◦ http://www.cs.cornell.edu/~dph/
• Pedro Felzenszwalb
  ◦ http://people.cs.uchicago.edu/~pff/
• Distance Transforms of Sampled Functions. Pedro F. Felzenszwalb and Daniel P. Huttenlocher. Cornell Computing and Information Science TR2004-1963.
• Pictorial Structures for Object Recognition. Pedro F. Felzenszwalb and Daniel P. Huttenlocher. International Journal of Computer Vision, 61(1), pp. 55-79, January 2005.

OTHER REFERENCES
• Stereo & Image Restoration
  ◦ Efficient Belief Propagation for Early Vision. Pedro F. Felzenszwalb and Daniel P. Huttenlocher. International Journal of Computer Vision, Vol. 70, No. 1, October 2006.
• Higher Order Markov Random Fields
  ◦ Efficient Belief Propagation with Learned Higher-Order Markov Random Fields. D. Huttenlocher, X. Lan, S. Roth and M. Black. Proceedings of ECCV, 2006.
  ◦ www.cs.ubc.ca/~nando/nipsfast/slides/dt-nips04.pdf
• Image Segmentation
  ◦ Efficient Graph-Based Image Segmentation. Pedro F. Felzenszwalb and Daniel P. Huttenlocher. International Journal of Computer Vision, Volume 59, Number 2, September 2004.

Thanks!

MATCH COST AS DISTANCE TRANSFORM
• Distance p(x, y) in the grid is given as the Mahalanobis distance M_ij over the model deformation parameters $l_j = (x, y, s, \theta)^T$