Image Parsing: Unifying Segmentation and Detection
Z. Tu, X. Chen, A. L. Yuille and S.-C. Zhu
ICCV 2003 (Marr Prize) & IJCV 2005
Sanketh Shetty

Outline
• Why Image Parsing?
• Introduction to Concepts in DDMCMC applied to Image Parsing
• Combining Discriminative and Generative Models for Parsing
• Results
• Comments

Image Parsing
• Given an image I, find the parse structure W
• Optimize p(W|I)

Properties of Parse Structure
• Dynamic and reconfigurable
  – Variable number of nodes and node types
• Defined by a Markov Chain
  – Data Driven Markov Chain Monte Carlo (earlier work in segmentation, grouping and recognition)

Key Concepts
• Joint model for Segmentation & Recognition
  – Combine different modules to obtain cues
• Fully generative explanation for Image generation
  – Uses Generative and Discriminative Models + DDMCMC framework
  – Concurrent Top-Down & Bottom-Up Parsing

Pattern Classes
• 62 characters
• Faces
• Regions

MCMC: A Quick Tour
• Key Concepts:
  – Markov Chains
  – Markov Chain Monte Carlo
    • Metropolis-Hastings [Metropolis 1953, Hastings 1970]
    • Reversible Jump [Green 1995]
  – Data Driven Markov Chain Monte Carlo

Markov Chains Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005

Markov Chain Monte Carlo

Metropolis-Hastings Algorithm

Metropolis-Hastings Algorithm
• Invariant Distribution
• Proposal Distribution
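The algorithm on this slide did not survive extraction. As a reminder of how the invariant distribution and the proposal distribution interact, here is a minimal generic sketch. The function and argument names (`metropolis_hastings`, `log_p`, `propose`, `log_q`) and the toy Gaussian target are assumptions made for illustration, not taken from the slides.

```python
import math
import random


def metropolis_hastings(log_p, propose, log_q, x0, n_steps, rng):
    """Generic Metropolis-Hastings sampler (illustrative sketch).

    log_p(x):            log of the unnormalized invariant density p(x)
    propose(x, rng):     draws a candidate x' from the proposal q(x' | x)
    log_q(x_to, x_from): log q(x_to | x_from), up to a constant
    """
    x = x0
    samples = []
    for _ in range(n_steps):
        x_new = propose(x, rng)
        # log of the ratio p(x') q(x | x') / (p(x) q(x' | x))
        log_alpha = (log_p(x_new) + log_q(x, x_new)
                     - log_p(x) - log_q(x_new, x))
        # accept with probability min(1, ratio); otherwise keep the old state
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x = x_new
        samples.append(x)
    return samples


# Toy setup (assumption): the invariant distribution is a standard normal and
# the proposal is a Gaussian random walk with unit step size.
def log_p(x):
    return -0.5 * x * x


def propose(x, rng):
    return x + rng.gauss(0.0, 1.0)


def log_q(x_to, x_from):
    # symmetric proposal, so this term cancels in the acceptance ratio
    return -0.5 * (x_to - x_from) ** 2


rng = random.Random(0)
samples = metropolis_hastings(log_p, propose, log_q, 0.0, 20000, rng)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"mean={mean:.2f} var={var:.2f}")
```

Only ratios of p appear in the acceptance test, so the invariant distribution never needs to be normalized. This is what makes the method usable for a posterior such as p(W|I).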

Reversible Jump MCMC
• Many competing models to explain data
  – Need to explore this complicated state space

DDMCMC Motivation Unifies

DDMCMC Motivation
• Generative Model: p(I|W)p(W) over the State Space
• Discriminative Model: q( wj | I )
• Dramatically reduce the search space by focusing sampling on highly probable states.
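To make the "focus sampling on highly probable states" point concrete, here is a toy sketch that compares a uniform proposal with a proposal shaped by a bottom-up cue. Everything in it is an assumption for illustration: the 100-state space, the peaked target standing in for p(W|I), the slightly misplaced `cue_q` standing in for q(wj | I), and the helper name `independence_mh`. It is not the proposal design used in the paper.

```python
import math
import random


def independence_mh(p, q, n_steps, rng):
    """Independence Metropolis-Hastings on states 0..len(p)-1.

    p: unnormalized target weights (toy stand-in for p(W|I)), all > 0
    q: proposal probabilities (toy stand-in for a discriminative q(w|I))
    Returns (samples, acceptance_rate).
    """
    states = list(range(len(p)))
    x = rng.choices(states, weights=q)[0]
    accepted = 0
    samples = []
    for _ in range(n_steps):
        y = rng.choices(states, weights=q)[0]
        # acceptance ratio p(y) q(x) / (p(x) q(y))
        ratio = (p[y] * q[x]) / (p[x] * q[y])
        if rng.random() < min(1.0, ratio):
            x = y
            accepted += 1
        samples.append(x)
    return samples, accepted / n_steps


n = 100
# Target: sharply peaked around state 42 (small floor keeps every weight > 0).
p = [math.exp(-0.5 * ((i - 42) / 2.0) ** 2) + 1e-6 for i in range(n)]

# Blind proposal: uniform over the whole state space.
uniform_q = [1.0 / n] * n

# "Data-driven" proposal: a broader, slightly misplaced bump, as a noisy
# bottom-up cue might provide.
raw = [math.exp(-0.5 * ((i - 41) / 4.0) ** 2) + 1e-3 for i in range(n)]
total = sum(raw)
cue_q = [r / total for r in raw]

rng = random.Random(1)
_, rate_uniform = independence_mh(p, uniform_q, 5000, rng)
samples_cue, rate_cue = independence_mh(p, cue_q, 5000, rng)
mean_cue = sum(samples_cue) / len(samples_cue)
print(f"acceptance: uniform={rate_uniform:.2f} cue={rate_cue:.2f}")
```

Both chains target the same distribution. The cue only changes where proposals land, so an imperfect cue costs efficiency rather than correctness, because the acceptance step still corrects for it.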

DDMCMC Framework
• Moves:
  – Node Creation
  – Node Deletion
  – Change Node Attributes
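Node creation and deletion change the number of nodes, so the chain has to move between states of different size. A heavily simplified sketch of such a birth/death chain follows. It assumes nodes carry no attributes (so no dimension-matching term is needed) and uses a Poisson-shaped target over the node count. The name `birth_death_chain` and the parameter `lam` are hypothetical, chosen only for this illustration.

```python
import random


def birth_death_chain(lam, n_steps, rng):
    """Toy birth/death sampler over the node count k = 0, 1, 2, ...

    Target (assumption): p(k) proportional to lam**k / k!, a Poisson shape.
    Birth and death are each proposed with probability 0.5, so the proposal
    probabilities cancel in the acceptance ratio.
    """
    k = 0
    counts = []
    for _ in range(n_steps):
        if rng.random() < 0.5:
            # node creation: k -> k + 1, with p(k+1) / p(k) = lam / (k + 1)
            if rng.random() < min(1.0, lam / (k + 1)):
                k += 1
        elif k > 0:
            # node deletion: k -> k - 1, with p(k-1) / p(k) = k / lam
            if rng.random() < min(1.0, k / lam):
                k -= 1
        # a deletion proposed at k == 0 is rejected and the chain stays put
        counts.append(k)
    return counts


rng = random.Random(2)
counts = birth_death_chain(lam=3.0, n_steps=50000, rng=rng)
mean_count = sum(counts) / len(counts)
print(f"average number of nodes: {mean_count:.2f}")
```

In the full framework each node also has attributes, so a creation move must propose those attributes as well. That is where bottom-up cues enter.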

Transition Kernel
• Satisfies the detailed balance equation
• Full Transition Kernel
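The equations on this slide were lost in extraction. In generic notation, the two statements are usually written as below. The sub-kernel index a and the mixing weights ρ(a : I) are labels assumed here and may not match the symbols on the original slide.

```latex
% Detailed balance of each sub-kernel K_a with respect to the posterior p(W | I)
p(W \mid I)\, K_a(W' \mid W : I) = p(W' \mid I)\, K_a(W \mid W' : I)

% Full kernel as a mixture of the sub-kernels
K(W' \mid W : I) = \sum_a \rho(a : I)\, K_a(W' \mid W : I),
\qquad \sum_a \rho(a : I) = 1, \quad \rho(a : I) > 0
```

Summing the first equation over W shows that each sub-kernel leaves p(W|I) invariant, and any mixture of such kernels therefore does too.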

Convergence to p(W|I)
• Monotonically, at a geometric rate
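A small numerical illustration of this behaviour, under stated assumptions: a three-state target stands in for p(W|I), the kernel is a Metropolis-Hastings kernel with a uniform proposal, and the distance to the target is measured in total variation. The slide does not say which distance it uses, so that choice, along with the helper names `mh_kernel`, `step` and `tv`, is an assumption.

```python
def mh_kernel(p):
    """Metropolis-Hastings transition matrix for target p, uniform proposal."""
    n = len(p)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                K[i][j] = (1.0 / n) * min(1.0, p[j] / p[i])
        # remaining probability mass stays at state i
        K[i][i] = 1.0 - sum(K[i])
    return K


def step(mu, K):
    """One step of the chain applied to a distribution: mu <- mu K."""
    n = len(mu)
    return [sum(mu[i] * K[i][j] for i in range(n)) for j in range(n)]


def tv(mu, p):
    """Total variation distance between two distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, p))


p = [0.2, 0.3, 0.5]   # toy target
K = mh_kernel(p)
mu = [1.0, 0.0, 0.0]  # start with all mass on state 0
distances = []
for _ in range(30):
    distances.append(tv(mu, p))
    mu = step(mu, K)

for t in (0, 5, 10, 20):
    print(t, distances[t])
```

The printed distances never increase from one step to the next and shrink by a roughly constant factor per step, which is what "monotonically at a geometric rate" describes.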

Criteria for Designing Transition Kernels

Image Generation Model
• Regions: Constant Intensity, Textures, Shading
• State of parse graph

62 characters, Faces, 3 Regions

Designed to penalize high model complexity Uniform

Shape Prior: Faces, 3 Regions

Shape Prior: Text

Intensity Models

Intensity Model: Faces

Discriminative Cues Used
• Adaboost Trained
  – Face Detector
  – Text Detector
• Adaptive Binarization Cues
• Edge Cues
  – Canny at 3 scales
• Shape Affinity Cues
• Region Affinity Cues

Transition Kernel Design • Remember

Possible Transitions
1. Birth/Death of a Face Node
2. Birth/Death of Text Node
3. Boundary Evolution
4. Split/Merge Region
5. Change node attributes

Face/Text Transitions

Region Transitions

Change Node Attributes

Basic Control Algorithm

Results

Comments

Thank You