Object Recognition
Object Classes
Individual Recognition
Object parts: Full Interpretation (figure labels: window, mirror, door knob, headlight, back wheel, bumper, front wheel)
Action recognition (except 2)
Class Non-class
Is this an airplane?
Features and Classifiers • Same features with different classifiers • Same classifier with different features
Generic Features • Simple (wavelets) • Complex (geons)
Marr-Nishihara
Mental Rotation
3-D Parts • Implementations: poor results • View-specific recognition • fMRI studies • Instead: using image patches
Class-specific Features: Common Building Blocks
Optimal Class Components? • Large features are too rare • Small features are found everywhere • Find features that carry the highest amount of information
Mutual information • H(C): class entropy before observing the feature • F=1: H(C) when F=1 • F=0: H(C) when F=0 • $I(C;F) = H(C) - H(C \mid F)$
Mutual Information I(C; F)
Class: 1 1 0 1 0 0
Feature: 1 0 0 1 1 1 0 0
$I(C;F) = H(C) - H(C \mid F)$
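A minimal sketch of the quantity defined above, computed from paired binary class and feature labels. The function names and the toy data are hypothetical illustrations (they are not the values shown on the slide), and probabilities are simply estimated by counting.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())


def mutual_information(classes, features):
    """I(C; F) = H(C) - H(C|F) for two equal-length label sequences."""
    if len(classes) != len(features):
        raise ValueError("classes and features must have the same length")
    n = len(classes)
    h_c_given_f = 0.0
    for f_value in set(features):
        # Class labels of the examples where the feature takes this value.
        subset = [c for c, f in zip(classes, features) if f == f_value]
        h_c_given_f += (len(subset) / n) * entropy(subset)
    return entropy(classes) - h_c_given_f


# Hypothetical toy data: 4 class images and 4 non-class images.
classes = [1, 1, 1, 1, 0, 0, 0, 0]
informative = [1, 1, 1, 1, 0, 0, 0, 0]    # feature fires exactly on the class
uninformative = [1, 0, 1, 0, 1, 0, 1, 0]  # feature fires equally on both

print(mutual_information(classes, informative))
print(mutual_information(classes, uninformative))
```

A feature that fires only on class images removes all uncertainty about the class, while one that fires equally often on both classes removes none.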
Horse-class features • Car-class features • Pictorial features • Learned from examples
Star model • Detected fragments ‘vote’ for the center location • Find the location with the maximal vote • In its variations, a popular state-of-the-art scheme
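The voting step can be sketched as follows. This is an illustration under stated assumptions, not a specific published implementation: each detection is assumed to carry the fragment position, a learned offset to the object center, and a vote weight; the function name and the detections are hypothetical.

```python
from collections import defaultdict


def star_model_center(detections):
    """Accumulate center votes and return the location with the maximal vote.

    detections: list of (x, y, dx, dy, weight), where (x, y) is where the
    fragment was detected and (dx, dy) is its assumed offset to the center.
    """
    if not detections:
        raise ValueError("need at least one detection")
    votes = defaultdict(float)
    for x, y, dx, dy, weight in detections:
        votes[(x + dx, y + dy)] += weight
    center = max(votes, key=votes.get)
    return center, dict(votes)


# Hypothetical detections: three fragments agree on (15, 15), one is a stray.
detections = [
    (10, 12, 5, 3, 1.0),
    (20, 15, -5, 0, 1.0),
    (15, 25, 0, -10, 1.0),
    (30, 30, 2, 2, 1.0),
]
center, votes = star_model_center(detections)
print(center, votes[center])
```

Practical variants usually smooth the vote map or vote into coarse bins so that nearby votes reinforce each other; this sketch only counts exact agreement.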
Recognition Features in the Brain
fMRI: Functional Magnetic Resonance Imaging
LO: object recognition • V1: early processing
Class fragments and activation (Malach et al. 2008)
Bag of words
Bag of visual words • A large collection of image patches
Each class has its own histogram of visual words • Limited or no geometry • Simple and popular • Visual words are used, but not as a full recognition model
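A minimal sketch of building a visual-word histogram, assuming a vocabulary of patch descriptors is already available (for example, from clustering). The function names, the two-dimensional descriptors, and the vocabulary are hypothetical; note that the positions of the patches are discarded, which is the "limited or no geometry" point.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to the descriptor (squared L2)."""
    def sq_dist(k):
        return sum((d - w) ** 2 for d, w in zip(descriptor, vocabulary[k]))
    return min(range(len(vocabulary)), key=sq_dist)


def bag_of_words_histogram(descriptors, vocabulary):
    """Normalized histogram of visual-word counts for one image."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Hypothetical 3-word vocabulary and 4 patch descriptors from one image.
vocabulary = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.1], [1.0, 0.9], [0.1, 0.9]]
hist = bag_of_words_histogram(descriptors, vocabulary)
print(hist)
```

An image can then be classified by comparing its histogram with the per-class histograms, using whatever classifier is preferred.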
HoG Descriptor • Dalal, N. & Triggs, B. Histograms of Oriented Gradients for Human Detection
• SIFT: Scale-Invariant Feature Transform • MSER: Maximally Stable Extremal Regions • SURF: Speeded-Up Robust Features • Cross correlation • … • HoG and SIFT are the most widely used.
DPM (Felzenszwalb) • Felzenszwalb, McAllester, Ramanan, CVPR 2008: A Discriminatively Trained, Multiscale, Deformable Part Model • Many implementation details; only the main points are described here.
HoG descriptor
Using patches with HoG descriptors and classification by SVM • Person model: HoG
Object model using HoG • A bicycle and its ‘root filter’ • The root filter is a patch of HoG descriptors • The image is partitioned into 8 x 8 pixel cells • In each cell we compute a histogram of gradient orientations
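The per-cell histogram can be sketched as below. This is a simplified illustration, not the full descriptor: it assumes 9 unsigned orientation bins and central-difference gradients, and it omits the interpolation and block normalization used in practice. The function name and the test image are hypothetical.

```python
import math


def cell_orientation_histograms(image, cell_size=8, n_bins=9):
    """Histogram of gradient orientations for each cell_size x cell_size cell.

    image: 2-D list of grayscale values. Each interior pixel adds its gradient
    magnitude to the orientation bin of its cell (unsigned, 0 to 180 degrees).
    """
    height, width = len(image), len(image[0])
    n_rows, n_cols = height // cell_size, width // cell_size
    hists = [[[0.0] * n_bins for _ in range(n_cols)] for _ in range(n_rows)]
    bin_width = 180.0 / n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            bin_index = int(angle / bin_width) % n_bins
            row, col = y // cell_size, x // cell_size
            if row < n_rows and col < n_cols:
                hists[row][col][bin_index] += magnitude
    return hists


# Hypothetical 16 x 16 test image: brightness increases from left to right,
# so every gradient points horizontally and falls into bin 0.
ramp = [[float(x) for x in range(16)] for _ in range(16)]
hists = cell_orientation_histograms(ramp)
print(hists[0][0])
```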
Dealing with scale: multi-scale analysis • The filter is searched on a pyramid of HoG descriptors, to deal with unknown scale
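A minimal sketch of the pyramid search, assuming the pyramid is given as a list of feature maps (one per scale), each a grid of cells holding a short feature vector, and that the filter response is a plain dot product. The names and the tiny two-level pyramid are hypothetical.

```python
def filter_score(feature_map, filt, top, left):
    """Dot product between the filter and the sub-window at (top, left)."""
    score = 0.0
    for r, filt_row in enumerate(filt):
        for c, filt_cell in enumerate(filt_row):
            cell = feature_map[top + r][left + c]
            score += sum(w * f for w, f in zip(filt_cell, cell))
    return score


def search_pyramid(pyramid, filt):
    """Best (score, level, top, left) over all levels and positions."""
    filt_h, filt_w = len(filt), len(filt[0])
    best = None
    for level, feature_map in enumerate(pyramid):
        # Levels smaller than the filter produce empty ranges and are skipped.
        for top in range(len(feature_map) - filt_h + 1):
            for left in range(len(feature_map[0]) - filt_w + 1):
                score = filter_score(feature_map, filt, top, left)
                if best is None or score > best[0]:
                    best = (score, level, top, left)
    return best


# Hypothetical data: a 1 x 1 filter over 2-dimensional cell features.
filt = [[[1.0, 0.0]]]
pyramid = [
    [[[0.0, 1.0], [0.2, 0.0]], [[0.1, 0.0], [0.0, 0.0]]],  # level 0 (fine)
    [[[0.9, 0.0]]],                                        # level 1 (coarse)
]
best = search_pyramid(pyramid, filt)
print(best)
```

The level of the best match indicates the scale at which the object was found.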
Adding Parts
A part is $P_i = (F_i, v_i, s_i, a_i, b_i)$:
• $F_i$ is the filter for the i-th part
• $v_i$ is the center of a box of possible positions for part i, relative to the root position
• $s_i$ is the size of this box
• $a_i$ and $b_i$ are two-dimensional vectors specifying the coefficients of a quadratic function that scores each possible placement of the i-th part
That is, $a_i = (a_1, a_2)$ and $b_i = (b_1, b_2)$ are two numbers each, and the penalty for a deviation $(\Delta x, \Delta y)$ from the expected location is
$a_1 \Delta x + a_2 \Delta y + b_1 \Delta x^2 + b_2 \Delta y^2$
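The quadratic penalty can be sketched directly. The placement rule below is an assumption made for illustration: the part is placed where its filter response minus the deformation penalty is largest. The function names, coefficients, and scores are hypothetical.

```python
def deformation_cost(dx, dy, a, b):
    """Penalty a1*dx + a2*dy + b1*dx^2 + b2*dy^2 for a deviation (dx, dy)."""
    return a[0] * dx + a[1] * dy + b[0] * dx ** 2 + b[1] * dy ** 2


def best_part_placement(part_scores, a, b):
    """Pick the deviation maximizing filter score minus deformation penalty.

    part_scores: dict mapping a deviation (dx, dy) from the expected part
    location to the part filter's response at that placement.
    """
    def total(deviation):
        dx, dy = deviation
        return part_scores[deviation] - deformation_cost(dx, dy, a, b)

    best = max(part_scores, key=total)
    return best, total(best)


# Hypothetical coefficients: no linear term, unit quadratic penalty.
a, b = (0.0, 0.0), (1.0, 1.0)
part_scores = {(0, 0): 1.0, (1, 0): 2.5, (3, 0): 5.0}
placement, score = best_part_placement(part_scores, a, b)
print(placement, score)
```

The strongest raw response, at (3, 0), loses here because its penalty of 9 outweighs its score, so the part settles at the nearby placement (1, 0).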
Bicycle model: root, parts, spatial map • Person model