
Gist: A Mobile Robotics Application of Context-Based Vision in Outdoor Environment
Christian Siagian, Laurent Itti
Univ. Southern California, CA, USA

Outline
• Mobile robot localization
• Biological approach to vision
• Gist model
• Testing and results
• Discussion and conclusion

Mobile Robot Localization
• Where are we?
• Localization = identifying landmarks

Mobile Robot Localization
• Indoors: strong assumptions of flat walls, narrow hallways, and solid angles
  • Ranging sensors (laser and sonar) for mapping
• Outdoors: less conforming set of surfaces
  • Ranging sensors are less effective, vision is better

Robot Vision Localization
• Object-based vision localization
• Objects as landmarks

Robot Vision Localization
• Region-based vision localization
• Regions as landmarks

Robot Vision Localization
• Scene-based vision localization
• Scenes as a whole as landmarks
  • Color histograms [Ulrich and Nourbakhsh 2000]
  • Fourier transform [Oliva & Torralba 2001]
  • Wavelet pyramids [Torralba 2003]
  • Histogram of dominant features [Renninger & Malik 2004]

Gist
• Definition and background
  • Essence, holistic characteristics of an image
  • Context information obtained within an eye saccade (approx. 150 ms)
  • Evidence of place-recognizing cells in the Parahippocampal Place Area (PPA)
  • Biologically plausible models of gist are yet to be proposed
• Nature of tasks done with gist
  • Scene categorization/context recognition
  • Region priming/layout recognition
  • Resolution/scale selection

Human Vision Architecture
• Visual Cortex:
  • Low-level filters, center-surround, and normalization
• Saliency Model:
  • Attend to pertinent regions
• Gist Model:
  • Compute image general characteristics
• High-Level Vision:
  • Object recognition
  • Layout recognition
  • Scene understanding

Gist Model
• Utilize the same Visual Cortex raw features in the saliency model [Itti 2001]
  • Gist is theoretically non-redundant with saliency
• Gist vs. saliency
  • Instead of looking at most conspicuous locations in image, looks at scene as a whole
  • Detection of regularities, not irregularities
  • Cooperation (accumulation) vs. competition (WTA) among locations
  • More spatial emphasis in saliency
  • Local vs. global/regional interaction
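The cooperation vs. competition contrast can be illustrated with a toy feature map. This is only a sketch: the function names `winner_take_all` and `accumulate` are hypothetical, and reducing each model to a single NumPy call is a deliberate simplification, not the models' actual computation.

```python
import numpy as np

def winner_take_all(feature_map):
    """Competition (saliency-like): keep only the location of the strongest response."""
    return np.unravel_index(np.argmax(feature_map), feature_map.shape)

def accumulate(feature_map):
    """Cooperation (gist-like): every location contributes to one summary value."""
    return float(feature_map.mean())

# Toy 3x3 response map with one conspicuous location in the center.
fmap = np.array([[0.1, 0.2, 0.1],
                 [0.2, 0.9, 0.2],
                 [0.1, 0.2, 0.1]])

peak = winner_take_all(fmap)   # index of the single winning location
summary = accumulate(fmap)     # average over the whole map
```

The first function discards everything except one location, while the second discards location and keeps a whole-map statistic, which is the sense in which the two outputs are complementary.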

Gist Model Implementation
• V1 raw image feature maps
  • Orientation channel
    • Gabor filters at 4 angles (0, 45, 90, 135 degrees) on 4 scales = 16 sub-channels
  • Color
    • Red-green and blue-yellow center-surround, each with 6 scale combinations = 12 sub-channels
  • Intensity
    • Dark-bright center-surround with 6 scale combinations = 6 sub-channels
  • Total of 34 sub-channels
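The sub-channel count can be checked by enumerating the channels. This is a bookkeeping sketch only; the scale indices and the specific center-surround pairs below are assumptions, since the slide states only how many scales and combinations there are.

```python
from itertools import product

# Orientation: 4 Gabor angles (degrees) at 4 scales, as listed on the slide.
ANGLES = (0, 45, 90, 135)
ORIENTATION_SCALES = range(4)  # assumed scale indices

# Assumed (center, surround) scale pairs; the slide only says there are
# 6 combinations per center-surround map.
CENTER_SURROUND = [(c, c + d) for c in (2, 3, 4) for d in (3, 4)]

def list_sub_channels():
    """Return one tuple per sub-channel: (name, parameter, parameter)."""
    channels = [("orientation", a, s)
                for a, s in product(ANGLES, ORIENTATION_SCALES)]
    for name in ("red-green", "blue-yellow", "intensity"):
        channels += [(name, c, s) for c, s in CENTER_SURROUND]
    return channels

sub_channels = list_sub_channels()
print(len(sub_channels))  # 16 + 12 + 6 = 34
```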

Gist Model Implementation
• Gist feature extraction
  • Average values of predetermined grid
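Grid averaging might be sketched as follows. The 4 x 4 grid is an assumption inferred from the 16 features per sub-channel quoted on the dimension reduction slide; the function name `grid_means` is hypothetical.

```python
import numpy as np

def grid_means(feature_map, grid=4):
    """Average a 2-D feature map over a grid x grid partition (row-major order)."""
    h, w = feature_map.shape
    rows = np.array_split(np.arange(h), grid)  # row indices of each grid band
    cols = np.array_split(np.arange(w), grid)  # column indices of each grid band
    return np.array([feature_map[np.ix_(r, c)].mean()
                     for r in rows for c in cols])

# Toy 8x8 map: each grid cell covers a 2x2 patch.
fmap = np.arange(64, dtype=float).reshape(8, 8)
features = grid_means(fmap)
print(features.shape)  # (16,)
```

Applying this to each of the 34 sub-channel maps and concatenating the results would give the 544-dimensional vector mentioned on the next slide.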

Gist Model Implementation
• Dimension reduction
  • Original: 34 sub-channels x 16 features = 544 features
  • PCA/ICA reduction: 80 features
    • Kept >95% of variance
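A minimal sketch of the PCA half of this step, using an SVD in NumPy. The ICA stage is not shown, the data are synthetic stand-ins for gist vectors, and the function names are hypothetical. On random data the retained variance will not match the >95% figure reported for real images.

```python
import numpy as np

def fit_pca(X, n_components=80):
    """Fit PCA by SVD; return mean, principal axes, and fraction of variance kept."""
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    variance = s ** 2
    kept = variance[:n_components].sum() / variance.sum()
    return mean, vt[:n_components], kept

def project(X, mean, components):
    """Map 544-D vectors onto the retained principal axes."""
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 544))  # synthetic stand-in for 200 gist vectors
mean, components, kept = fit_pca(X)
Z = project(X, mean, components)
print(Z.shape)  # (200, 80)
```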

Gist Model Implementation
• Place classification
  • Three-layer neural networks
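The classifier's forward pass might look like the sketch below. Reading "three-layer" as input, hidden, and output layers is an assumption, as are the hidden size, the number of output segments, the tanh and softmax activations, and all names. The weights are random and untrained, so the outputs only demonstrate shapes, not real predictions.

```python
import numpy as np

def init_network(n_in=80, n_hidden=32, n_out=9, seed=0):
    """Random weights for an input -> hidden -> output network (sizes are hypothetical)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0, 0.1, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0, 0.1, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(params, x):
    """Return per-segment probabilities for a batch of reduced gist vectors."""
    hidden = np.tanh(x @ params["W1"] + params["b1"])
    logits = hidden @ params["W2"] + params["b2"]
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=-1, keepdims=True)

params = init_network()
x = np.random.default_rng(1).normal(size=(5, 80))  # 5 synthetic 80-D inputs
probs = forward(params, x)
segments = probs.argmax(axis=1)  # most probable segment per input
```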

System Example Run

Testing & Results
• Site selection: different challenges appearance-wise
• Variability in area covered/path lengths
• Various lighting conditions
• Single-view filming
• Clean break between segments
• Scalability: combine all sites

Map of Experiment Sites

Site 1: Building Complex

Site 1 Experiment
[Figure panels: Input Image; Gist Feature-vectors; PCA/ICA reduced features; System Output]

Site 1 Results
[Figure: Output Label vs. Assigned Label]

Site 2: Vegetation-filled Park

Site 2 Result
[Figure: Output Label vs. Assigned Label]

Site 2 Experiment
[Figure panels: Input Image; Gist Feature-vectors; PCA/ICA reduced features; System Output]

Site 3: Open Field Park

Site 3 Experiment
[Figure panels: Input Image; Gist Feature-vectors; PCA/ICA reduced features; System Output]

Site 3 Result
[Figure: Output Label vs. Assigned Label]

Combined Sites Result

Discussion & Conclusion
• Result of current model:
  • Success rate between 82.48% and 87.93%
  • Combined rate of 85.96%
  • 4.73% error in inter-site classification
• Integrating saliency for robot navigation
  • Localization within segment
    • Identifying discriminating cues in the environment
    • Issues in object-based systems still apply
  • Bad view detection
    • Foreground objects sometimes occlude whole view
  • Obstacle avoidance, exploration, etc.

Discussion
• Integration of gist and saliency in general
  • Single representation of both models
  • Influence of saliency on gist and vice versa
    • Involvement of saliency in improving gist estimation
    • Gist helpful in identifying/filtering salient locations
• Testing the limits of gist: psychophysics experiments
  • Change blindness test for large-scale layout changes
  • Varying exposure time
  • Isolation of bottom-up and top-down influences