Understanding conjunction and double feature searches by a saliency map in primary visual cortex

Li Zhaoping, Department of Psychology, University College London, z.li@ucl.ac.uk, www.gatsby.ucl.ac.uk/~zhaoping

V1 produces a saliency map

The saliency map highlights important image locations. These locations evoke stronger responses because they have fewer iso-orientation neighbors that suppress them and/or more co-linear neighbors that facilitate them.

The V1 model is based on V1 physiology and anatomy (e.g., horizontal connections linking cells tuned to similar orientations), and has been tested to be consistent with physiological data on contextual influences (e.g., iso-orientation suppression, Knierim and van Essen 1992; colinear facilitation, Kapadia et al 1995).

[Figure: original input, V1 processing by the V1 model, and V1 response S, with outputs S sent to higher visual areas. A histogram of all responses S, regardless of features, marks the mean S̄ and the spread σ. Item labels: S=0.2, z=1.0; S=0.4, z=7; S=0.12, z=-1.3; S=0.22, z=1.7.]

The z score, z = (S - S̄)/σ, measures the saliencies of items.

Saliency of an item is assumed to increase with its evoked V1 response. We assume that the efficiency of a visual search task increases with the salience of the target (or of its most salient part, e.g., the horizontal bar in the target cross above).

The high z score of the horizontal bar, z = 7, is a measure of the cross' salience. It enables the cross to pop out, since the V1 response evoked by the horizontal bar is much higher than the average population response to the whole image. The cross has a unique feature, the horizontal bar, which evokes the highest response since it experiences no iso-orientation suppression while all distractors do. Hence, intra-cortical interaction is a neural basis for why feature searches are often efficient.

Comments

V1's output as a saliency map is viewed under the idealization that top-down feedback to V1 is disabled, e.g., shortly after visual exposure or under anesthesia.

Signaling saliency regardless of features: Contrary to common beliefs, this does not mean that the cells reporting salience must be un-tuned to specific features. Here "regardless of" means the following: in this saliency map, the meaning of firing rates for saliency is universal. Given an input scene, the same firing rate from two V1 (output) neurons selective to different features means the same salience value for the two corresponding inputs even if, say, one of the cells is color selective, responding to a static red bar, and the other cell is tuned to motion, responding to a moving black dot. Usually, an image item, say, a short red bar, evokes responses from many cells with different optimal features and overlapping tuning curves or receptive fields. The actual input features have to be decoded in a complex and feature specific manner from the population responses. However, locating the cell most responsive to a scene locates the most salient item, whether or not features can be decoded beforehand or simultaneously from the same cell population. It is economical not to use subsequent cell layers (whether they are feature tuned or not) for a saliency map; the small receptive fields in V1 also mean that this saliency map can have a higher resolution. For more details, see "A saliency map in primary visual cortex" in Trends in Cognitive Sciences, Vol. 6, No. 1, January 2002, p. 9-16.

Two neural substrates are necessary to make a basic feature:
(1) Tuning of the cells' receptive fields to the feature, i.e., a population of V1 cells selective to different values of this feature dimension, such that the feature can be signaled.
(2) Tuning of the horizontal connections to the feature, i.e., selectivity of the horizontal intra-cortical connections to the optimal feature values of both the pre-synaptic and post-synaptic cells in this feature dimension, such that a lack of iso-feature (e.g., iso-orientation) suppression of the target can lead to a relatively higher response.
E.g., a vertical bar pops out among horizontal ones since cells are selective to orientation and horizontal connections link cells tuned to similar orientations; hence responses to the horizontal bars are suppressed by iso-orientation suppression.

The V1 saliency map agrees with visual search behavior

[Figure: inputs to the model for several search displays, each labeled with the target's z score. Panel labels: "Distractors irregularly placed", "Distractors dissimilar to each other", "Homogeneous background, identical distractors regularly placed". Target z scores as labeled: Z=0.8; Z=0.25; Z=3.4; Z=0.63 (next to target, z=0.68); Z=-0.83 (next to target, z=3.7). Further panel: "Target lacking a feature", with labels Z=-0.9 and Z=0.22.]

Search becomes easier in homogeneous backgrounds, since z increases with decreasing σ. This is so even when a target has a negative z score, because the items next to the target become more salient in a homogeneous background, attracting attention.

Model behavior agrees with the subtle changes in search efficiency in asymmetries in visual search, i.e., search efficiency changes when target and distractors swap roles. This is shown in 2 examples. Only input images are shown; output response differences are too small to be visualized here, but z score differences can be significant.

Curved line among straight lines vs. straight among curved. Target: curved, z = 1.12. Target: straight, z = 0.3.
Ellipse in circles vs. circle in ellipses. Target: ellipse, z = 2.8. Target: circle, z = 0.7.

Question: How much easier is a double feature search than the corresponding single feature searches, and how much easier are the single feature searches than the conjunction search? How do they depend on the underlying features?

[Figure: input images, model outputs, and the target with its z score for four searches: conjunction search (orientation-color), double feature search (orientation-color, target differs from background in both color and orientation), single feature search (color), single feature search (orientation).]

Observations: Double feature searches are easier than the corresponding single feature searches, which in turn are easier than conjunction searches.

Observations: Motion-orientation and depth-orientation conjunctions are not much more difficult than the single feature searches (Nakayama and Silverman 1986, McLeod et al 1988). Color-orientation conjunction search is more difficult (Treisman and Gelade 1980). The double feature advantage is greater in motion-orientation than in color-orientation (Nothdurft 2000).

Hence, on conjunction searches:
• A conjunction of 2 orientations is difficult to find since V1 cells are not tuned to two different orientations that differ significantly from each other.
• A conjunction of motion-orientation (or depth-orientation) is easy to find since many V1 cells are conjunctively tuned to both motion direction (or disparity) and orientation. We predict: there are underlying horizontal connections linking cells tuned conjunctively to the same orientation and motion direction (or disparity).
• A conjunction of color-orientation can be easy or difficult to find depending on the stimuli, since most V1 cells are tuned only to orientation or only to color, and a small population of V1 cells is broadly tuned to both orientation and color. Prediction: Color-orientation conjunction search can be made easier by adjusting the scale and/or density of the stimuli, since V1 cells conjunctively tuned to both orientation and color are mainly tuned to a specific spatial frequency band.

[Figure: stimuli for a conjunction search, with responses from two versions of the model.]
Response from a model without conjunction cells: A colored bar evokes responses in cells tuned to orientation only or tuned to color only.
Response from a model with conjunction cells: A colored bar evokes responses in cells tuned to orientation only, or tuned to color only, or tuned to both color and orientation. The conjunction cell experiences the least iso-feature suppression and enables pop-out.
The responses from the orientation selective cells are visualized by the thickness of the black, oriented lines, from color tuned cells by the size of the colored circle, and from conjunctively tuned cells by the size of the appropriately colored and oriented ellipses. The horizontal connections link cells tuned to similar features (orientation, color, or both).

On double feature searches:

The target red vertical bar evokes responses from 3 cell types: (1) orientation selective cells tuned to vertical, (2) color selective cells tuned to red, and (3) conjunctively tuned cells selective to red-vertical. All 3 cell types experience no iso-feature suppression, and the most responsive of them should signal the target saliency. Assuming that cells tuned to the single features determine the ease of the corresponding single feature searches, the double feature search should be no more difficult than the easier of the two single feature searches, and may be more efficient than the single feature searches if the conjunctively tuned cell is the most responsive.

This explains why the double feature advantage is stronger in motion-orientation double feature search than in color-orientation double feature search (Nothdurft 2000), since motion-orientation conjunction cells are more abundant in V1 than color-orientation conjunction cells.
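
The z score and the "most responsive cell signals saliency" rule described above can be illustrated with a minimal sketch. This is not the V1 model itself: all response values below are hypothetical numbers chosen for illustration, the names (`z_scores`, `item_response`, `target_z`, `BACKGROUND`) are assumptions, and taking σ to be the population standard deviation of all responses is also an assumption, since the poster does not say how σ is computed.

```python
from statistics import mean, pstdev


def z_scores(responses):
    """Return z = (S - mean) / sigma for every response S in the image."""
    s_bar = mean(responses)
    sigma = pstdev(responses)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("all responses are identical; z is undefined")
    return [(s - s_bar) / sigma for s in responses]


def item_response(cell_responses):
    """The most responsive cell at a location signals that item's saliency."""
    return max(cell_responses.values())


# Hypothetical distractor responses (suppressed by iso-feature neighbors).
BACKGROUND = [0.18, 0.20, 0.22, 0.19, 0.21, 0.20, 0.17, 0.23]


def target_z(target_cells):
    """z score of a target whose cells respond as given, among BACKGROUND."""
    responses = [item_response(target_cells)] + BACKGROUND
    return z_scores(responses)[0]


# Hypothetical target responses for each search type.
z_ori = target_z({"orientation": 0.30, "color": 0.20})
z_col = target_z({"orientation": 0.20, "color": 0.28})
z_double_no_conj = target_z({"orientation": 0.30, "color": 0.28})
z_double_conj = target_z(
    {"orientation": 0.30, "color": 0.28, "conjunction": 0.40}
)

print(z_ori, z_col, z_double_no_conj, z_double_conj)
```

With these made-up numbers, the double feature target without a conjunction cell is exactly as salient as the easier single feature target, and it becomes more salient only when the conjunctively tuned cell is the most responsive, which mirrors the argument in the text.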
