Visual motion analysis has focused on decomposing image sequences into their component features. There has been little success at re-combining those features into moving objects. Here, a novel model of attentive visual motion processing is presented that addresses both the decomposition of the signal into constituent features and the re-combination, or binding, of those features into wholes. A new feed-forward motion-processing pyramid is presented, motivated by the neurobiology of primate motion processing. On this structure the Selective Tuning (ST) model for visual attention is demonstrated. There are three main contributions: (1) a new feed-forward motion-processing hierarchy, the first to include a multi-level decomposition with local spatial derivatives of velocity; (2) examples of how ST operates on this hierarchy to attend to motion and to localize and label motion patterns; and (3) a new solution to the feature-binding problem sufficient for grouping motion features into coherent object motion. Binding is accomplished using a top-down selection mechanism that does not depend on a single location-based saliency representation.
The perception of changes in the direction of objects that translate in space is an important function of our visual system. Here we investigate the brain electrical phenomena underlying such a function by using a combination of magnetoencephalography (MEG) and magnetic resonance imaging. We recorded MEG-evoked responses in 9 healthy human subjects while they discriminated the direction of a transient change in a translationally moving random dot pattern presented either to the right or to the left of a central fixation point. We found that responses reached their maximum in 2 main regions corresponding to motion processing area middle temporal (MT)/V5 contralateral to the stimulated visual field, and to the right inferior parietal lobe (rIPL). The activation latencies were very similar in both regions (~135 ms) following the direction change onset. Our findings suggest that area MT/V5 provides the strongest sensory signal in response to changes in the direction of translational motion, whereas area rIPL may be involved either in the sensory processing of transient motion signals or in the processing of signals related to orienting of attention.
Cortical area MT/V5 in the human occipito-temporal cortex is activated by visual motion. In this study, we use functional imaging to demonstrate that a subregion of MT/V5 is more strongly activated by unidirectional motion with speed gradients than by other motion patterns. Our results suggest that like the monkey homolog middle temporal area (MT), human MT/V5 contains neurons selective for the processing of speed gradients. Such neurons may constitute an intermediate stage of processing between neurons selective for the average speed of unidirectional motion and neurons selective for different combinations of speed gradient and different motion directions such as expanding optical flow patterns.
Selective Tuning (ST) presents a framework for modeling attention, and in this work we show how it performs in covert visual search tasks by comparing its performance to human performance. Two implementations of ST have been developed. The Object Recognition Model recognizes and attends to simple objects formed by the conjunction of various features, and the Motion Model recognizes and attends to motion patterns. The validity of the Object Recognition Model was first tested by successfully duplicating the results of Nagy and Sanchez. A second experiment was aimed at evaluating the model's performance against the observed continuum of search slopes for feature-conjunction searches of varying difficulty. The Motion Model was tested against two experiments dealing with searches in the visual motion domain. A simple odd-man-out search for counter-clockwise rotating octagons among identical clockwise rotating octagons produced a linear increase in search time with set size. The second experiment was similar to one described by Thornton and Gilden. The results from both implementations agreed with the psychophysical data from the simulated experiments. We conclude that ST provides a valid explanatory mechanism for human covert visual search performance, an explanation going far beyond conventional saliency-map-based explanations.
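The linear growth of search time with set size reported above is the signature of a serial, self-terminating search. The sketch below is a minimal toy model of that behavior only, not the ST implementation itself; the base time and per-item cost are hypothetical parameters chosen for illustration.

```python
def serial_search_time(set_size, time_per_item=50.0, base=400.0):
    """Expected reaction time (ms) under a serial self-terminating search:
    on average, half the items are inspected before the odd target is found.
    Parameters are illustrative, not fitted to the experiments above."""
    expected_inspections = (set_size + 1) / 2.0
    return base + time_per_item * expected_inspections

# Reaction time grows linearly with set size,
# as in the rotating-octagon odd-man-out search.
for n in (4, 8, 16):
    print(n, serial_search_time(n))
```

The slope of the resulting line (here, half the per-item cost) is the quantity typically compared against human search-slope data.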
The Selective Tuning Model is a proposal for modelling visual attention in primates and humans. Although supported by significant biological evidence, it is not without its weaknesses. The main one addressed by this paper is that the levels of representation on which it was previously demonstrated (spatial Gaussian pyramids) were not biologically plausible. The motion domain was chosen because enough is known about motion processing to enable a reasonable attempt at defining the feedforward pyramid. The effort is unique because no past model appears to present a motion hierarchy together with attention to motion. We propose a neurally inspired model of the primate visual motion system, attempting to explain how a hierarchical feedforward network consisting of layers representing cortical areas V1, MT, MST, and 7a detects and classifies different kinds of motion patterns. The Selective Tuning model is then integrated into this hierarchy, demonstrating that successfully attending to motion patterns results in the localization and labelling of those patterns.
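The core computational idea — a feedforward pyramid followed by a top-down, layer-by-layer restriction of attention to the winning units — can be caricatured in a few lines. This is a minimal sketch under strong simplifying assumptions (2×2 max pooling as the feedforward operation, a scalar response per unit); the actual model's layers compute rich motion selectivities, not pooling.

```python
import numpy as np

def feedforward(stimulus, n_layers=4):
    """Hypothetical feedforward pyramid (stand-in for V1 -> MT -> MST -> 7a):
    each layer halves spatial resolution via 2x2 max pooling."""
    layers = [stimulus]
    for _ in range(n_layers - 1):
        prev = layers[-1]
        h, w = prev.shape[0] // 2, prev.shape[1] // 2
        layers.append(prev[:2 * h, :2 * w].reshape(h, 2, w, 2).max(axis=(1, 3)))
    return layers

def selective_tuning(layers):
    """Top-down winner-take-all: select the global winner at the top layer,
    then at each lower layer keep only the unit in the winner's receptive
    field that contributed most, pruning everything else."""
    winners = []
    y, x = np.unravel_index(np.argmax(layers[-1]), layers[-1].shape)
    winners.append((y, x))
    for layer in reversed(layers[:-1]):
        # 2x2 receptive field of the current winner in the layer below.
        patch = layer[2 * y:2 * y + 2, 2 * x:2 * x + 2]
        dy, dx = np.unravel_index(np.argmax(patch), patch.shape)
        y, x = 2 * y + dy, 2 * x + dx
        winners.append((y, x))
    return winners[::-1]  # attended path, bottom layer first

rng = np.random.default_rng(0)
stim = rng.random((16, 16))
path = selective_tuning(feedforward(stim))
```

Because max pooling preserves the global maximum, the descent recovers the stimulus location that drove the top-layer winner — the "localization" referred to above; labelling would follow from which feature channel, rather than which location, wins.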