Neurons can measure the motion of an extended boundary locally only in the direction orthogonal to its orientation (the aperture problem), whereas this ambiguity is resolved at localized image features such as corners or non-occlusion junctions. Integrating local motion signals sampled along the outline of a moving form reveals the object velocity. We propose a new model of V1-MT feedforward and feedback processing in which localized V1 motion signals are integrated along the feedforward path by model MT cells. Top-down feedback from MT cells in turn emphasizes model V1 motion activities of matching velocity by excitatory modulation and thus realizes an attentional gating mechanism. The model dynamics implement a guided filling-in process that disambiguates motion signals through biased on-center, off-surround competition. Our model makes predictions concerning the time course of cell responses in areas MT and V1 and the disambiguation of activity patterns in these areas, and it serves as a means to link physiological mechanisms with perceptual behavior. We further demonstrate that the model also successfully processes natural image sequences.
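The aperture problem described in this abstract can be illustrated with a small numerical sketch. The code below is not the V1-MT model itself: it swaps in a plain least-squares intersection of constraints, and the function names `normal_flow` and `integrate_normal_flows` are hypothetical. It only shows that a single edge reports the velocity component along its normal, while measurements from edges of different orientation jointly determine the object velocity.

```python
def normal_flow(normal, velocity):
    """Velocity component visible through an aperture on a straight edge.

    `normal` is the unit normal of the edge, `velocity` the true object
    velocity. Only the projection onto the normal is measurable locally.
    """
    nx, ny = normal
    vx, vy = velocity
    speed = nx * vx + ny * vy  # signed speed along the edge normal
    return (speed * nx, speed * ny)


def integrate_normal_flows(normals, speeds):
    """Least-squares velocity from normal speeds sampled along an outline.

    Minimizes sum_i (n_i . v - s_i)^2 by solving the 2x2 normal equations.
    This is an illustrative stand-in, not the integration used by the model.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (nx, ny), s in zip(normals, speeds):
        a11 += nx * nx
        a12 += nx * ny
        a22 += ny * ny
        b1 += s * nx
        b2 += s * ny
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        # All edges share one orientation: the aperture problem is unresolved.
        raise ValueError("velocity is ambiguous for a single edge orientation")
    vx = (a22 * b1 - a12 * b2) / det
    vy = (a11 * b2 - a12 * b1) / det
    return (vx, vy)


# Hypothetical example: a square translating with velocity (2, 1).
true_velocity = (2.0, 1.0)
edge_normals = [(1.0, 0.0), (0.0, 1.0)]  # vertical and horizontal edges
normal_speeds = [n[0] * true_velocity[0] + n[1] * true_velocity[1]
                 for n in edge_normals]
estimate = integrate_normal_flows(edge_normals, normal_speeds)
```

With only the vertical edge, `normal_flow((1.0, 0.0), true_velocity)` yields the horizontal component alone, and `integrate_normal_flows` raises an error because the system is underdetermined. Adding the second orientation makes the estimate match the true velocity.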
Abstract. The neural mechanisms underlying motion segregation and integration remain largely unclear. Local motion estimates are often ambiguous in the absence of form features such as corners or junctions. Furthermore, even in the presence of such features, local motion estimates may be wrong if they are generated near occlusions or by transparent objects. Here, we present a neural model of visual motion processing that involves early stages of the cortical dorsal and ventral pathways. We investigate the computational mechanisms of V1-MT feedforward and feedback processing in the perception of coherent shape motion. In particular, we demonstrate how modulatory MT-V1 feedback helps to stabilize localized feature signals at, e.g., corners, and to disambiguate initial flow estimates that signal ambiguous movement due to the aperture problem for single shapes. In cluttered environments with multiple moving objects, partial occlusions may occur, which in turn generate erroneous motion signals at points of overlapping form. Intrinsic-extrinsic region boundaries are indicated by local T-junctions of arbitrary orientation and spatial configuration. Such junctions generate strong localized feature-tracking signals that inject erroneous motion directions into the integration process. We describe a simple local mechanism of excitatory form-motion interaction that modifies spurious motion cues at T-junctions. In concert with local competitive-cooperative mechanisms of the motion pathway, the motion signals are subsequently segregated into coherent representations of moving shapes. Computer simulations demonstrate the competence of the proposed neural model.
We have previously developed a neurodynamical model of motion segregation in cortical visual areas V1 and MT of the dorsal stream. The model explains how motion ambiguities caused by the aperture problem can be resolved for coherently moving objects of arbitrary size by means of cortical mechanisms. The major bottleneck in developing a reliable, biologically inspired technical system with real-time motion analysis capabilities based on this neural model is the amount of memory needed to represent neural activation in velocity space. We propose a sparse coding framework for neural motion activity patterns and suggest a means by which initial activities are detected efficiently. We realize neural mechanisms such as shunting inhibition and feedback modulation in the sparse framework to implement an efficient algorithmic version of our neural model of cortical motion segregation. We demonstrate that the algorithm behaves similarly to the original neural model and is able to extract image motion from real-world image sequences. Our investigation transfers a neuroscience model of cortical motion computation to a setting that must meet technologically demanding constraints such as real-time performance and hardware implementation. In addition, the proposed biologically inspired algorithm provides a tool that keeps simulation times acceptable in modeling investigations.
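As a rough illustration of the sparse velocity-space idea in this abstract, the sketch below stores only the active velocity hypotheses at one image location and applies feedback modulation and a divisive, shunting-style normalization to them. The abstract gives no equations, so the multiplicative form `a * (1 + gain * feedback)`, the divisive form `a / (eps + sum)`, all parameter values, and the names `sparsify`, `modulate`, and `normalize` are assumptions made for illustration only.

```python
def sparsify(dense, threshold=0.05):
    """Keep only velocity hypotheses whose activity exceeds a threshold.

    `dense` maps a velocity tuple (vx, vy) to an activity. The threshold
    value is an assumption chosen for this example.
    """
    return {v: a for v, a in dense.items() if a > threshold}


def modulate(v1, mt, gain=10.0):
    """Excitatory feedback modulation (assumed form: a * (1 + gain * fb)).

    Feedback enhances V1 activities of matching velocity, but it cannot
    create activity where the V1 input is absent.
    """
    return {v: a * (1.0 + gain * mt.get(v, 0.0)) for v, a in v1.items()}


def normalize(acts, eps=0.01):
    """Divisive normalization over the stored hypotheses (assumed form)."""
    total = sum(acts.values())
    return {v: a / (eps + total) for v, a in acts.items()}


# Hypothetical activities at one location: two equally likely velocities
# plus a weak one that the sparse representation discards.
v1_dense = {(1, 0): 0.5, (0, 1): 0.5, (1, 1): 0.01}
mt_feedback = {(1, 0): 0.2, (-1, 0): 0.9}  # (-1, 0) has no V1 support

v1_sparse = sparsify(v1_dense)
v1_out = normalize(modulate(v1_sparse, mt_feedback))
```

In this example the feedback for velocity `(1, 0)` breaks the tie between the two stored hypotheses, while the strong feedback for `(-1, 0)` has no effect because no V1 activity exists for it. Memory scales with the number of stored hypotheses rather than with the full size of velocity space.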
Abstract. In this contribution, we extend existing methods for head pose estimation and investigate the use of local image phase for gaze detection. Moreover, we describe how a small database of face images with ground truth for head pose and gaze direction was acquired. With this database, we compare two different computational approaches for extracting the head pose. We demonstrate that a simple implementation of the proposed methods, without extensive training sessions or calibration, is sufficient to accurately detect the head pose for human-computer interaction. Furthermore, we propose how eye gaze can be extracted based on local filter responses and the detected head pose. In all, we present a framework in which different approaches are combined into a single system for extracting information about the attentional state of a person.