Human observers can track multiple objects among identical distractors based only on their spatiotemporal information. Since the first report of this ability in the seminal work of Pylyshyn and Storm (1988, Spatial Vision, 3, 179-197), multiple object tracking has attracted many researchers. One reason for this is the common argument that the attentional processes studied with the multiple object tracking paradigm closely resemble attentional processing during real-world tasks such as driving or team sports. We argue that multiple object tracking provides a good means to study the broader topic of continuous and dynamic visual attention. Indeed, several (partially contradictory) theories of attentive tracking have been proposed in the almost 30 years since its first report, and a large body of research has been conducted to test them. Given the richness and diversity of this literature, the aim of this tutorial review is to provide researchers who are new to the field with an overview of the multiple object tracking paradigm, its basic manipulations, and its links to other paradigms investigating visual attention and working memory. Further, we aim to review current theories of tracking as well as their empirical evidence. Finally, we review the state of the art in the most prominent research fields of multiple object tracking and how this research has helped to understand visual attention in dynamic settings.
Humans understand text and film by mentally representing their contents in situation models. These describe situations using dimensions like time, location, protagonist, and action. Changes in 1 or more dimensions (e.g., a new character enters the scene) cause discontinuities in the story line and are often perceived as boundaries between 2 meaningful units. Recent theoretical advances in event perception led to the assumption that situation models are represented in the form of event models in working memory. These event models are updated at event boundaries. Points in time at which event models are updated are important: Compared with situations during an ongoing event, situations at event boundaries are remembered more precisely and predictions about what happens next become less reliable. We hypothesized that these effects depend on the number of changes in the situation model. In 2 experiments, we had participants watch sitcom episodes and measured recognition memory and prediction performance for event boundaries that contained a change in 1, 2, 3, or 4 dimensions. Results showed a linear relationship: the more dimensions changed, the higher recognition performance was. At the same time, participants' predictions became less reliable with an increasing number of dimension changes. These results suggest that updating of event models at event boundaries occurs incrementally.
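The incremental-updating idea can be illustrated with a toy sketch. This is our own illustration only; the dictionary representation and the restriction to the four dimensions named above are assumptions, not part of the authors' model:

```python
# Toy sketch of incremental event-model updating. A "situation" is assumed
# to be a dict over the four dimensions named in the abstract.
DIMENSIONS = ("time", "location", "protagonist", "action")

def changed_dimensions(prev, curr):
    """List the dimensions whose values differ between two situations."""
    return [d for d in DIMENSIONS if prev.get(d) != curr.get(d)]

def update_event_model(model, curr):
    """Incremental updating: rewrite only the changed dimensions."""
    updated = dict(model)
    for d in changed_dimensions(model, curr):
        updated[d] = curr[d]
    return updated

before = {"time": "t1", "location": "kitchen", "protagonist": "A", "action": "cook"}
after_ = {"time": "t1", "location": "garden", "protagonist": "A", "action": "dig"}
print(changed_dimensions(before, after_))  # ['location', 'action']
```

On this caricature, the cost of an update (and the strength of the boundary) scales with the number of changed dimensions, which is the linear relationship the experiments tested.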
Observers can visually track multiple independently moving objects even if the scene containing them is rotated smoothly. Abrupt scene rotations make tracking more difficult but not impossible. For nonrotated, stable dynamic displays, the strategy of looking at the targets' centroid has been shown to be important for visual tracking. But which factors determine successful visual tracking in a nonstable dynamic display? We report two eye tracking experiments that provide evidence for centroid looking. Across abrupt viewpoint changes, gaze on the centroid is more stable than gaze on targets, indicating a process of realigning targets as a group. Further, we show that the relative importance of centroid looking increases with object speed.
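As a minimal illustration of why centroid looking supports realignment (a sketch under our own assumptions, not the authors' analysis code): the centroid is simply the mean target position, so a viewpoint change that displaces every target displaces the centroid by one common vector, making it a stable anchor for regrouping the targets.

```python
def centroid(targets):
    """Mean (x, y) position of the tracked targets."""
    xs, ys = zip(*targets)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

targets = [(100.0, 200.0), (300.0, 180.0), (200.0, 400.0)]
print(centroid(targets))  # (200.0, 260.0)
```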
We examined whether surface feature information is utilized to track the locations of multiple objects. In particular, we tested whether surface features and spatiotemporal information are weighted according to their availability and reliability. Accordingly, we hypothesized that surface features should affect location tracking across spatiotemporal discontinuities. Three kinds of spatiotemporal discontinuities were implemented across five experiments: abrupt scene rotations, abrupt zooms, and a reduced presentation frame rate. Objects were briefly colored across the spatiotemporal discontinuity. Distinct coloring that matched spatiotemporal information across the discontinuity improved tracking performance as compared with homogeneous coloring. Swapping distinct colors across the discontinuity impaired performance. Correspondence by color was further demonstrated by more mis-selected distractors appearing in a former target color than distractors appearing in a former distractor color in the swap condition. This was true even when color never supported tracking and when participants were instructed to ignore color. Furthermore, effects of object color on tracking occurred with unreliable spatiotemporal information but not with reliable spatiotemporal information. Our results demonstrate that surface feature information can be utilized to track the locations of multiple objects. This is in contrast to theories stating that objects are tracked based on spatiotemporal information only. We introduce a flexible-weighting tracking account stating that spatiotemporal information and surface features are both utilized by the location tracking mechanism. The two sources of information are weighted according to their availability and reliability. Surface feature effects on tracking are particularly likely when distinct surface feature information is available and spatiotemporal information is unreliable.
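The flexible-weighting account can be caricatured as reliability-weighted cue combination. The following sketch is our own illustration; the exponential similarity function and the numeric reliability weights are assumptions, not part of the authors' model:

```python
import math

def match_score(pos_dist, color_dist, st_reliability, sf_reliability):
    """Evidence that a candidate object is the continuation of a target,
    combining spatiotemporal proximity and surface-feature (color)
    similarity, each weighted by its reliability in [0, 1]."""
    st_sim = math.exp(-pos_dist)    # closer in space-time -> more similar
    sf_sim = math.exp(-color_dist)  # closer in color -> more similar
    return (st_reliability * st_sim + sf_reliability * sf_sim) / (
        st_reliability + sf_reliability)

# When spatiotemporal information is unreliable (weight near 0),
# correspondence is driven almost entirely by color -- which would
# produce the color-swap mis-selections reported above:
print(match_score(pos_dist=5.0, color_dist=0.0,
                  st_reliability=0.1, sf_reliability=0.9))
```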
Human long-term memory for visual objects and scenes is tremendous. Here, we test how auditory information contributes to long-term memory performance for realistic scenes. In a total of six experiments, we manipulated the presentation modality (auditory, visual, audio-visual) as well as the semantic congruency and temporal synchrony between the auditory and visual information of brief filmic clips. Our results show that audio-visual clips generally elicit more accurate memory performance than unimodal clips. This advantage increases further with congruent visual and auditory information. However, violations of audio-visual synchrony have hardly any influence on memory performance. Memory performance remained intact even with a sequential presentation of auditory and visual information, but declined when the matching tracks of one scene were presented separately with intervening tracks during learning. With respect to memory performance, our results therefore show that audio-visual integration is sensitive to semantic congruency but remarkably robust against asynchronies between the different modalities.
People can keep track of target objects as they move among identical distractors using only spatiotemporal information. We investigated whether participants use motion information during the moment-to-moment tracking of objects by adding motion to the texture of the moving objects. The texture either remained static or moved relative to each object's direction of motion: in the same direction, in the opposite direction, or orthogonally to the object's trajectory. Compared with the static texture condition, tracking performance was worse when the texture moved in the direction opposite to the object's motion and better when it moved in the same direction. Our results support the conclusion that motion information is used during the moment-to-moment tracking of objects. Motion information may either affect a representation of position or be used to periodically predict the future locations of targets.
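The prediction idea in the last sentence can be sketched as simple linear extrapolation. This is our own illustration, not the authors' model; the bias term is a hypothetical way texture motion could nudge the tracker's velocity estimate toward (congruent texture) or away from (incongruent texture) the true trajectory:

```python
def extrapolate(pos, vel, dt):
    """Predict an object's future location from its current position and
    an estimated velocity (linear extrapolation)."""
    return (pos[0] + vel[0] * dt, pos[1] + vel[1] * dt)

# Hypothetical bias: texture motion contaminates the velocity estimate.
def biased_velocity(obj_vel, texture_vel, bias=0.2):
    return (obj_vel[0] + bias * texture_vel[0],
            obj_vel[1] + bias * texture_vel[1])

print(extrapolate((0.0, 0.0), (10.0, 5.0), 0.5))  # (5.0, 2.5)
```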
The present study combined the approaches of multimedia learning and comparative visual search (Hardiess, Gillner, & Mallot, 2008) in order to analyse the processing of spatially separated information. Participants were asked to compare two depictions of a mechanical pendulum clock to detect no, one, or two differences between them. The spatial distance between the two depictions was varied, and participants received either stimulus-related information about the functionality of pendulum clocks or stimulus-unrelated information about the design of cuckoo clocks. The study demonstrates a trade-off between gaze movement and working memory use: we observed fewer gaze shifts with increasing distance between the pictures, suggesting higher working memory use. The findings indicate that the distance between the two pictures, domain knowledge, and visual working memory span are important factors determining the memory load required for processing split information sources.