Psycholinguistic studies of situated language processing have revealed that gaze in the visual environment is tightly coupled with both spoken language comprehension and production. It has also been established that interlocutors monitor the gaze of their partners, a phenomenon called "joint attention", as a further means for facilitating mutual understanding. We hypothesise that human-robot interaction will benefit when the robot's language-related gaze behaviour is similar to that of people, potentially providing the user with valuable non-verbal information concerning the robot's intended message or the robot's successful understanding. We report findings from two eye-tracking experiments demonstrating (1) that human gaze is modulated by both the robot's speech and its gaze, and (2) that human comprehension of robot speech is improved when the robot's real-time gaze behaviour is similar to that of humans.
A word’s predictability or surprisal, as determined by cloze probabilities or language models (Frank, 2013), is related to processing effort, in that less expected words take more effort to process (Hale, 2001; Lau et al., 2013). A word’s surprisal, however, may also be influenced by the non-linguistic context, such as visual cues: in the visual world paradigm (VWP), anticipatory eye movements suggest that listeners exploit the scene to predict what will be mentioned next (Altmann and Kamide, 1999). How visual context affects surprisal and processing effort, however, remains unclear. Here, we present a series of four studies providing evidence on how visually determined probabilistic expectations for a spoken target word, as indicated by anticipatory eye movements, predict graded processing effort for that word, as assessed by a pupillometric measure (the Index of Cognitive Activity, ICA). These findings are a clear and robust demonstration that the non-linguistic context can immediately influence both lexical expectations and surprisal-based processing effort.
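For reference (our gloss, not a formula taken from the study): surprisal is standardly defined as the negative log probability of a word given its context, and the visual-world manipulation can be read as additionally conditioning that probability on the scene,

    surprisal(w_i) = -log P(w_i | w_1, ..., w_{i-1}, scene),

so a scene that makes the target word more predictable lowers its surprisal and, by hypothesis, the processing effort indexed by the ICA.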
Referential gaze has been shown to benefit language processing in situated communication in terms of shifting visual attention and leading to shorter reaction times on subsequent tasks. The present study simultaneously assessed both visual attention and, importantly, the immediate cognitive load induced at different stages of sentence processing. We aimed to examine the dynamics of combining visual and linguistic information in creating anticipation for a specific object and the effect this has on language processing. We report evidence from three visual-world eye-tracking experiments, showing that referential gaze leads to a shift in visual attention toward the cued object, which consequently lowers the effort required for processing the linguistic reference. Importantly, perceiving and following the gaze cue did not prove costly in terms of cognitive effort, unless the cued object did not fit the verb's selectional preferences.
Recently, Ankener et al. (Frontiers in Psychology, 9, 2387, 2018) presented a visual world study which combined both attention and pupillary measures to demonstrate that anticipating a target results in lower effort to integrate that target (noun). However, they found no indication that the anticipatory processes themselves, i.e., the reduction of uncertainty about upcoming referents, results in processing effort (cf. Linzen and Jaeger, Cognitive Science, 40(6), 1382–1411, 2016). In contrast, Maess et al. (Frontiers in Human Neuroscience, 10, 1–11, 2016) found that more constraining verbs elicited a higher N400 amplitude than unconstraining verbs. The aim of the present study was therefore twofold: Firstly, we examined whether the graded ICA effect, which was previously found on the noun as a result of a likelihood manipulation, replicates in ERP measures. Secondly, we set out to investigate whether the processes leading to the generation of expectations (derived during verb and scene processing) induce an N400 modulation. Our results confirm that visual context is combined with the verb’s meaning to establish expectations about upcoming nouns and that these expectations affect the retrieval of the upcoming noun (modulated N400 on the noun). Importantly, however, we find no evidence for different costs in generating more or less specific expectations for upcoming nouns. Thus, the benefits of generating expectations are not associated with any costs in situated language comprehension.
We investigate the impact of listeners' gaze on predicting reference resolution in situated interactions. We extend an existing model that predicts to which entity in the environment listeners will resolve a referring expression (RE). Our model makes use of features that capture which objects were looked at and for how long, reflecting listeners' visual behavior. We improve a probabilistic model that considers a basic set of features for monitoring listeners' movements in a virtual environment. Particularly in complex referential scenes, where more objects near the target are possible referents, gaze turns out to be beneficial and helps decipher listeners' intentions. We evaluate performance at several prediction times before the listener performs an action, obtaining a highly significant accuracy gain.
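As an illustrative sketch only (the abstract does not specify the feature set or classifier; all names below, such as Fixation and gaze_features, are hypothetical), the gaze features could be per-candidate dwell times in a window before the prediction time, fed to a simple probabilistic classifier:

# Hypothetical Python sketch: gaze dwell-time features per candidate object,
# combined in a probabilistic classifier; not the authors' implementation.
from dataclasses import dataclass
import numpy as np
from sklearn.linear_model import LogisticRegression

@dataclass
class Fixation:
    object_id: str   # scene object that was fixated
    start: float     # seconds from trial onset
    end: float

def gaze_features(fixations, candidates, t_predict, window=2.0):
    """Dwell time and fixation count per candidate in [t_predict - window, t_predict]."""
    rows = []
    for obj in candidates:
        overlapping = [f for f in fixations
                       if f.object_id == obj
                       and f.end > t_predict - window and f.start < t_predict]
        dwell = sum(min(f.end, t_predict) - max(f.start, t_predict - window)
                    for f in overlapping)
        rows.append([dwell, len(overlapping)])
    return np.array(rows)

# Training data: one row per (trial, candidate object); label 1 if the listener
# actually resolved the referring expression to that candidate.
# X = np.vstack([gaze_features(trial.fixations, trial.candidates, t) for ...])
# y = ...
model = LogisticRegression()
# model.fit(X, y)
# Prediction: choose the candidate with the highest predicted probability.
# probs = model.predict_proba(gaze_features(fixations, candidates, t_predict))[:, 1]
# predicted = candidates[int(np.argmax(probs))]

Evaluating the model at several prediction times before the listener acts then simply amounts to recomputing the features with earlier values of t_predict.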