Summary
Successful lip-reading requires a mapping from visual to phonological information [1]. Recently, visual and motor cortices have been implicated in tracking lip movements (e.g., [2]). It remains unclear, however, whether visuo-phonological mapping occurs already at the level of the visual cortex, that is, whether this structure tracks the acoustic signal in a functionally relevant manner. To elucidate this, we investigated how the cortex tracks (i.e., entrains to) absent acoustic speech signals carried by silent lip movements. Crucially, we contrasted the entrainment to unheard forward (intelligible) and backward (unintelligible) acoustic speech. We observed that the visual cortex exhibited stronger entrainment to the unheard forward acoustic speech envelope compared to the unheard backward acoustic speech envelope. Supporting the notion of a visuo-phonological mapping process, this forward-backward difference of occipital entrainment was not present for actually observed lip movements. Importantly, the respective occipital region received more top-down input, especially from left premotor, primary motor, and somatosensory regions and, to a lesser extent, also from posterior temporal cortex. Strikingly, across participants, the extent of top-down modulation of the visual cortex stemming from these regions partially correlated with the strength of entrainment to the absent acoustic forward speech envelope, but not to present forward lip movements. Our findings demonstrate that a distributed cortical network, including key dorsal stream auditory regions [3, 4, 5], influences how the visual cortex shows sensitivity to the intelligibility of speech while tracking silent lip movements.
To efficiently perceive and respond to the external environment, our brain has to perceptually integrate or segregate stimuli of different modalities. The temporal relationship between the different sensory modalities is therefore essential for the formation of different multisensory percepts. In this magnetoencephalography study, we created a paradigm where an audio and a tactile stimulus were presented with an ambiguous temporal relationship so that perception of physically identical audiotactile stimuli could vary between integrated (emanating from the same source) and segregated. This bistable paradigm allowed us to compare identical bimodal stimuli that elicited different percepts, providing a possibility to directly infer multisensory interaction effects. Local differences in alpha power over bilateral inferior parietal lobules (IPLs) and superior parietal lobules (SPLs) preceded integrated versus segregated percepts of the two stimuli (audio and tactile). Furthermore, differences in long-range cortical functional connectivity seeded in rIPL (region of maximum difference) revealed differential patterns that predisposed integrated or segregated percepts, encompassing secondary areas of all different modalities and prefrontal cortex. We showed that prestimulus brain states predispose the perception of the audiotactile stimulus in both a global and a local manner. Our findings are in line with a recent, consistent body of findings on the importance of prestimulus brain states for perception of an upcoming stimulus. This new perspective on how stimuli originating from different modalities are integrated suggests a non-modality-specific network predisposing multisensory perception.
Categories describe semantic divisions between classes of objects, and category-based models are widely used for investigation of the conceptual system. One critical issue in this endeavour is the isolation of conceptual from perceptual contributions to category differences. An unambiguous way to address this confound is combining multiple input modalities. To this end, we showed participants person/place stimuli using name and picture modalities. Using multivariate methods, we searched for category-sensitive neural patterns shared across input modalities and thus independent from perceptual properties. The millisecond temporal resolution of magnetoencephalography (MEG) allowed us to consider the precise timing of conceptual access and, by comparing latencies between the two modalities (“time generalization”), how latencies of processing depend on the input modality. Our results identified category-sensitive conceptual representations common between modalities at three stages and showed that conceptual access for words was delayed by about 90 msec with respect to pictures. We also show that for pictures, the first conceptual pattern of activity (shared between both words and pictures) occurs as early as 110 msec. Collectively, our results indicated that conceptual access at the category level is a multistage process and that different delays in access across these two input modalities determine when these representations are activated.
Since state-of-the-art approaches to offensive language detection rely on supervised learning, it is crucial to quickly adapt them to the continuously evolving scenario of social media. While several approaches have been proposed to tackle the problem from an algorithmic perspective, so as to reduce the need for annotated data, less attention has been paid to the quality of these data. Following a trend that has emerged recently, we focus on the level of agreement among annotators while selecting data to create offensive language datasets, a task involving a high level of subjectivity. Our study comprises the creation of three novel datasets of English tweets covering different topics and having five crowd-sourced judgments each. We also present an extensive set of experiments showing that selecting training and test data according to different levels of annotators' agreement has a strong effect on classifier performance and robustness. Our findings are further validated in cross-domain experiments and studied using a popular benchmark dataset. We show that such hard cases, where low agreement is present, are not necessarily due to poor-quality annotation, and we advocate for a higher presence of ambiguous cases in future datasets, particularly in test sets, to better account for the different points of view expressed online.