Through theoretical discussion, literature review, and a computational model, this paper poses a challenge to the notion that perspective-taking involves a fixed architecture in which particular processes have priority. For example, some research suggests that egocentric perspectives can arise more quickly, with other perspectives (such as those of task partners) emerging only secondarily. This theoretical dichotomy, between fast egocentric and slow other-centric processes, is challenged here. We propose a general view of perspective-taking as an emergent phenomenon governed by the interplay among cognitive mechanisms that accumulate information at different timescales. We first describe the pervasive relevance of perspective-taking to cognitive science. A dynamical systems model is then introduced that explicitly formulates the proposed timescale interaction. This model illustrates that, rather than having a rigid time course, perspective-taking can be fast or slow depending on factors such as task context. Implications are discussed, with ideas for future empirical research.
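The claim that perspective-taking can be fast or slow depending on context can be illustrated with a toy simulation. The sketch below is not the model from the paper: it assumes two leaky accumulators obeying dx/dt = (drive - x) / tau, and the function name, time constants, drive values, and threshold are all hypothetical choices made only for illustration.

```python
def time_to_threshold(drive, tau, threshold=0.5, dt=0.001, t_max=10.0):
    """Euler-integrate dx/dt = (drive - x) / tau from x = 0.

    Returns the first time at which x reaches `threshold`,
    or None if that does not happen before `t_max`.
    All parameters are illustrative assumptions, not fitted values.
    """
    x, t = 0.0, 0.0
    while t < t_max:
        x += dt * (drive - x) / tau
        t += dt
        if x >= threshold:
            return t
    return None


# Hypothetical time constants: the egocentric process accumulates quickly,
# the other-centric process accumulates slowly.
TAU_EGO, TAU_OTHER = 0.2, 1.0

# Neutral context: equal drive to both processes, so the fast one wins.
ego_neutral = time_to_threshold(drive=1.0, tau=TAU_EGO)
other_neutral = time_to_threshold(drive=1.0, tau=TAU_OTHER)

# Context that strongly cues the partner's perspective: the slow process
# receives a much larger drive and reaches threshold first.
ego_context = time_to_threshold(drive=0.6, tau=TAU_EGO)
other_context = time_to_threshold(drive=4.0, tau=TAU_OTHER)
```

Under these assumed settings the egocentric accumulator crosses threshold first in the neutral case, while the other-centric accumulator crosses first when its drive is large, so the ordering is not fixed by the time constants alone.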
When people interact, aspects of their speech and language patterns often converge, whether the interaction involves one language or several. Most studies of speech convergence in conversations have examined monolingual interactions, whereas most studies of bilingual speech convergence have examined spoken responses to prompts. However, it is not uncommon in multilingual communities to converse in two languages, where each speaker primarily produces only one of the two languages. The present study examined complexity matching and lexical matching as two measures of speech convergence in conversations spoken in English, Spanish, or both languages. Complexity matching measured convergence in the hierarchical timing of speech, and lexical matching measured convergence in the frequency distributions of lemmas produced. Both types of matching were found equally in all three language conditions. Taken together, the results indicate that convergence is robust across monolingual and bilingual interactions because it stems from basic mechanisms of coordination and communication.
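To make the idea of comparing lemma frequency distributions concrete, here is a minimal sketch. The abstract does not specify the statistic used for lexical matching, so cosine similarity between lemma count vectors is used here purely as a stand-in; the function name and the toy lemma lists are hypothetical.

```python
import math
from collections import Counter


def lemma_cosine_similarity(lemmas_a, lemmas_b):
    """Cosine similarity between two speakers' lemma frequency counts.

    This is an illustrative stand-in, not the measure used in the study.
    Returns 0.0 if either speaker produced no lemmas.
    """
    counts_a, counts_b = Counter(lemmas_a), Counter(lemmas_b)
    dot = sum(counts_a[lemma] * counts_b[lemma] for lemma in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy, already-lemmatized utterances (hypothetical data).
speaker_a = ["the", "dog", "run", "the"]
speaker_b = ["the", "cat", "run"]
similarity = lemma_cosine_similarity(speaker_a, speaker_b)
```

A value near 1 would indicate that the two speakers use lemmas in similar proportions, and a value of 0 that they share no lemmas at all.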
Classifying EEG responses to naturalistic acoustic stimuli is of theoretical and practical importance, but standard approaches are limited by processing individual channels separately on very short sound segments (a few seconds or less). Recent developments have shown classification for music stimuli (∼2 min) by extracting spectral components from EEG and using convolutional neural networks (CNNs). This paper proposes an efficient method to map raw EEG signals to the individual songs being listened to, for end-to-end classification. EEG channels are treated as a dimension of a [Channel × Sample] image tile, and images are classified using CNNs. Our experimental results (88.7%) compete with state-of-the-art methods (85.0%), yet our classification task is more challenging because it involves longer stimuli that were similar to each other in perceptual quality and were unfamiliar to participants. We also adopt a transfer learning scheme using a pre-trained ResNet-50, confirming the effectiveness of transfer learning despite the image domains being unrelated to each other.
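The [Channel × Sample] tiling step can be sketched as follows. This is a minimal illustration in plain Python, assuming the recording is held as a list of equal-length channels; the function name, tile width, and stride are hypothetical, and the abstract does not state the tile sizes or any preprocessing actually used.

```python
def make_tiles(eeg, tile_width, stride=None):
    """Cut a multichannel recording into [channel][sample] tiles.

    eeg        : list of channels, each a list of samples of equal length
    tile_width : number of samples per tile (assumed value, not from the paper)
    stride     : hop between tile starts; defaults to non-overlapping tiles
    """
    if stride is None:
        stride = tile_width
    n_samples = len(eeg[0])
    if any(len(channel) != n_samples for channel in eeg):
        raise ValueError("all channels must have the same number of samples")
    tiles = []
    # Only full-width tiles are kept; a trailing partial tile is dropped.
    for start in range(0, n_samples - tile_width + 1, stride):
        tiles.append([channel[start:start + tile_width] for channel in eeg])
    return tiles


# Toy recording: 4 channels x 10 samples (hypothetical values).
toy_eeg = [[ch * 100 + s for s in range(10)] for ch in range(4)]
non_overlapping = make_tiles(toy_eeg, tile_width=4)
overlapping = make_tiles(toy_eeg, tile_width=4, stride=2)
```

Each tile keeps every channel as a row, so a 2D CNN sees all channels jointly rather than one channel at a time.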
Information retrieval from brain responses to auditory and visual stimuli has shown success through classification of song names and image classes presented to participants while recording EEG signals. Information retrieval in the form of reconstructing auditory stimuli has also shown some success, but here we improve on previous methods by reconstructing music stimuli well enough to be perceived and identified independently. Furthermore, deep learning models were trained on the time-aligned music stimulus spectrum for each corresponding one-second window of EEG recording, which greatly reduces the feature extraction steps needed compared to prior studies. The NMED-Tempo and NMED-Hindi datasets of participants passively listening to full-length songs were used to train and validate Convolutional Neural Network (CNN) regressors. The efficacy of raw voltage versus power spectrum inputs and of linear versus mel spectrogram outputs was tested, and all inputs and outputs were converted into 2D images. The quality of reconstructed spectrograms was assessed by training classifiers, which showed 81% accuracy for mel spectrograms and 72% for linear spectrograms (10% chance accuracy). Lastly, reconstructions of auditory music stimuli were discriminated by listeners at an 85% success rate (50% chance) in a two-alternative match-to-sample task.
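Pairing each one-second EEG window with the time-aligned portion of the stimulus spectrogram comes down to index bookkeeping, sketched below. The function name and the sampling and frame rates in the usage lines are hypothetical placeholders; the abstract gives only the one-second window length.

```python
def aligned_windows(n_eeg_samples, eeg_rate, n_spec_frames, spec_rate, window_s=1.0):
    """Return ((eeg_start, eeg_stop), (spec_start, spec_stop)) index pairs.

    Each pair covers the same `window_s` seconds in the EEG recording and in
    the stimulus spectrogram. Rates are in samples (or frames) per second.
    Only windows fully contained in both signals are returned.
    """
    eeg_len = int(round(eeg_rate * window_s))
    spec_len = int(round(spec_rate * window_s))
    n_windows = min(n_eeg_samples // eeg_len, n_spec_frames // spec_len)
    return [
        ((i * eeg_len, (i + 1) * eeg_len), (i * spec_len, (i + 1) * spec_len))
        for i in range(n_windows)
    ]


# Hypothetical rates: 125 Hz EEG and 100 spectrogram frames per second,
# over a 10-second excerpt.
pairs = aligned_windows(n_eeg_samples=1250, eeg_rate=125,
                        n_spec_frames=1000, spec_rate=100)
```

Each index pair can then be used to slice one regressor input (the EEG window) and its target (the matching spectrogram segment).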