In this article we report a computational semantic analysis of presidential candidates' speeches in the two major political parties in the USA. In Study One, we modeled political semantic spaces as a function of party, candidate, and election year; the findings revealed systematic differences in the semantic representation of key political concepts, and a shifting landscape in which presidential candidates align or misalign with their parties in how politically central concepts are represented and organized. Our models further showed that the 2016 US presidential nominees had conceptual representations distinct from those of previous election years, and that these patterns did not necessarily align with their parties' average representations of key political concepts. In Study Two, structural equation modeling demonstrated that voters' reported political engagement differentially predicted their reported likelihood of voting for Clinton versus Trump in the 2016 presidential election. Study Three showed that Republicans and Democrats produced distinct, systematic word-association patterns for the same concepts/terms, patterns that could be reliably distinguished using machine learning methods. Together, these studies suggest that given an individual's political beliefs, we can make reliable predictions about how they understand words, and given how an individual understands those words, we can predict their political beliefs. Our study bridges semantic space models and abstract representations of political concepts on the one hand, and representations of political concepts and citizens' voting behavior on the other.
How do students gain scientific knowledge while reading expository text? This study examines the underlying neurocognitive basis of scientific text comprehension, including how text characteristics (e.g., the optimality of the textual knowledge structure) and reader characteristics (e.g., cognitive differences and reading habits) influence comprehension outcomes. By combining fixation-related fMRI and multiband data acquisition, the study is among the first to examine self-paced naturalistic reading inside the MRI scanner. Our results revealed the neurocognitive patterns associated with information integration at different time scales during text reading, along with significant individual differences driven by the interaction between text characteristics (e.g., optimality of the textual knowledge structure) and reader characteristics (e.g., electronic device use habits). These individual differences affected the amount of neural resources deployed for multitasking and for integrating information into the scientific mental models constructed from the text. Our findings have significant implications for understanding science reading in a population that is increasingly dependent on electronic devices.
There has been a recent boom in research relating semantic space computational models to fMRI data, in an effort to better understand how the brain represents semantic information. In the first study reported here, we expanded on a previous study to examine how different semantic space models and modeling parameters affect the abilities of these computational models to predict brain activation in a data-driven set of 500 selected voxels. The findings suggest that these computational models may contain distinct types of semantic information that relate to different brain areas in different ways. On the basis of these findings, in a second study we conducted an additional exploratory analysis of theoretically motivated brain regions in the language network. We demonstrated that data-driven computational models can be successfully integrated into theoretical frameworks to inform and test theories of semantic representation and processing. The findings from our work are discussed in light of future directions for neuroimaging and computational research.

Keywords: LSA · HAL · semantic space models · coarse semantic coding · fMRI

Latent semantic analysis (LSA; Landauer & Dumais, 1997) and the hyperspace analogue to language (HAL; Lund & Burgess, 1996) are among the most influential computational models of word meaning. LSA and HAL, among other so-called "semantic space models" or "distributional semantic models," use word co-occurrence frequencies as the basic building blocks for word meaning (see Jones, Willits, & Dennis, 2015, for a recent review). In these models, the co-occurrence frequencies of a word with all documents (as in LSA) or with all other words with which it occurs (as in HAL) are used to build the vector representation for that word, typically based on a very large-scale text corpus.
The resulting representation of any target word is a high-dimensional vector, with each dimension denoting either a word (word-to-word matrix) or a document (word-to-document matrix). The raw vectors may consist of thousands or tens of thousands of dimensions and are usually very sparse, so dimension-reduction methods are often used to reduce the number of dimensions in these models. These standard methods used by LSA and HAL have since been further developed or expanded. For example, probabilistic LSA (Hofmann, 2001) and its fully Bayesian extension, the topic model (Griffiths, Steyvers, & Tenenbaum, 2007), can identify lexemes with multiple senses (Tomar et al., 2013) and generate semantic representations as probability distributions rather than as points in a high-dimensional space. Positive pointwise mutual information (PPMI) has been used in place of raw co-occurrence frequencies (Bullinaria & Levy, 2007). Zhao, Li, and Kohonen (2011) integrated these models into a self-organizing-map framework, and Fyshe, Talukdar, Murphy, and Mitchell (2013) discussed how different types of constraints on what counts as a co-occurrence qualitatively affect semantic information.

Evaluation of semantic space models

Computational models...
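The pipeline described above — counting co-occurrences in a sliding window (HAL-style), reweighting with PPMI, and reducing dimensionality with a truncated SVD (as in LSA) — can be sketched as follows. This is a minimal NumPy illustration; the toy corpus, window size, and function names are ours, not taken from the original models:

```python
import numpy as np

def cooccurrence_matrix(corpus, window=2):
    """HAL-style word-by-word co-occurrence counts within a symmetric window."""
    vocab = sorted({w for sent in corpus for w in sent})
    idx = {w: i for i, w in enumerate(vocab)}
    M = np.zeros((len(vocab), len(vocab)))
    for sent in corpus:
        for i, w in enumerate(sent):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if j != i:
                    M[idx[w], idx[sent[j]]] += 1.0
    return M, vocab

def ppmi(M):
    """Positive pointwise mutual information weighting (Bullinaria & Levy, 2007)."""
    total = M.sum()
    joint = M / total                                            # P(w, c)
    expected = (M.sum(1, keepdims=True) / total) * (M.sum(0, keepdims=True) / total)
    with np.errstate(divide="ignore"):
        pmi = np.log(joint / expected)                           # log(0) -> -inf for unseen pairs
    return np.maximum(pmi, 0.0)                                  # clamp negatives (and -inf) to 0

def svd_reduce(M, k):
    """Truncated SVD: keep the k strongest latent dimensions (as in LSA)."""
    U, S, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :k] * S[:k]

# Toy corpus; any real semantic space model would use a very large text collection.
corpus = [s.split() for s in [
    "the cat chased the mouse",
    "the dog chased the cat",
    "the mouse ate the cheese",
    "the dog ate the bone",
]]
M, vocab = cooccurrence_matrix(corpus)
vectors = svd_reduce(ppmi(M), k=2)   # one dense 2-d vector per vocabulary word
```

The raw matrix `M` here is symmetric and mostly zeros, mirroring the sparsity noted above; PPMI zeroes out pairs that co-occur no more than chance would predict, and the SVD step compresses the remaining structure into a small number of latent dimensions.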
Summaries generated from medical conversations can improve recall and understanding of care plans for patients and reduce documentation burden for doctors. Recent advancements in automatic speech recognition (ASR) and natural language understanding (NLU) offer potential solutions to generate these summaries automatically, but rigorous quantitative baselines for benchmarking research in this domain are lacking. In this paper, we bridge this gap for two tasks: classifying utterances from medical conversations according to (i) the SOAP section and (ii) the speaker role. Both are fundamental building blocks along the path towards an end-to-end, automated SOAP note for medical conversations. We provide details on a dataset that contains human and ASR transcriptions of medical conversations and corresponding SOAP notes optimized for machine learning. We then present a systematic analysis in which we adapt an existing deep learning architecture to the two aforementioned tasks. The results suggest that modeling context in a hierarchical manner, capturing both word- and utterance-level context, yields substantial improvements on both classification tasks. Additionally, we develop and analyze a modular method for adapting our model to ASR output.
Despite diverse efforts to mine various modalities of medical data, the conversations between physicians and patients at the time of care remain an untapped source of insights. In this paper, we leverage this data to extract structured information that might assist physicians with post-visit documentation in electronic health records, potentially lightening the clerical burden. In this exploratory study, we describe a new dataset consisting of conversation transcripts, post-visit summaries, corresponding supporting evidence (in the transcript), and structured labels. We focus on the tasks of recognizing relevant diagnoses and abnormalities in the review of organ systems (RoS). One methodological challenge is that the conversations are long (around 1500 words), making it difficult for modern deep-learning models to use them as input. To address this challenge, we extract noteworthy utterances: parts of the conversation likely to be cited as evidence supporting some summary sentence. We find that by first filtering for (predicted) noteworthy utterances, we can significantly boost predictive performance for recognizing both diagnoses and RoS abnormalities.
An amendment to this paper has been published and can be accessed via a link at the top of the paper.