Changes in some lexical features of language have been associated with the onset and progression of Alzheimer's disease. Here we describe a method to extract key features from discourse transcripts, which we evaluated on non-scripted news conferences from President Ronald Reagan, who was diagnosed with Alzheimer's disease in 1994, and President George Herbert Walker Bush, who has no known diagnosis of Alzheimer's disease. Key word counts previously associated with cognitive decline in Alzheimer's disease were extracted and regression analyses were conducted. President Reagan showed a significant reduction in the number of unique words over time and a significant increase in conversational fillers and non-specific nouns over time. There was no significant trend in these features for President Bush.
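The feature extraction and trend analysis described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the word lists `FILLERS` and `NONSPECIFIC_NOUNS`, the tokenizer, and the helper names `lexical_features` and `ols_slope` are all assumptions made for illustration, and the slope is an ordinary least-squares estimate with no significance test.

```python
import re

# Hypothetical word lists for illustration only; the lists used in the
# study are not given in the abstract.
FILLERS = {"well", "um", "uh", "so", "basically", "actually"}
NONSPECIFIC_NOUNS = {"thing", "things", "something", "anything", "stuff"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexical_features(text):
    """Count total tokens, unique words, fillers, and non-specific nouns."""
    tokens = tokenize(text)
    return {
        "tokens": len(tokens),
        "unique_words": len(set(tokens)),
        "fillers": sum(t in FILLERS for t in tokens),
        "nonspecific_nouns": sum(t in NONSPECIFIC_NOUNS for t in tokens),
    }


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


features = lexical_features("Well, the thing is, um, the thing works.")
# Toy trend: a feature value measured at three successive time points.
slope = ols_slope([0, 1, 2], [10, 8, 6])
```

In practice each transcript would yield one feature row, and the slope of each feature against transcript date would indicate the direction of change over time. Counts would likely need normalizing by transcript length before regression.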
Reductions in spoken language complexity have been associated with the onset of various neurological disorders. The objective of this study is to analyze whether similar trends are found in professional football players who are at risk for chronic traumatic encephalopathy. We compare changes in linguistic complexity (as indexed by the type-to-token ratio and lexical density) measured from the interview transcripts of players in the National Football League (NFL) to those measured from interview transcripts of coaches and/or front-office NFL executives who have never played professional football. A multilevel mixed model analysis reveals that exposure to the high-impact sport (vs no exposure) was associated with an overall decline in language complexity scores over time. This trend persists even after controlling for age as a potential confound. The results set the stage for a prospective study to test the hypothesis that language complexity decline is a harbinger of chronic traumatic encephalopathy.
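The two complexity indices named in this abstract have simple standard definitions: the type-to-token ratio is the number of distinct words divided by the total number of words, and lexical density is the proportion of words that are content words. The sketch below assumes a small hand-written `FUNCTION_WORDS` list to separate content words from function words; that list, the tokenizer, and the function names are illustrative assumptions, not the study's implementation, which would more plausibly rely on part-of-speech tagging.

```python
import re

# Hypothetical, deliberately short function-word list for illustration.
FUNCTION_WORDS = {
    "the", "a", "an", "and", "or", "but", "of", "to", "in", "on", "at",
    "is", "are", "was", "were", "i", "you", "he", "she", "it", "we",
    "they", "that", "this",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(tokens):
    """Distinct word types divided by total word tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def lexical_density(tokens):
    """Share of tokens that are content words (not in FUNCTION_WORDS)."""
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)


tokens = tokenize("We ran the ball and we ran it well")
ttr = type_token_ratio(tokens)
density = lexical_density(tokens)
```

The type-to-token ratio is sensitive to transcript length, so comparisons across interviews are more meaningful when computed on samples of similar size.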
Purpose
Subjective speech intelligibility assessment is often preferred over more objective approaches that rely on transcript scoring. This is, in part, because of the intensive manual labor associated with extracting objective metrics from transcribed speech. In this study, we propose an automated approach for scoring transcripts that provides a holistic and objective representation of intelligibility degradation stemming from both segmental and suprasegmental contributions, and that corresponds with human perception.
Method
Phrases produced by 73 speakers with dysarthria were orthographically transcribed by 819 listeners via Mechanical Turk, resulting in 63,840 phrase transcriptions. A protocol was developed to filter the transcripts, which were then automatically analyzed using novel algorithms developed for measuring phoneme and lexical segmentation errors. The results were compared with manual labels on a randomly selected sample set of 40 transcribed phrases to assess validity. A linear regression analysis was conducted to examine how well the automated metrics predict a perceptual rating of severity and word accuracy.
Results
On the sample set, the automated metrics achieved correlation coefficients of 0.90 with manual labels in measuring phoneme errors, and 100% accuracy in identifying and coding lexical segmentation errors. Linear regression models found that the estimated metrics could predict a significant portion of the variance in perceptual severity and word accuracy.
Conclusions
The results show the promising development of an objective speech intelligibility assessment that identifies intelligibility degradation on multiple levels of analysis.
In English, the predominance of stressed syllables as word onsets aids lexical segmentation in degraded listening conditions. Yet it is unlikely that these findings would readily transfer to languages with differing rhythmic structure. In the current study, the authors seek to examine whether listeners exploit both common word size (syllable number) and stress cues to aid lexical segmentation in Spanish. Forty-seven Spanish-speaking listeners transcribed two-word Spanish phrases in noise. As predicted by the statistical probabilities of Spanish, error analysis revealed that listeners preferred two- and three-syllable words with penultimate stress in their attempts to parse the degraded speech signal. These findings provide insight into the importance of stress in tandem with word size in the segmentation of Spanish words and suggest testable hypotheses for cross-linguistic studies that examine the effects of degraded acoustic cues on lexical segmentation.