This paper reconsiders the diphone-based word segmentation model of Cairns, Shillcock, Chater, and Levy (1997) and Hockema (2006), previously thought to be unlearnable. A statistically principled learning model is developed using Bayes' theorem and reasonable assumptions about infants' implicit knowledge. The ability to recover phrase-medial word boundaries is tested using phonetic corpora derived from spontaneous interactions with children and adults. The (unsupervised and semisupervised) learning models are shown to exhibit several crucial properties. First, only a small amount of language exposure is required to achieve the model's ceiling performance, equivalent to between 1 day and 1 month of caregiver input. Second, the models are robust to variation, both in the free parameter and the input representation. Finally, both the learning and baseline models exhibit undersegmentation, argued to have significant ramifications for speech processing as a whole.
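The role of Bayes' theorem in a diphone-based segmenter can be illustrated with a minimal sketch. It is not the paper's learning model: the probabilities here are estimated from a toy corpus whose word boundaries are already marked (closer to a supervised baseline), each character stands in for one phone, and the function names and the `threshold` value are hypothetical choices made for illustration.

```python
from collections import Counter


def count_diphones(phrases):
    """Count all phrase-internal diphones and those spanning a word boundary.

    `phrases` is a list of phrases; each phrase is a list of words, and each
    word is a sequence of phone symbols (toy data: one character per phone).
    """
    diphones, spanning = Counter(), Counter()
    for words in phrases:
        phones, starts = [], set()
        for word in words:
            if phones:
                starts.add(len(phones))  # index of a word-initial phone
            phones.extend(word)
        for i in range(1, len(phones)):
            pair = (phones[i - 1], phones[i])
            diphones[pair] += 1
            if i in starts:
                spanning[pair] += 1
    return diphones, spanning


def p_boundary(pair, diphones, spanning):
    """Bayes' theorem: P(# | xy) = P(xy | #) * P(#) / P(xy)."""
    total = sum(diphones.values())
    total_spanning = sum(spanning.values())
    if diphones[pair] == 0 or total_spanning == 0:
        return 0.0  # unseen diphone: treated as no evidence for a boundary
    likelihood = spanning[pair] / total_spanning  # P(xy | #)
    prior = total_spanning / total                # P(#)
    evidence = diphones[pair] / total             # P(xy)
    return likelihood * prior / evidence


def segment(phones, diphones, spanning, threshold=0.5):
    """Insert a boundary wherever P(# | xy) exceeds the (assumed) threshold."""
    if not phones:
        return []
    words, current = [], [phones[0]]
    for prev, nxt in zip(phones, phones[1:]):
        if p_boundary((prev, nxt), diphones, spanning) > threshold:
            words.append("".join(current))
            current = []
        current.append(nxt)
    words.append("".join(current))
    return words


# Toy corpus with known boundaries (an assumption; not the paper's data).
corpus = [["the", "dog"], ["a", "dog"], ["the", "cat"]]
diphones, spanning = count_diphones(corpus)
print(segment("thedog", diphones, spanning))  # ['the', 'dog']
```

In an unsupervised setting the boundary-spanning counts are not observable, so the likelihood term would have to be estimated from other information available to the learner; the sketch only shows how the posterior is assembled and thresholded once those terms are in hand.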
Phonological grammars characterize distinctions between relatively well-formed (unmarked) and relatively ill-formed (marked) phonological structures. We review evidence that markedness influences speech error probabilities. Specifically, although errors result in both unmarked and marked structures, there is a markedness asymmetry: errors are more likely to produce unmarked outcomes. We show that stochastic disruption to the computational mechanisms realizing a Harmonic Grammar (HG) can account for the broad empirical patterns of speech errors. We demonstrate that our proposal can account for the general markedness asymmetry. We also develop methods for linking particular HG proposals to speech error distributions, and illustrate these methods using a simple HG and a set of initial consonant errors in English.
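One way to see how stochastic disruption of a Harmonic Grammar could yield a markedness asymmetry is the toy simulation below. Everything in it is an assumption made for illustration: the two constraints, their weights, the candidate sets, and the choice to model disruption as independent Gaussian noise added to each candidate's harmony. The abstract does not specify the disruption mechanism, so this should not be read as the authors' implementation.

```python
import random

# Hypothetical constraint weights: faithfulness outweighs markedness.
WEIGHTS = {"Faith": 3.0, "*Marked": 1.0}


def harmony(violations, weights=WEIGHTS):
    """Harmony is the negative weighted sum of constraint violations."""
    return -sum(weights[name] * count for name, count in violations.items())


def produce(candidates, noise_sd, rng):
    """Return the candidate with the highest noise-perturbed harmony."""
    noisy = {
        output: harmony(violations) + rng.gauss(0.0, noise_sd)
        for output, violations in candidates.items()
    }
    return max(noisy, key=noisy.get)


# Candidate outputs and their violation profiles for two kinds of target.
MARKED_TARGET = {"marked": {"*Marked": 1}, "unmarked": {"Faith": 1}}
UNMARKED_TARGET = {"unmarked": {}, "marked": {"Faith": 1, "*Marked": 1}}


def error_rate(candidates, faithful, noise_sd=2.0, trials=5000, seed=0):
    """Proportion of trials in which the unfaithful candidate wins."""
    rng = random.Random(seed)
    errors = sum(
        produce(candidates, noise_sd, rng) != faithful for _ in range(trials)
    )
    return errors / trials


# Errors on marked targets yield unmarked outputs, and vice versa.
to_unmarked = error_rate(MARKED_TARGET, "marked")
to_marked = error_rate(UNMARKED_TARGET, "unmarked")
print(to_unmarked > to_marked)
```

Under these assumed weights, the harmony gap between the faithful form and the error is smaller for a marked target (2.0) than for an unmarked one (4.0), so the same amount of noise flips the outcome more often toward the unmarked form.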
It is likely that generalization of implicitly learned sound patterns to novel words and sounds is structured by a similarity metric, but how may this metric best be captured? We report on an experiment where participants were exposed to an artificial phonology, and frequency ratings were used to probe implicit abstraction of onset statistics. Non-words bearing an onset that was presented during initial exposure were subsequently rated most frequent, indicating that participants generalized onset statistics to new non-words. Participants also rated non-words with untrained onsets as somewhat frequent, indicating generalization to onsets that had not been used during the exposure phase. While generalization could be accounted for in terms of featural distance, it was insensitive to natural class structure. Generalization to untrained sounds was predicted better by models requiring prior linguistic knowledge (either traditional distinctive features or articulatory phonetic information) than by a model based on a linguistically naïve measure of acoustic similarity.
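As a rough illustration of what a featural-distance metric looks like, the sketch below counts the binary features on which two consonants differ and picks the trained onset closest to an untrained one. The four-feature table is a deliberately small, hypothetical feature set, and the helper names are invented here; the study's actual feature systems and models are not reproduced.

```python
# Toy binary feature table (hypothetical, deliberately minimal).
FEATURES = {
    "p": {"voice": 0, "labial": 1, "coronal": 0, "continuant": 0},
    "b": {"voice": 1, "labial": 1, "coronal": 0, "continuant": 0},
    "t": {"voice": 0, "labial": 0, "coronal": 1, "continuant": 0},
    "d": {"voice": 1, "labial": 0, "coronal": 1, "continuant": 0},
    "s": {"voice": 0, "labial": 0, "coronal": 1, "continuant": 1},
    "z": {"voice": 1, "labial": 0, "coronal": 1, "continuant": 1},
}


def featural_distance(a, b):
    """Number of features on which segments `a` and `b` disagree."""
    return sum(FEATURES[a][f] != FEATURES[b][f] for f in FEATURES[a])


def nearest_trained(onset, trained):
    """Trained onset with the smallest featural distance to `onset`."""
    return min(trained, key=lambda t: featural_distance(onset, t))


print(featural_distance("p", "b"))      # 1 (voicing only)
print(nearest_trained("d", ["p", "t"]))  # t
```

A distance of this kind is gradient over individual features and makes no reference to whether two sounds fall into the same natural class, which is one way to picture generalization that tracks featural distance while being insensitive to natural class structure.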
HIGHLIGHTS
• A specific deficit underlies the linguistic impairment of patients with striatal lesions
• Striatal lesions cause deficit in selection processes not in grammatical evaluation
• Striatum selects linguistic alternatives computed in cortical language areas
• Atrophy in dorsal striatum correlates with selection deficit in Huntington disease