Speech scientists have long known that speech perfection is a fiction. Everyday speech is complex, variable, and, ultimately, elusive. Although attempts have been made to test spoken-word recognition models with everyday speech (e.g., Bard, Sotillo, Kelly, & Aylett, 2001; Kemps, Ernestus, Schreuder, & Baayen, 2004; McAllister, 1991; Mehta & Cutler, 1988), such studies are in the minority. There are two main reasons for this. First, studying conversational speech, by definition, limits the amount of experimental control one has over the input. Studying laboratory speech maximizes such control, for better or for worse. Second, attempting to define the concept of conversational speech itself opens a can of worms, because speech styles vary along quasi-orthogonal dimensions, such as spontaneity (e.g., read, rehearsed, unscripted), articulatory effort (e.g., hyperarticulated, hypoarticulated), situation (e.g., monologue, dialogue), source (e.g., healthy speaker, dysarthric speaker), and so forth. These speech styles differ in the extent to which they preserve the quantity and quality of acoustic cues available in carefully articulated laboratory speech, showing, for instance, varying degrees of vowel reduction and phoneme elision/assimilation (cf. Duez, 1995; Hawkins, 2003; Hawkins & Smith, 2001; Tiffany, 1959; Uchanski, 2005), which makes the selection of stimuli for perceptual experiments rather arbitrary. For these reasons, the investigation of spoken-word recognition has traditionally relied on either carefully articulated speech produced in the laboratory or (re)synthesized speech.
Artificial Normality

Perceptual studies using tightly controlled laboratory speech have been successful in establishing key constructs for models of spoken-word recognition, such as lexical competition, bottom-up versus top-down information flow, the uniqueness point, and episodic versus abstract lexical representations. Yet the external validity of such constructs remains debatable. For instance, the finding that a word can be identified as soon as the sensory input makes it uniquely distinguishable from its competitors (see, e.g., misdem for misdemeanor) has been used to support the claim that lexical candidates are activated sequentially as the signal unfolds (e.g., Grosjean, 1980; Marslen-Wilson, 1987).

Much of what we know about spoken-word recognition comes from studies relying on speech stimuli either carefully produced in the laboratory or computer altered. Although such stimuli have allowed key constructs to be highlighted, the extent to which these constructs are operative in the processing of everyday speech is unclear. We argue that studying the recognition of naturally occurring degraded speech, such as that produced by individuals with neurological disease, can improve the external validity of existing spoken-word recognition models. This claim is illustrated in an experiment on the effect of talker-specific (indexical) variations on lexical access. We found that talker-specificity effects, wherein words are better recalled if played in the same voice than in a different voice betwe...