“…For example, in models with discrete lexical representations such as the Cohort model (Marslen-Wilson, 1987) or the TRACE model (McClelland & Elman, 1986), high-frequency words would be processed faster than low-frequency words because frequency determines either the baseline activation level of each lexical unit (McClelland & Rumelhart, 1981; Marslen-Wilson, 1990) or the strength of the connections from sublexical to lexical units (MacKay, 1982, 1987). In distributed learning models, the representations of high-frequency words would be activated more rapidly because high-frequency mappings are better learned, resulting in stronger connection weights (Gaskell & Marslen-Wilson, 1997; Plaut, McClelland, Seidenberg, & Patterson, 1996).…”
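The two localist accounts in the passage can be illustrated with a toy simulation. This is only a minimal sketch and not an implementation of Cohort, TRACE, or any cited model: the function name `cycles_to_threshold`, the update rule, and every parameter value are assumptions chosen for illustration. A single lexical unit accumulates activation toward a ceiling of 1.0, and word frequency is assumed to act either on the unit's resting (baseline) level or on the weight of its sublexical-to-lexical connection.

```python
def cycles_to_threshold(resting_level=0.0, weight=0.1, bottom_up=1.0,
                        threshold=0.9, max_cycles=1000):
    """Count update cycles until a toy lexical unit reaches threshold.

    Hypothetical update rule (an assumption, not taken from any cited model):
    on each cycle, activation moves toward a ceiling of 1.0 by a fraction
    weight * bottom_up of the remaining distance.
    """
    activation = resting_level
    for cycle in range(1, max_cycles + 1):
        activation += weight * bottom_up * (1.0 - activation)
        if activation >= threshold:
            return cycle
    return None  # threshold not reached within max_cycles


if __name__ == "__main__":
    # Account 1: frequency sets the baseline activation of the lexical unit.
    low_freq_baseline = cycles_to_threshold(resting_level=0.0)
    high_freq_baseline = cycles_to_threshold(resting_level=0.3)
    print("baseline account:", low_freq_baseline, "vs", high_freq_baseline)

    # Account 2: frequency sets the sublexical-to-lexical connection strength.
    low_freq_weight = cycles_to_threshold(weight=0.1)
    high_freq_weight = cycles_to_threshold(weight=0.2)
    print("connection account:", low_freq_weight, "vs", high_freq_weight)
```

Under these assumed settings, the unit with the higher resting level and the unit with the stronger connection both reach threshold in fewer cycles, which is the qualitative frequency advantage the passage describes. The distributed-model account (better-learned mappings yielding stronger weights) is not simulated here.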