2018
DOI: 10.3758/s13428-018-1017-8

A database of orthography-semantics consistency (OSC) estimates for 15,017 English words

Abstract: Orthography-semantics consistency (OSC) is a measure that quantifies the degree of semantic relatedness between a word and its orthographic relatives. OSC is computed as the frequency-weighted average semantic similarity between the meaning of a given word and the meanings of all the words containing that very same orthographic string, as captured by distributional semantic models. We present a resource including optimized estimates of OSC for 15,017 English words. In a series of analyses, we provide a progres…
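As a reading aid, the verbal definition in the abstract can be written as a weighted mean. The notation is an assumption introduced here, not taken from the paper: $R_t$ is the set of orthographic relatives of a target word $t$, $f_r$ is the frequency of relative $r$, and $\vec{t}$, $\vec{r}$ are the distributional vectors of the target and the relative.

```latex
\mathrm{OSC}(t) \;=\; \frac{\sum_{r \in R_t} f_r \,\cos(\vec{t}, \vec{r})}{\sum_{r \in R_t} f_r}
```

Under this reading, a high-frequency relative pulls the estimate towards its own similarity with the target more strongly than a rare one does.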

Citations: cited by 53 publications (104 citation statements)
References: 51 publications (66 reference statements)
“…The interaction between embedded word frequency and LSA semantic relatedness was not significant, suggesting that embedded words were activated independently of whether or not they shared a semantic relationship with the whole word. The main effect of embedded word frequency is consistent with a growing body of evidence (e.g., Beyersmann et al., 2015; Beyersmann, Cavalli, Casalis, & Colé, 2016; Marelli & Amenta, 2018; Taft, Li, & Beyersmann, 2018), suggesting that embedded words are always activated independently of whether they are accompanied by an affix (as in farm + er or corn + er) or a non-affix (as in cash + ew). The activation of embedded words (e.g., cash) then generates lexical competition with the whole word (e.g., cashew), thus leading to an increase in response times across…”
Section: Accepted Manuscript
Citation type: supporting
confidence: 84%
“…OSC was computed as the frequency-weighted average of the cosine proximity (based on the semantic space just described) between the target vector and each of its orthographic relatives. These latter were defined as any words embedding the target (differently from the original 2015 paper, we did not include an onset-specific positional constraint in the selection of relatives, as we observed that it leads to a worse measure performance; see Marelli & Amenta, 2018).…”
Section: Methods
Citation type: mentioning
confidence: 99%
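The procedure quoted above (relatives are any words embedding the target, with no onset constraint, and OSC is the frequency-weighted average cosine) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy two-dimensional vectors, and the toy frequency counts are all hypothetical. Excluding the target from its own relatives and using raw counts as weights are assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def orthographic_relatives(target, vocabulary):
    """Words that embed the target string at any position.

    Assumption: the target itself is not counted as its own relative.
    """
    return [w for w in vocabulary if target in w and w != target]


def osc(target, vectors, frequencies):
    """Frequency-weighted average cosine between target and its relatives.

    Returns None when the target has no relatives in the vocabulary.
    """
    relatives = orthographic_relatives(target, vectors)
    total = sum(frequencies[w] for w in relatives)
    if total == 0:
        return None
    weighted = sum(
        frequencies[w] * cosine(vectors[target], vectors[w]) for w in relatives
    )
    return weighted / total


# Hypothetical toy data: invented vectors and counts, for illustration only.
vectors = {
    "widow": [1.0, 0.0],
    "widower": [1.0, 0.0],
    "widowed": [1.0, 0.0],
    "corn": [1.0, 0.0],
    "corner": [0.0, 1.0],
    "cornfield": [1.0, 0.0],
}
frequencies = {
    "widow": 10, "widower": 5, "widowed": 2,
    "corn": 8, "corner": 3, "cornfield": 1,
}

osc_widow = osc("widow", vectors, frequencies)
osc_corn = osc("corn", vectors, frequencies)
```

In this toy vocabulary, widow has relatives that all point in the same direction as the target, while corn is dominated by the more frequent and unrelated corner, so its estimate is lower.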
“…and regardless of the relationship between the relative and the target (i.e., relatives are not necessarily morphologically or semantically related to the target, as seen in the examples above). For example, the string widow is contained in widower, widowed , and widowhood; therefore, all these words will be considered orthographic relatives of “widow” (for more details on how the orthographic relatives are defined and validation of this procedure, see Marelli & Amenta, 2018). Because all these words are associated to the meaning of WIDOW, OSC will be high (i.e., the semantic similarity between the target— widow in this case—and all its neighbours is high).…”
Section: Introduction: the Relevance Of Form–meaning Mapping For Word
Citation type: mentioning
confidence: 99%