Proceedings of the 19th International Conference on Computational Linguistics - 2002
DOI: 10.3115/1072228.1072235

The computation of word associations

Abstract: It is shown that basic language processes such as the production of free word associations and the generation of synonyms can be simulated using statistical models that analyze the distribution of words in large text corpora. According to the law of association by contiguity, the acquisition of word associations can be explained by Hebbian learning. The free word associations as produced by subjects on presentation of single stimulus words can thus be predicted by applying first-order statistics to the frequen…
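As a rough illustration of the first-order statistics the abstract refers to, the sketch below counts how often words co-occur within a small window and ranks the neighbours of a stimulus word by raw count. Everything here is an assumption for illustration: the function names `cooccurrence_counts` and `predict_associations`, the window size, the toy corpus, and the use of unweighted counts are hypothetical, and the paper's actual association measure is not given in the truncated abstract.

```python
from collections import Counter, defaultdict


def cooccurrence_counts(sentences, window=2):
    """Count, for each word, how often every other word appears within
    `window` tokens of it inside the same sentence (first-order statistics)."""
    counts = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][tokens[j]] += 1
    return counts


def predict_associations(counts, stimulus, top_n=3):
    """Return the top_n words that co-occur most often with the stimulus.
    Raw counts are an illustrative simplification, not the paper's measure."""
    return [w for w, _ in counts[stimulus].most_common(top_n)]


# Toy corpus (hypothetical data, not taken from the paper).
toy_sentences = [
    ["bread", "and", "butter"],
    ["butter", "on", "bread"],
    ["bread", "with", "butter"],
    ["cold", "bread"],
]
toy_counts = cooccurrence_counts(toy_sentences, window=2)
top_response = predict_associations(toy_counts, "bread", top_n=1)
```

On a real corpus, raw counts tend to favour very frequent words, so a practical system would normally replace them with some association measure that corrects for word frequency.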

Cited by 78 publications (17 citation statements)
References 6 publications
“…With the EAT dataset, the gold-standard words were the original stimuli from EAT, and the cue words were the associated words that were most frequently produced by respondents in the original EAT experiment (Kiss et al, 1973). Rapp (2014) has argued that corpus-based computation of reverse-associations is a reasonable test case for multi-cued word search. However, Rapp also notes that in many cases, suggestions provided by a corpus-based system are quite reasonable, but are not correct for the EAT dataset.…”
Section: Discussion
mentioning
confidence: 99%
“…Given that the gold-standard targets in the shared task are original stimulus words from the EAT collection, we can use a special restriction: restrict the candidates to just the EAT stimuli word-list (Rapp, 2014). Notably, this is a very specific restriction, suited to the specific dataset, and not applicable to the general case of multi-cue associations or tip-of-the-tongue word searches.…”
mentioning
confidence: 99%
“…Models based on first-order co-occurrence (collocations) outperform models based on vector similarity. This superiority, however, is not validated via a direct comparison: results were obtained by studies with different features and goals (see Rapp (2014) for a review; see Griffiths et al (2007) and Smith et al (2013) for evaluations of models based on Latent Semantic Analysis). A specific feature of successful studies on the multiword association task is that they introduce an element of directionality (Rapp, 2013; Rapp, 2014), which allows a correct implementation of the directionality of the modeled effects (from stimulus to response).…”
Section: Related Work
mentioning
confidence: 99%
“…Let us now comment on the overall character of the shared task. It should be noted that this task is actually the reverse association task as described in Rapp (2013, 2014). That is, the shared task participants were supposed to consider the associations from the EAT as their given words, and their task was to determine the original stimulus words.…”
Section: Given Words
mentioning
confidence: 99%
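The reverse association task described in the statement above (given several response words, recover the original stimulus) can be sketched as a ranking problem. The following is a minimal sketch under stated assumptions, not the method of Rapp (2013, 2014) or of any shared task participant: the `strength` table of directed stimulus-to-response scores, its made-up values, the `reverse_association` function, and the choice to combine cues by summing log scores are all hypothetical. The optional `candidates` argument mirrors the idea, mentioned in another statement, of restricting candidates to a known stimulus list.

```python
import math


def reverse_association(strength, cues, candidates=None, smoothing=1e-6):
    """Rank candidate stimuli by how well they jointly account for the cues.

    strength[s][r] is a hypothetical directed score from stimulus s to
    response r. Scores for all cues are combined as a sum of logs, which
    equals the log of their product; smoothing avoids log(0) when a
    stimulus was never observed with a cue.
    """
    if candidates is None:
        candidates = list(strength.keys())
    scored = []
    for s in candidates:
        responses = strength.get(s, {})
        score = sum(math.log(responses.get(c, 0.0) + smoothing) for c in cues)
        scored.append((score, s))
    scored.sort(reverse=True)  # best-scoring stimulus first
    return [s for _, s in scored]


# Made-up directed association scores, for illustration only.
toy_strength = {
    "bread": {"butter": 0.5, "food": 0.2},
    "knife": {"butter": 0.1, "sharp": 0.6},
    "milk": {"food": 0.1, "white": 0.3},
}
ranking = reverse_association(toy_strength, ["butter", "food"])
restricted = reverse_association(toy_strength, ["butter"], candidates=["knife"])
```

Keeping the scores directed (stimulus to response rather than symmetric) is one simple way to reflect the directionality that the Related Work statement highlights.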
“…During evaluation, case distinctions were not taken into account. [6] From http://pageperso.lif.univ-mrs.fr/~michael.zock/CogALex-IV/cogalex-webpage/pst.html the full data sets can be downloaded. [7] Note that the results of up to 54% reported in Rapp (2014) were obtained using different data sets and severely restricted vocabularies, so these cannot be used for comparison. [8] For such reasons we had requested to include such information in the papers.…”
mentioning
confidence: 99%