2010
DOI: 10.1002/asi.21355

Visualizing polysemy using LSA and the predication algorithm

Abstract: Context is a determining factor in language and plays a decisive role in polysemic words. Several psycholinguistically motivated algorithms have been proposed to emulate human management of context, under the assumption that the value of a word is evanescent and takes on meaning only in interaction with other structures. The predication algorithm (Kintsch, 2001), for example, uses a vector representation of the words produced by LSA (Latent Semantic Analysis) to dynamically simulate the comprehension of predic…
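The predication idea named in the abstract is compact enough to sketch in code. The following is a minimal illustration in the spirit of Kintsch (2001), under stated assumptions: the vectors are random stand-ins for a real LSA space, and the neighbor counts, names, and centroid-based combination are simplifications of the full spreading-activation model, not the paper's implementation.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def predication(pred_vec, arg_vec, space, n_neighbors=20, k=5):
    """Simplified predication in the spirit of Kintsch (2001):
    take the predicate's nearest LSA neighbors, keep the k of them most
    related to the argument, and average everything into a contextualized
    predicate sense. The full model uses spreading activation instead."""
    # Neighbors of the predicate, ranked by similarity to it.
    neighbors = sorted(space.items(),
                       key=lambda kv: cosine(pred_vec, kv[1]),
                       reverse=True)[:n_neighbors]
    # Of those, the ones most relevant to the argument.
    relevant = sorted(neighbors,
                      key=lambda kv: cosine(arg_vec, kv[1]),
                      reverse=True)[:k]
    vectors = [pred_vec, arg_vec] + [v for _, v in relevant]
    return np.mean(vectors, axis=0)

# Toy 300-dimensional space with random stand-in vectors, not real LSA output.
rng = np.random.default_rng(0)
space = {w: rng.normal(size=300)
         for w in ["money", "river", "water", "loan", "shore"]}
sense = predication(space["shore"], space["river"], space, n_neighbors=3, k=2)
```

In the full algorithm the selected neighbors are weighted by spreading activation rather than averaged uniformly; the centroid here is only the simplest way to show how the argument filters which of the predicate's neighbors contribute to the resulting sense.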

Cited by 13 publications (12 citation statements: 0 supporting, 12 mentioning, 0 contrasting; published 2012-2019). References 42 publications (77 reference statements).
“…Critical words were matched for mean log frequency (True/False = 1.31/1.26; Davis & Perea, 2005) and word length (True/False = 6.7/6.4 letters, range 3-11 letters). LSA semantic similarity values (SSV; see Landauer & Dumais, 1997) for the critical words were obtained using version 2 of the Gallito® software (http://www.elsemantico.com; Jorge-Botana, León, Olmos, & Hassan-Montero, 2010; Jorge-Botana, Olmos, & Barroso, 2012). This software uses a large training corpus to create representations of words and relationships between them within a multidimensional semantic space.…”
Section: Methods (mentioning)
confidence: 99%
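The quoted procedure obtains LSA semantic similarity values from a trained semantic space. As a rough sketch of what such a value is, the toy example below builds a miniature space with a plain SVD and computes a cosine between two word vectors; the four-word matrix, the log weighting, and k = 2 are illustrative choices only, not Gallito's actual training pipeline.

```python
import numpy as np

# Toy term-document count matrix (rows = words, columns = documents).
# A real space like Gallito's is trained on a large corpus.
words = ["perro", "gato", "banco", "dinero"]
X = np.array([
    [2, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 2, 1, 3],
    [0, 0, 2, 2],
], dtype=float)

# Log-entropy weighting is common in LSA; plain log(1 + count) is used
# here for brevity.
X = np.log1p(X)

# Truncated SVD projects words into a low-dimensional semantic space.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2                            # far fewer dimensions than a real space
word_vecs = U[:, :k] * s[:k]     # word coordinates in the latent space

def ssv(i, j):
    """Semantic similarity value: cosine between two word vectors."""
    u, v = word_vecs[i], word_vecs[j]
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(ssv(words.index("banco"), words.index("dinero")))
```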
“…The procedure for extracting the sample of words to be studied was as follows: we took polysemous words using two sources (Estévez, 1991;Jorge-Botana, León, Olmos, & Hassan-Montero, 2010). Using these two sources, our criteria to select the words for the polysemy group were: (1) that the words should have significantly more entries in the RAE (Spanish Royal Academy) dictionary than monosemous words (T(46) = 4.85, p < .005) and (2) that in Lexesp at least two meanings of each polysemous word were represented, checked using the authors' criteria with a visual sample of the first 100 semantic neighbors of each word.…”
Section: Methods (mentioning)
confidence: 99%
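The selection criterion above (at least two meanings represented among the first 100 semantic neighbors) can be approximated programmatically. The sketch below assumes a dict-of-vectors space; the sense-indicator sets are a hypothetical stand-in for the authors' visual inspection, not their actual procedure.

```python
import numpy as np

def nearest_neighbors(word, space, top_n=100):
    """Rank every other word in the space by cosine similarity to `word`."""
    target = space[word]
    def cos(v):
        return float(target @ v / (np.linalg.norm(target) * np.linalg.norm(v)))
    ranked = sorted(((w, cos(v)) for w, v in space.items() if w != word),
                    key=lambda wv: wv[1], reverse=True)
    return ranked[:top_n]

def looks_polysemous(word, space, senses, top_n=100):
    """Accept `word` if at least two of its dictionary senses are represented
    among its first `top_n` semantic neighbors. `senses` maps each sense label
    to a set of indicator words (a hypothetical proxy for manual checking)."""
    neighbors = {w for w, _ in nearest_neighbors(word, space, top_n)}
    represented = sum(1 for indicators in senses.values()
                      if neighbors & indicators)
    return represented >= 2

# Hypothetical usage: sense-indicator sets stand in for the Lexesp check.
# senses = {"fish": {"pez", "aleta"}, "predator": {"abogado", "agresivo"}}
# looks_polysemous("tiburon", space, senses)
```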
“…They are static in the sense that they emulate permanent knowledge, biased in the sense that the meanings of words are represented in vectors based on their appearance in real language usage, and context-free in the sense that the vectors do not refer to any particular context. More precisely, they are an amalgam of many contexts, and the interaction with one of them is what dynamically generates a meaning (for a review see Jorge-Botana et al, 2010).…”
Section: "Efficient Then Inefficient" (mentioning)
confidence: 99%
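The claim that a context-free vector is an amalgam of contexts, with meaning generated by interaction, can be made concrete with a small sketch. Adding the centroid of the current context words to a word's vector shifts it toward the sense that context selects; this is a crude stand-in for full predication, shown only to illustrate the idea.

```python
import numpy as np

def contextualize(word_vec, context_vecs):
    """Shift a context-free LSA vector toward the sense the current context
    selects by adding the centroid of the context words. The context-free
    vector mixes all of a word's usages; the shifted vector's nearest
    neighbors reveal the sense this particular context picks out."""
    shifted = word_vec + np.mean(context_vecs, axis=0)
    return shifted / np.linalg.norm(shifted)
```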
“…Training. In the LSA training we used Gallito®, a tool that has been used on other occasions for the creation of semantic spaces (Jorge-Botana et al., 2010a, 2010b). Words included in a special stop-list for this domain were eliminated, as well as all function words and words that did not occur at least three times.…”
Section: Methods (mentioning)
confidence: 99%
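The preprocessing described above (a domain stop-list, function-word removal, and a minimum frequency of three) is straightforward to sketch. The function below is a generic illustration, not Gallito's implementation; the toy corpus and the threshold in the example call are chosen so the output is non-empty.

```python
from collections import Counter

def preprocess(docs, stop_list, function_words, min_count=3):
    """Filter a tokenized corpus before LSA training: drop domain stop-list
    words, function words, and words occurring fewer than `min_count` times."""
    counts = Counter(w for doc in docs for w in doc)
    drop = set(stop_list) | set(function_words)
    return [[w for w in doc
             if w not in drop and counts[w] >= min_count]
            for doc in docs]

docs = [["el", "perro", "ladra"], ["el", "gato", "duerme"], ["perro", "y", "gato"]]
# min_count=2 here only so the toy output is non-empty; the paper used 3.
clean = preprocess(docs, stop_list=[], function_words=["el", "y"], min_count=2)
# clean == [["perro"], ["gato"], ["perro", "gato"]]
```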
“…We chose this dimensionality based on the assumptions made in some previous studies, which suggest that the optimal number of dimensions for specific-domain corpora does not have to be extremely low, sometimes even approaching the 300 dimensions recommended by Landauer and Dumais (1997) for general-domain corpora (see Jorge-Botana et al. 2010b). Some of the most recent studies simply use 300 dimensions (Wild et al. 2011).…”
Section: Methods (mentioning)
confidence: 99%
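One common way to reason about a dimensionality choice like 300 is to check how much of the singular-value mass the first k dimensions retain. The sketch below shows that diagnostic; the spectrum is fabricated for illustration and does not come from any corpus in the quoted studies.

```python
import numpy as np

def retained_share(s, k):
    """Share of squared singular-value mass kept by the first k dimensions,
    a simple diagnostic when deciding whether a domain corpus warrants
    something close to the usual 300 dimensions."""
    s = np.asarray(s, dtype=float)
    return float(np.sum(s[:k] ** 2) / np.sum(s ** 2))

# Hypothetical singular-value spectrum from a domain corpus's SVD.
s = np.linspace(10.0, 0.1, 500)
print(retained_share(s, 300))  # how much structure 300 dimensions capture
```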