2010
DOI: 10.1002/asi.21355

Visualizing polysemy using LSA and the predication algorithm

Abstract: Context is a determining factor in language and plays a decisive role in polysemic words. Several psycholinguistically motivated algorithms have been proposed to emulate human management of context, under the assumption that the value of a word is evanescent and takes on meaning only in interaction with other structures. The predication algorithm (Kintsch, 2001), for example, uses a vector representation of the words produced by LSA (Latent Semantic Analysis) to dynamically simulate the comprehension of predic…
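The predication idea named in the abstract is compact enough to sketch in code. The following is a minimal illustration in the spirit of Kintsch (2001), under stated assumptions: the vectors are random stand-ins for a real LSA space, and the neighbor counts, names, and centroid-based combination are simplifications of the full spreading-activation model, not the paper's implementation.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def predication(pred_vec, arg_vec, space, n_neighbors=20, k=5):
    """Simplified predication in the spirit of Kintsch (2001):
    take the predicate's nearest LSA neighbors, keep the k of them most
    related to the argument, and average everything into a contextualized
    predicate sense. The full model uses spreading activation instead."""
    # Neighbors of the predicate, ranked by similarity to it.
    neighbors = sorted(space.items(),
                       key=lambda kv: cosine(pred_vec, kv[1]),
                       reverse=True)[:n_neighbors]
    # Of those, the ones most relevant to the argument.
    relevant = sorted(neighbors,
                      key=lambda kv: cosine(arg_vec, kv[1]),
                      reverse=True)[:k]
    vectors = [pred_vec, arg_vec] + [v for _, v in relevant]
    return np.mean(vectors, axis=0)

# Toy 300-dimensional space with random stand-in vectors, not real LSA output.
rng = np.random.default_rng(0)
space = {w: rng.normal(size=300)
         for w in ["money", "river", "water", "loan", "shore"]}
sense = predication(space["shore"], space["river"], space, n_neighbors=3, k=2)
```

In the full algorithm the selected neighbors are weighted by spreading activation rather than averaged uniformly; the centroid here is only the simplest way to show how the argument filters which of the predicate's neighbors contribute to the resulting sense.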

Cited by 13 publications (12 citation statements: 0 supporting, 12 mentioning, 0 contrasting; published 2012-2019). References 42 publications (77 reference statements).
“…Critical words were matched for mean log frequency (True/False = 1.31/1.26; Davis & Perea, 2005) and word length (True/False = 6.7/6.4 letters, range 3-11 letters). LSA semantic similarity values (SSV; see Landauer & Dumais, 1997) for the critical words were obtained using version 2 of the Gallito® software (http://www.elsemantico.com; Jorge-Botana, León, Olmos, & Hassan-Montero, 2010; Jorge-Botana, Olmos, & Barroso, 2012). This software uses a large training corpus to create representations of words and relationships between them within a multidimensional semantic space.…”
Section: Methods (mentioning)
confidence: 99%
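The quoted procedure obtains LSA semantic similarity values from a trained semantic space. As a rough sketch of what such a value is, the toy example below builds a miniature space with a plain SVD and computes a cosine between two word vectors; the four-word matrix, the log weighting, and k = 2 are illustrative choices only, not Gallito's actual training pipeline.

```python
import numpy as np

# Toy term-document count matrix (rows = words, columns = documents).
# A real space like Gallito's is trained on a large corpus.
words = ["perro", "gato", "banco", "dinero"]
X = np.array([
    [2, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 2, 1, 3],
    [0, 0, 2, 2],
], dtype=float)

# Log-entropy weighting is common in LSA; plain log(1 + count) is used
# here for brevity.
X = np.log1p(X)

# Truncated SVD projects words into a low-dimensional semantic space.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2                            # far fewer dimensions than a real space
word_vecs = U[:, :k] * s[:k]     # word coordinates in the latent space

def ssv(i, j):
    """Semantic similarity value: cosine between two word vectors."""
    u, v = word_vecs[i], word_vecs[j]
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(ssv(words.index("banco"), words.index("dinero")))
```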
“…The procedure for extracting the sample of words to be studied was as follows: we took polysemous words using two sources (Estévez, 1991;Jorge-Botana, León, Olmos, & Hassan-Montero, 2010). Using these two sources, our criteria to select the words for the polysemy group were: (1) that the words should have significantly more entries in the RAE (Spanish Royal Academy) dictionary than monosemous words (T(46) = 4.85, p < .005) and (2) that in Lexesp at least two meanings of each polysemous word were represented, checked using the authors' criteria with a visual sample of the first 100 semantic neighbors of each word.…”
Section: Methods (mentioning)
confidence: 99%
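The selection criterion above (at least two meanings represented among the first 100 semantic neighbors) can be approximated programmatically. The sketch below assumes a dict-of-vectors space; the sense-indicator sets are a hypothetical stand-in for the authors' visual inspection, not their actual procedure.

```python
import numpy as np

def nearest_neighbors(word, space, top_n=100):
    """Rank every other word in the space by cosine similarity to `word`."""
    target = space[word]
    def cos(v):
        return float(target @ v / (np.linalg.norm(target) * np.linalg.norm(v)))
    ranked = sorted(((w, cos(v)) for w, v in space.items() if w != word),
                    key=lambda wv: wv[1], reverse=True)
    return ranked[:top_n]

def looks_polysemous(word, space, senses, top_n=100):
    """Accept `word` if at least two of its dictionary senses are represented
    among its first `top_n` semantic neighbors. `senses` maps each sense label
    to a set of indicator words (a hypothetical proxy for manual checking)."""
    neighbors = {w for w, _ in nearest_neighbors(word, space, top_n)}
    represented = sum(1 for indicators in senses.values()
                      if neighbors & indicators)
    return represented >= 2

# Hypothetical usage: sense-indicator sets stand in for the Lexesp check.
# senses = {"fish": {"pez", "aleta"}, "predator": {"abogado", "agresivo"}}
# looks_polysemous("tiburon", space, senses)
```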
“…They are static in the sense that they emulate permanent knowledge, biased in the sense that the meanings of words are represented in vectors based on their appearance in real language usage, and context-free in the sense that the vectors do not refer to any particular context. More precisely, they are an amalgam of many contexts, and the interaction with one of them is what dynamically generates a meaning (for a review see Jorge-Botana et al, 2010).…”
Section: "Efficient Then Inefficient" (mentioning)
confidence: 99%
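The claim that a context-free vector is an amalgam of contexts, with meaning generated by interaction, can be made concrete with a small sketch. Adding the centroid of the current context words to a word's vector shifts it toward the sense that context selects; this is a crude stand-in for full predication, shown only to illustrate the idea.

```python
import numpy as np

def contextualize(word_vec, context_vecs):
    """Shift a context-free LSA vector toward the sense the current context
    selects by adding the centroid of the context words. The context-free
    vector mixes all of a word's usages; the shifted vector's nearest
    neighbors reveal the sense this particular context picks out."""
    shifted = word_vec + np.mean(context_vecs, axis=0)
    return shifted / np.linalg.norm(shifted)
```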
“…Training. In the LSA training we used Gallito®, a tool that has been used on other occasions for the creation of semantic spaces (Jorge-Botana et al., 2010a, 2010b). Words included in a special stop-list for this domain were eliminated, as well as all function words and words that did not occur at least three times.…”
Section: Methods (mentioning)
confidence: 99%
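The preprocessing described above (a domain stop-list, function-word removal, and a minimum frequency of three) is straightforward to sketch. The function below is a generic illustration, not Gallito's implementation; the toy corpus and the threshold in the example call are chosen so the output is non-empty.

```python
from collections import Counter

def preprocess(docs, stop_list, function_words, min_count=3):
    """Filter a tokenized corpus before LSA training: drop domain stop-list
    words, function words, and words occurring fewer than `min_count` times."""
    counts = Counter(w for doc in docs for w in doc)
    drop = set(stop_list) | set(function_words)
    return [[w for w in doc
             if w not in drop and counts[w] >= min_count]
            for doc in docs]

docs = [["el", "perro", "ladra"], ["el", "gato", "duerme"], ["perro", "y", "gato"]]
# min_count=2 here only so the toy output is non-empty; the paper used 3.
clean = preprocess(docs, stop_list=[], function_words=["el", "y"], min_count=2)
# clean == [["perro"], ["gato"], ["perro", "gato"]]
```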
“…We chose this dimensionality based on the assumptions made in some previous studies, which suggest that the optimal number of dimensions for specific-domain corpora does not have to be extremely low, sometimes even approaching the 300 dimensions recommended by Landauer and Dumais (1997) for general-domain corpora (see Jorge-Botana et al. 2010b). Some of the most recent studies simply use 300 dimensions (Wild et al. 2011).…”
Section: Methods (mentioning)
confidence: 99%
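One common way to reason about a dimensionality choice like 300 is to check how much of the singular-value mass the first k dimensions retain. The sketch below shows that diagnostic; the spectrum is fabricated for illustration and does not come from any corpus in the quoted studies.

```python
import numpy as np

def retained_share(s, k):
    """Share of squared singular-value mass kept by the first k dimensions,
    a simple diagnostic when deciding whether a domain corpus warrants
    something close to the usual 300 dimensions."""
    s = np.asarray(s, dtype=float)
    return float(np.sum(s[:k] ** 2) / np.sum(s ** 2))

# Hypothetical singular-value spectrum from a domain corpus's SVD.
s = np.linspace(10.0, 0.1, 500)
print(retained_share(s, 300))  # how much structure 300 dimensions capture
```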