Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1329

Exploring Numeracy in Word Embeddings

Abstract: Word embeddings are now pervasive across NLP subfields as the de-facto method of forming text representations. In this work, we show that existing embedding models are inadequate at constructing representations that capture salient aspects of mathematical meaning for numbers, which is important for language understanding. Numbers are ubiquitous and frequently appear in text. Inspired by cognitive studies on how humans perceive numbers, we develop an analysis framework to test how well word embeddings capture …
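
The abstract describes an analysis framework for testing whether word embeddings capture numerical meaning. The snippet below is a minimal sketch of one such test, not the paper's actual framework: it checks whether cosine similarity between number embeddings tracks numerical proximity. The embedding table here is a random placeholder; in practice the vectors would be looked up from a trained model such as GloVe or word2vec.

```python
# Sketch: does embedding similarity correlate with numerical closeness?
# The embedding table is a random stand-in, so the correlation should be
# near zero here; a numeracy-aware embedding would score higher.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
numbers = list(range(1, 101))
emb = {n: rng.normal(size=50) for n in numbers}  # hypothetical 50-dim vectors

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

sims, closeness = [], []
for i in numbers:
    for j in numbers:
        if i < j:
            sims.append(cosine(emb[i], emb[j]))
            closeness.append(-abs(i - j))  # closer numbers -> larger value

rho, _ = spearmanr(sims, closeness)
print(f"Spearman rho between similarity and numerical proximity: {rho:.3f}")
```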

Cited by 66 publications (83 citation statements)
References 23 publications
“…String Embeddings: Recently, word and token embeddings have been analyzed to see if they record numerical properties (for example, magnitude or sorting order) (Wallace et al, 2019; Naik et al, 2019). This work finds evidence that common embedding approaches are unable to generalize to large numeric ranges, but that character-based embeddings fare better than the rest.…”
Section: Related Work
Mentioning confidence: 99%
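The excerpt above refers to probes for magnitude and ordering. Below is an illustrative magnitude probe under assumed settings, not the cited papers' exact protocol: a linear regressor is fit to predict (log) magnitude from a number's embedding on a small numeric range, then evaluated on a larger held-out range to test extrapolation. Random vectors again stand in for real embeddings, so the probe is expected to fail here by construction.

```python
# Sketch of a magnitude-extrapolation probe (assumed setup, placeholder vectors).
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
emb = {n: rng.normal(size=50) for n in range(1, 2001)}  # hypothetical embeddings

train_nums = list(range(1, 501))      # small-magnitude training range
test_nums = list(range(1001, 2001))   # larger, unseen range

X_train = np.stack([emb[n] for n in train_nums])
X_test = np.stack([emb[n] for n in test_nums])

probe = Ridge(alpha=1.0).fit(X_train, np.log(train_nums))
mae = np.mean(np.abs(probe.predict(X_test) - np.log(test_nums)))
print(f"Mean absolute error on the held-out large-number range: {mae:.2f}")
```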
“…Training predict models on very large corpora has therefore become the de-facto standard approach in distributional semantics research for representing word meaning (Chersoni et al, 2020; Moreo et al, 2019; Naik et al, 2019).…”
Section: Approaches To Modelling Linguistic Distributional Knowledge
Mentioning confidence: 99%
“…As a result, distributional semantics and linguistic-simulation research have developed some different theoretical assumptions on how linguistic distributional knowledge should be modelled. The predominant view in distributional semantics research is based on a tacit "one-size-fits-all" assumption for how distributional information should best fit human data: predict models trained on very large (and noisy) corpora are the de facto standard for forming distributional word representations, regardless of the semantic task being modelled (e.g., Baroni et al, 2014; Naik et al, 2019). The implication of this assumption is that there exists an optimal LDM that is appropriate for modelling all forms of linguistic distributional knowledge in cognition.…”
Section: Approaches To Modelling Linguistic Distributional Knowledge
Mentioning confidence: 99%
“…Analysis of word embeddings and the structure of the learned feature space often reveals interesting language properties and is an important research direction (Köhn, 2015; Bolukbasi et al, 2016; Mimno and Thompson, 2017; Nakashole and Flauger, 2018; Naik et al, 2019; Ethayarajh et al, 2019). We show that graph-based embeddings can be a powerful tool for language analysis.…”
Section: Related Work
Mentioning confidence: 77%