2017
DOI: 10.1007/s10994-017-5634-8
The mechanism of additive composition

Abstract: Additive composition (Foltz et al. in Discourse Process 15:285-307, 1998; Landauer and Dumais in Psychol Rev 104(2):211, 1997; Mitchell and Lapata in Cognit Sci 34(8):1388-1429, 2010) is a widely used method for computing meanings of phrases, which takes the average of the vector representations of the constituent words. In this article, we prove an upper bound for the bias of additive composition, which is the first theoretical analysis of compositional frameworks from a machine learning point of view. The bound…
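To make the averaging step described in the abstract concrete, here is a minimal Python sketch. The vocabulary and the three-dimensional vectors are hypothetical placeholders; a real system would load pre-trained embeddings instead.

```python
import numpy as np

# Hypothetical pre-trained word vectors (toy values for illustration).
word_vectors = {
    "machine":  np.array([0.2, 0.7, -0.1]),
    "learning": np.array([0.5, -0.3, 0.4]),
}

def compose_additive(words, vectors):
    """Represent a phrase as the average of its constituent word vectors."""
    return np.mean([vectors[w] for w in words], axis=0)

phrase_vec = compose_additive(["machine", "learning"], word_vectors)
print(phrase_vec)  # -> [0.35 0.2  0.15]
```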

Cited by 18 publications (15 citation statements)
References 47 publications (74 reference statements)
“…First, choose a word w. Then, for each window s containing w, take the average of the vectors of the words in s and denote it as v_s. Now, take the average of v_s for all the windows s containing w, and denote the average as u. Theorem 1 says that u can be mapped to the word vector v_w by a linear transformation that does not depend on w. This linear structure may also have connections to some other phenomena related to linearity, e.g., Gittens et al. (2017) and Tian et al. (2017). Exploring such connections is left for future work.…”
Section: Gaussian Walk Model (mentioning)
confidence: 99%
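The construction quoted above is mechanical enough to sketch in code. The following Python is a hypothetical illustration of computing u for a target word; the corpus, window half-width, and vectors are assumptions, and the word-independent linear map of Theorem 1 is not shown.

```python
import numpy as np

def window_average(tokens, target, vectors, half_width=2):
    """Compute u: the average, over all windows s containing `target`,
    of the per-window average vector v_s (the construction quoted above).
    Assumes `target` occurs in `tokens` and every token has a vector."""
    window_vecs = []
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo, hi = max(0, i - half_width), min(len(tokens), i + half_width + 1)
        # v_s: average of the vectors of the words in window s
        v_s = np.mean([vectors[w] for w in tokens[lo:hi]], axis=0)
        window_vecs.append(v_s)
    return np.mean(window_vecs, axis=0)  # u
```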
“…However, previous model designs mostly rely on linguistic intuitions (Paperno et al., 2014, inter alia), whereas our model has an exact logical interpretation. Furthermore, by using additive composition we enjoy a learning guarantee (Tian et al., 2015).…”
Section: Sentence Completion (mentioning)
confidence: 99%
“…The rationale for our model is as follows. First, recent research has shown that additive composition of word vectors is an approximation to the situation where two words have overlapping context (Tian et al., 2015); therefore, it is suitable for implementing an "and" or intersection operation (Section 3). We design our model so that the resulting distributional representations are expected to have additive compositionality.…”
Section: Introduction (mentioning)
confidence: 99%
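To make the "intersection" reading concrete, here is a small hypothetical sketch: the averaged vector of two words is used as a query, and the nearest neighbour among the remaining vocabulary is retrieved. The retrieval step and cosine similarity are illustrative assumptions, not the cited paper's exact procedure.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def intersect(w1, w2, vectors):
    """Treat the average of two word vectors as their 'and'/intersection,
    then retrieve the closest remaining word as a lexical paraphrase."""
    query = (vectors[w1] + vectors[w2]) / 2.0  # additive composition
    others = {w: v for w, v in vectors.items() if w not in (w1, w2)}
    return max(others, key=lambda w: cosine(query, others[w]))
```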
“…This inequality reflects the difference between the importance of the constituents (i.e., the word embeddings). Following Tian et al. (2017), the coefficients α, β are scalars drawn from a monotonic function. In this work, we consider that a reasonable choice for such a monotonic function is Shannon's entropy (Shannon, 1949; Charniak, 1996; Aizawa, 2003).…”
Section: Composition in Distributional Semantics (mentioning)
confidence: 99%
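A minimal sketch of such an entropy-weighted additive composition, under stated assumptions: the context-count tables below are toy placeholders for corpus co-occurrence counts, and normalizing the coefficients to sum to one is a design choice of this illustration, not necessarily the cited paper's.

```python
import numpy as np

def shannon_entropy(counts):
    """Shannon entropy (in bits) of a context-count distribution."""
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]  # drop zero counts; 0 * log(0) is taken as 0
    return float(-(p * np.log2(p)).sum())

def compose_weighted(v1, v2, counts1, counts2):
    """Weighted additive composition alpha*v1 + beta*v2, with alpha, beta
    derived from the entropy of each word's context distribution."""
    alpha, beta = shannon_entropy(counts1), shannon_entropy(counts2)
    total = alpha + beta
    return (alpha / total) * v1 + (beta / total) * v2
```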