Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2019)
DOI: 10.18653/v1/n19-1023

Evaluating Composition Models for Verb Phrase Elliptical Sentence Embeddings

Abstract: Ellipsis is a natural language phenomenon where part of a sentence is missing and its information must be recovered from the surrounding context, as in "Cats chase dogs and so do foxes." Formal semantics has different methods for resolving ellipsis and recovering the missing information, but the problem has not been considered for distributional semantics, where words have vector embeddings and combinations thereof provide embeddings for sentences. In elliptical sentences these combinations go beyond linear a…
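To make the composition problem concrete, here is a minimal sketch, not the paper's actual model: it assumes small random vectors standing in for pre-trained word embeddings, composes phrases additively, and resolves the ellipsis by copying the verb-phrase vector into the elided conjunct, one way of going beyond purely linear composition as the abstract describes.

```python
import numpy as np

# Hypothetical 4-dimensional word embeddings; a real model would load
# pre-trained vectors (e.g. word2vec or GloVe) instead.
rng = np.random.default_rng(0)
emb = {w: rng.standard_normal(4) for w in ["cats", "chase", "dogs", "foxes"]}

def additive(words):
    # Additive composition: the phrase vector is the sum of its word vectors.
    return np.sum([emb[w] for w in words], axis=0)

# "Cats chase dogs and so do foxes."
# A copying approach resolves the ellipsis first: the verb-phrase vector
# for "chase dogs" is reused for the elided second conjunct.
vp = additive(["chase", "dogs"])
resolved = emb["cats"] + vp + emb["foxes"] + vp
print(resolved)
```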

Cited by 12 publications (20 citation statements)
References 30 publications (33 reference statements)
“…[table of IS, USE, BERTp, and BERTf correlation scores per dataset omitted] …was 0.53, and provide results equal to the state of the art on ELLSIM, which was 0.76, both reported in Wijnholds and Sadrzadeh (2019). However, they are surpassed by fine-tuned BERT sentence embeddings and sentence encoders, which achieve the highest scores.…”
Section: Elliptical Phrase and SICK Datasets
confidence: 64%
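The correlation figures quoted above follow the standard evaluation protocol for such similarity datasets: cosine similarity between model sentence embeddings, correlated with averaged human ratings via Spearman's rho. A minimal sketch of that protocol, where `embed`, `pairs`, and `human_scores` are hypothetical placeholders for a real model and dataset:

```python
import numpy as np
from scipy.stats import spearmanr

def cosine(u, v):
    # Cosine similarity between two sentence embeddings.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def score_model(embed, pairs, human_scores):
    """Spearman correlation between model and human similarity judgments.

    embed:        function mapping a sentence string to a vector
    pairs:        (sentence1, sentence2) tuples from the dataset
    human_scores: averaged human similarity ratings, one per pair
    """
    model_scores = [cosine(embed(s1), embed(s2)) for s1, s2 in pairs]
    rho, _ = spearmanr(model_scores, human_scores)
    return rho
```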
“…(3,4) The transitive verb disambiguation datasets of Grefenstette and Sadrzadeh (2011) (GS11) and Kartsaklis and Sadrzadeh (2013) (KS13a), and (5) the transitive sentence similarity dataset of Kartsaklis et al. (2013) (KS13b). (6,7) We additionally test on two recent datasets (Wijnholds and Sadrzadeh, 2019) (ELLDIS and ELLSIM), which extend the KS13a and KS13b datasets to sentences containing verb phrase ellipsis.…”
Section: Verb Disambiguation and Sentence Similarity
confidence: 99%
“…The match that our setting provides for human disambiguation judgements is derived solely from observed co-occurrences between words and syntactic roles in a corpus, without any specification of content intrinsic to the word itself. Further experiments will be needed to extend this approach to larger datasets and to dialogue data and to examine its effectiveness, perhaps building on the work extending DS grammars to dialogue (Eshghi et al. 2017), and possibly evaluating on the similarity dataset of Wijnholds and Sadrzadeh (2019), which extends the transitive sentence datasets used in this paper to a verb phrase elliptical setting.…”
Section: Discussion
confidence: 99%
“…Alternatively, one can build vectors for nouns and tensors for adjectives and verbs (and all other words with functional types) and use tensor contraction to build a vector for the sentence (Grefenstette and Sadrzadeh 2015; Kartsaklis and Sadrzadeh 2013). It has been shown that some of the tensor-based models improve on the results of the additive model when considering the whole sentence (Grefenstette and Sadrzadeh 2015; Kartsaklis and Sadrzadeh 2013; Wijnholds and Sadrzadeh 2019); here, we focus on incremental composition as described above to investigate how the disambiguation process works word by word.…”
Section: A Disambiguation Task
confidence: 99%
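The tensor contraction the quotation describes can be sketched for a transitive sentence: nouns are vectors, a transitive verb is an order-3 tensor, and contracting the verb with its subject and object yields the sentence vector. The dimensions and random tensors below are illustrative assumptions, not values from any of the cited models:

```python
import numpy as np

d = 4  # embedding dimension (illustrative)
rng = np.random.default_rng(1)

# Nouns live in a vector space; a transitive verb is an order-3 tensor
# mapping a subject and an object to a sentence vector.
subj = rng.standard_normal(d)            # e.g. "cats"
obj = rng.standard_normal(d)             # e.g. "dogs"
verb = rng.standard_normal((d, d, d))    # e.g. "chase"

# Tensor contraction: sentence_j = sum over i,k of subj_i * verb_ijk * obj_k
sentence = np.einsum("i,ijk,k->j", subj, verb, obj)
print(sentence.shape)  # (4,)
```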
“…We use four disambiguation data sets to evaluate our models. Three of the four data sets, GS2011 (Grefenstette and Sadrzadeh, 2011a), GS2012, and KS2013-CoNLL (Kartsaklis et al., 2013), are publicly available, while ML2008 (Mitchell and Lapata, 2008) was obtained privately from the authors of Wijnholds and Sadrzadeh (2019). We show examples and statistics of the data sets in Table 1.…”
Section: Data Sets
confidence: 99%