2015
DOI: 10.1162/tacl_a_00143
From Paraphrase Database to Compositional Paraphrase Model and Back

Abstract: The Paraphrase Database (PPDB; Ganitkevitch et al., 2013) is an extensive semantic resource, consisting of a list of phrase pairs with (heuristic) confidence estimates. However, it is still unclear how it can best be used, due to the heuristic nature of the confidences and its necessarily incomplete coverage. We propose models to leverage the phrase pairs from the PPDB to build parametric paraphrase models that score paraphrase pairs more accurately than the PPDB's internal scores while simultaneously improving its coverage.
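The parametric models the abstract describes can be pictured with a short sketch: compose a phrase embedding from word embeddings and train so that PPDB phrase pairs outscore random negatives under a margin loss. Everything below (toy vocabulary, random initialization, hyperparameters, word-averaging composition) is an illustrative assumption rather than the paper's exact setup, since the paper also explores richer compositional models:

```python
import numpy as np

# Minimal sketch of a margin-based paraphrase model trained on PPDB-style
# phrase pairs. Toy vocabulary, random initialization, and hyperparameters
# are illustrative assumptions; composition is plain word averaging here.

rng = np.random.default_rng(0)
vocab = {w: i for i, w in enumerate(
    ["can", "not", "cannot", "unable", "to", "be", "happy"])}
dim, margin = 25, 0.4
W = rng.normal(scale=0.1, size=(len(vocab), dim))  # trainable word embeddings

def embed(phrase):
    """Compose a phrase vector by averaging its word vectors."""
    return W[[vocab[w] for w in phrase.split()]].mean(axis=0)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

# A PPDB-style paraphrase pair plus a sampled negative example.
pos1, pos2, neg = "can not", "cannot", "happy to be"

# Hinge loss: the true pair must outscore the negative by at least `margin`.
loss = max(0.0, margin
                - cosine(embed(pos1), embed(pos2))
                + cosine(embed(pos1), embed(neg)))
print(f"pair score = {cosine(embed(pos1), embed(pos2)):.3f}, loss = {loss:.3f}")
# Training would minimize this loss over all PPDB pairs by gradient descent,
# pushing paraphrases above sampled non-paraphrases.
```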

Cited by 211 publications (251 citation statements).
References 25 publications (56 reference statements).
“…objectives based on the distributional hypothesis are probably not to blame, as word vectors trained without relying on the distributional hypothesis, such as those of Wieting et al. (2015), still exhibit non-normality to some degree. The actual causes remain to be determined.…”
Section: Discussion
confidence: 99%
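The excerpt's empirical claim, that trained word vectors deviate from normality regardless of training objective, is easy to probe. A minimal sketch (the random matrix is a stand-in for real embeddings, which is an assumption here) measures per-dimension skewness and excess kurtosis; values far from 0 indicate non-normality:

```python
import numpy as np

# Crude per-dimension normality check for word vectors: for a Gaussian,
# skewness and excess kurtosis are both ~0. The random matrix below is a
# stand-in for real embeddings, which would be loaded from disk.
rng = np.random.default_rng(0)
E = rng.standard_normal((10000, 300)) ** 3  # heavy-tailed placeholder

centered = E - E.mean(axis=0)
std = centered.std(axis=0)
skew = (centered ** 3).mean(axis=0) / std ** 3
kurt = (centered ** 4).mean(axis=0) / std ** 4 - 3.0  # excess kurtosis

print(f"mean |skewness| across dimensions: {np.abs(skew).mean():.2f}")
print(f"mean excess kurtosis across dimensions: {kurt.mean():.2f}")
```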
“…(3) in our case, both the encoder and the inverse of the decoder are capable of producing a vector representation per time step in a given sentence; although during training only the last one is regarded as the sentence representation for the sake of training speed, it is more reasonable to make use of the representations at all time steps, with various pooling functions, to compute high-quality sentence representations that excel at downstream tasks.…”
Section: Representation Pooling
confidence: 99%
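The excerpt's point, that pooling over every time step can beat keeping only the final state, is easy to make concrete. In this sketch the hidden states are random stand-ins for a real encoder's outputs:

```python
import numpy as np

# Pooling per-time-step encoder states into one sentence vector.
# H stands in for the (T, d) hidden states a real encoder would produce.
rng = np.random.default_rng(0)
T, d = 7, 16          # sentence length and hidden size (illustrative)
H = rng.standard_normal((T, d))

last_state = H[-1]              # what the excerpt says training uses
mean_pool  = H.mean(axis=0)     # averages information from every time step
max_pool   = H.max(axis=0)      # keeps the strongest activation per dimension

# A common trick is to concatenate several pooled views into one vector.
sentence_vec = np.concatenate([last_state, mean_pool, max_pool])
print(sentence_vec.shape)  # (48,) = 3 * d
```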
“…WME approximates a kernel derived from WMD with a set of random documents. Other word embeddings (Pennington et al., 2014; Wieting et al., 2015b) could also be utilized.…”
Section: Word2vec and Word Mover's Distance
confidence: 99%
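The excerpt compresses the Word Mover's Embedding idea: approximate the WMD-derived kernel with random features, one per randomly drawn short document. The sketch below is an assumption-laden illustration, substituting a cheap "relaxed" nearest-neighbor transport cost for exact WMD and random word sets for sampled documents:

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_random_docs, gamma = 50, 64, 0.1  # illustrative settings

def relaxed_wmd(A, B):
    """Cheap stand-in for WMD: each word travels to its nearest counterpart;
    the max over the two directions lower-bounds the true transport cost."""
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return max(D.min(axis=1).mean(), D.min(axis=0).mean())

def wme_features(doc, random_docs):
    """Map a document (n_words x dim) to one feature per random document,
    so a dot product of feature vectors approximates the WMD-derived kernel."""
    R = len(random_docs)
    return np.array([np.exp(-gamma * relaxed_wmd(doc, w))
                     for w in random_docs]) / np.sqrt(R)

# Random "documents": short sets of random word vectors (real WME samples
# these from the embedding space; the lengths here are arbitrary).
random_docs = [rng.standard_normal((rng.integers(1, 6), dim))
               for _ in range(n_random_docs)]

doc_a = rng.standard_normal((8, dim))
doc_b = doc_a + 0.05 * rng.standard_normal((8, dim))  # near-duplicate of a
doc_c = rng.standard_normal((12, dim))                # unrelated document

fa, fb, fc = (wme_features(d, random_docs) for d in (doc_a, doc_b, doc_c))
print("k(a, b) ~", float(fa @ fb))  # should come out larger than k(a, c)
print("k(a, c) ~", float(fa @ fc))
```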
“…We compare WME against 10 supervised, semi-supervised, and unsupervised methods for performing textual similarity tasks. Six supervised methods are initialized with Paragram-SL999 (PSL) word vectors (Wieting et al., 2015b) and then trained on the PPDB dataset, including: 1) PARAGRAM-PHRASE (PP) (Wieting et al., 2015a). Setup: there are in total 22 textual similarity datasets from the STS tasks (2012-2015) (Agirre et al., 2012, 2013, 2014, 2015), the SemEval 2014 Semantic Relatedness task (Marelli et al., 2014), and the SemEval 2015 Twitter task (Xu et al., 2015).…”
Section: Comparisons on Textual Similarity Tasks
confidence: 99%
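The evaluation protocol this excerpt sketches is standard for STS-style tasks: embed each sentence, score pairs by cosine similarity, and report the Pearson correlation against gold similarity judgments. A minimal version follows; random vectors stand in for trained PSL/Paragram embeddings, and the gold scores are invented for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins: random vectors replace trained PSL/Paragram embeddings, and the
# gold similarity scores below are invented for illustration only.
vocab = {w: i for i, w in enumerate(
    "a man is playing the guitar dog runs fast outside".split())}
W = rng.standard_normal((len(vocab), 50))

pairs = [("a man is playing the guitar", "the man is playing guitar", 4.8),
         ("the dog runs fast", "a dog runs outside", 3.5),
         ("a man is playing the guitar", "the dog runs fast", 0.5)]

def embed(sentence):
    """PARAGRAM-PHRASE-style composition: average the word vectors."""
    return W[[vocab[w] for w in sentence.split()]].mean(axis=0)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

predicted = [cosine(embed(s1), embed(s2)) for s1, s2, _ in pairs]
gold = [g for _, _, g in pairs]

# STS systems are ranked by Pearson correlation with the gold judgments.
pearson = np.corrcoef(predicted, gold)[0, 1]
print(f"Pearson r = {pearson:.3f}")
```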