Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1173
Multi-View Domain Adapted Sentence Embeddings for Low-Resource Unsupervised Duplicate Question Detection

Abstract: We address the problem of Duplicate Question Detection (DQD) in low-resource, domain-specific Community Question Answering forums. Our multi-view framework MV-DASE combines an ensemble of sentence encoders via Generalized Canonical Correlation Analysis, using unlabeled data only. In our experiments, the ensemble includes generic and domain-specific averaged word embeddings, domain-finetuned BERT and the Universal Sentence Encoder. We evaluate MV-DASE on the CQADupStack corpus and on additional low-resource Stack…
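The abstract describes combining multiple sentence-encoder "views" via Generalized Canonical Correlation Analysis. A minimal MAXVAR-style GCCA sketch is below; it is an illustration of the general technique, not the authors' implementation, and the function name, regularizer value, and random inputs are all illustrative assumptions:

```python
import numpy as np

def gcca_shared_embeddings(views, k=2, reg=1e-3):
    """MAXVAR-style GCCA sketch: given a list of views (each an n x d_j
    matrix of n sentence embeddings from one encoder), return a shared
    n x k representation maximally correlated with all views."""
    n = views[0].shape[0]
    M = np.zeros((n, n))
    for X in views:
        Xc = X - X.mean(axis=0)                    # center each view
        C = Xc.T @ Xc + reg * np.eye(X.shape[1])   # regularized covariance
        M += Xc @ np.linalg.solve(C, Xc.T)         # projection onto view's column space
    # top-k eigenvectors of the summed projections give the shared coordinates
    vals, vecs = np.linalg.eigh(M)                 # eigh: ascending eigenvalues
    G = vecs[:, ::-1][:, :k]                       # take the k largest
    return G

# Usage sketch: two hypothetical encoder outputs for the same 20 sentences
view_a = np.random.RandomState(0).randn(20, 5)
view_b = np.random.RandomState(1).randn(20, 7)
G = gcca_shared_embeddings([view_a, view_b], k=3)  # shared 20 x 3 embedding
```

In this formulation each row of `G` is a fused sentence embedding that can then be compared (e.g., by cosine similarity) for duplicate detection.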

Cited by 17 publications (13 citation statements) | References 45 publications
“…Current models seemingly match similar keywords or phrases of the questions and answers, often without truly understanding them in context. (Rücklé et al, 2019b), ‡ is the MICRON model (Han et al, 2019), is the BERT model in (Ma et al, 2019), and is MV-DASE (Poerner and Schütze, 2019). Table 5: A mistake of MultiCQA RBa-lg (zero-shot transfer) on AskUbuntu.…”
Section: Discussion
“…Shah et al (2018) use adversarial domain adaptation for duplicate question detection. Poerner and Schütze (2019) adapt the combination of different sentence embeddings to individual target domains. Rücklé et al (2019b) use weakly supervised training, self-supervised training methods, and question generation.…”
Section: Related Work
“…Semantic textual similarity (STS) measures the degree of semantic equivalence between two text snippets, based on a graded numerical value, with applications including question answering (Yadav et al, 2020), duplicate detection (Poerner and Schütze, 2019), and entity linking (Zhou et al, 2020).…”
Section: Introduction