Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018
DOI: 10.18653/v1/d18-1515
A strong baseline for question relevancy ranking

Abstract: The best systems at the SemEval-16 and SemEval-17 community question answering shared tasks, a task that amounts to question relevancy ranking, involve complex pipelines and manual feature engineering. Despite this, many of these systems still fail to beat the IR baseline, i.e., the rankings provided by Google's search engine. We present a strong baseline for question relevancy ranking by training a simple multi-task feed-forward network on a bag of 14 distance measures for the input question pair. This baseline mode…
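The architecture described in the abstract can be sketched roughly as follows; the hidden width, activation, and the two task heads are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 14   # the bag of 14 distance measures per question pair
HIDDEN = 32       # hidden width is an assumption, not from the paper

# One shared hidden layer, plus one output head per task
# (e.g. question-question and question-comment relevance).
W_shared = rng.normal(size=(N_FEATURES, HIDDEN))
W_qq = rng.normal(size=(HIDDEN, 1))
W_qc = rng.normal(size=(HIDDEN, 1))

def forward(x):
    """Score one question pair's distance features for both tasks."""
    h = np.tanh(x @ W_shared)                  # shared representation
    score_qq = 1 / (1 + np.exp(-(h @ W_qq)))   # question-question relevance
    score_qc = 1 / (1 + np.exp(-(h @ W_qc)))   # question-comment relevance
    return float(score_qq), float(score_qc)

x = rng.normal(size=N_FEATURES)  # stand-in for the 14 distance measures
s_qq, s_qc = forward(x)
```

The multi-task aspect is simply that both heads read the same shared hidden layer, so gradients from either task would update the shared weights.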

Cited by 4 publications (4 citation statements) · References 12 publications
“…In general, the effectiveness of these MTL baselines on the QC task is limited because there is only a small number of QD pairs available for training. Both our method and its ablated variant outperform the MTL-MLP (Gonzalez et al., 2018) MTL baselines. This shows that it may be more effective to use a data-scarce task to regularize the adversarial learning of a relatively data-rich task than to use those scarce data in MTL.…”
Section: Results and Analyses
confidence: 90%
“…The results are reported in Table 4. The MTL-MLP model was originally proposed to improve question-question relevance prediction by using question-comment relevance prediction as a secondary task (Gonzalez et al., 2018). It does not perform as well as MTL-DCS, which uses hard parameter sharing between the two tasks and does not require additional similarity feature definitions.…”
Section: Results and Analyses
confidence: 99%
“…We approximate a point p ∈ S by specifying some error margin ε > 0 so that dist(p, q) ≤ (1 + ε) · dist(p*, q), where p* is the true nearest neighbor. Because we use approximate search, we rerank the retrieved utterances using the feed-forward ranking model introduced in Gonzalez et al. (2018). Their ranking model is a multi-task model that relies on simple textual similarity measures combined in a multi-layer perceptron architecture.…”
Section: Exemplar-HRED
confidence: 99%
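The (1 + ε)-approximation guarantee and the retrieve-then-rerank step above can be illustrated with a brute-force sketch; the dataset, ε value, and the use of exact distance as a stand-in reranker are assumptions for the example, not the cited system.

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.normal(size=(1000, 8))   # candidate set S of stored vectors
q = rng.normal(size=8)           # query vector

dists = np.linalg.norm(S - q, axis=1)
exact = dists.min()              # dist(p*, q): the true nearest distance

eps = 0.5                        # error margin ε > 0
# Any point p with dist(p, q) <= (1 + ε) * dist(p*, q) is an acceptable
# answer under the approximation guarantee; an ANN index may return
# any of these instead of the exact nearest neighbor.
acceptable = np.flatnonzero(dists <= (1 + eps) * exact)

# Rerank the retrieved candidates. Exact distance stands in here for
# the learned feed-forward ranking model used in the quoted passage.
reranked = acceptable[np.argsort(dists[acceptable])]
```

Reranking repairs the slack the approximation introduces: even if the index returns a merely acceptable point first, sorting the retrieved set puts the best candidate back on top.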