Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021
DOI: 10.18653/v1/2021.acl-long.237

A Semantic-based Method for Unsupervised Commonsense Question Answering

Abstract: Unsupervised commonsense question answering is appealing since it does not rely on any labeled task data. Among existing work, a popular solution is to use pre-trained language models to score candidate choices directly conditioned on the question or context. However, such scores from language models can be easily affected by irrelevant factors, such as word frequencies, sentence structures, etc. These distracting factors may not only mislead the model to choose a wrong answer but also make it oversensitive to…

Cited by 8 publications (11 citation statements)
References 25 publications (44 reference statements)
“…Therefore, we report the experimental results on their development sets for a fair comparison (Shwartz et al., 2020). For COPA, which only provides development and test sets, we follow Niu et al. (2021) to train models on the development set and evaluate performance on the test set. For the commonsense KG, we adopt ConceptNet (Speer et al., 2017), a general-domain and task-agnostic CSKG, as our external knowledge source G for all the above models and tasks.…”
Section: Methods
confidence: 99%
“…This is computed as the conditional probability of the answer given a domain-specific prefix, such as "The sentiment of the movie is" for sentiment analysis or "The answer is" for general QA tasks. SEQA (Niu et al., 2021) mitigates the sensitivity to word choice by generating answers using GPT-2 and selecting the answer choice most similar to the generated answers.…”
Section: Plausibility Scoring
confidence: 99%
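To make the prefix-conditioned scoring concrete, below is a minimal sketch of scoring answer candidates by their conditional log-probability under GPT-2, assuming the HuggingFace transformers library; the prompt, candidates, and helper name are illustrative, not taken from the cited papers.

```python
# A minimal sketch of prefix-conditioned LM scoring (illustrative, not
# the cited papers' exact setup).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def answer_log_prob(context: str, answer: str) -> float:
    """Sum of token log-probabilities of `answer` conditioned on `context`."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    ans_ids = tokenizer(" " + answer, return_tensors="pt").input_ids
    input_ids = torch.cat([ctx_ids, ans_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # logits at position p predict the token at position p + 1.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    ans_positions = range(ctx_ids.size(1) - 1, input_ids.size(1) - 1)
    return sum(log_probs[p, input_ids[0, p + 1]].item() for p in ans_positions)

context = "The movie was dull and far too long. The sentiment of the movie is"
candidates = ["positive", "negative"]
scores = {c: answer_log_prob(context, c) for c in candidates}
print(max(scores, key=scores.get))  # expected: "negative"
```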
“…In multiple-choice question answering (MCQA) tasks, zero-shot methods typically rely on the language model (LM) probabilities as a proxy for plausibility, predicting the answer choice with the highest probability conditioned on the question. The LM score is a naïve proxy for plausibility, since it confounds factors such as length, unigram frequency, and more (Holtzman et al., 2021; Niu et al., 2021). Indeed, in Figure 1, a GPT-2 based LM score incorrectly predicts that the woman hired a lawyer because she decided to run for office, rather than because she decided to sue her employer.…”
Section: Introduction
confidence: 99%
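As a toy illustration of the length confound named above, the sketch below compares a raw sum of token log-probabilities with a per-token average for the two explanations in the passage's Figure 1 example; the model choice and the averaging are assumptions made for illustration, not the cited papers' method.

```python
# Toy illustration: a raw sum of log-probabilities penalizes longer
# candidates, which a per-token average partially corrects.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def candidate_scores(context: str, answer: str):
    ids = tokenizer(context + " " + answer, return_tensors="pt").input_ids
    n_ctx = tokenizer(context, return_tensors="pt").input_ids.size(1)
    with torch.no_grad():
        log_probs = torch.log_softmax(model(ids).logits[0, :-1], dim=-1)
    token_lps = [log_probs[p, ids[0, p + 1]].item()
                 for p in range(n_ctx - 1, ids.size(1) - 1)]
    return sum(token_lps), sum(token_lps) / len(token_lps)  # raw vs. per-token

ctx = "The woman hired a lawyer because"
for ans in ["she decided to sue her employer.",
            "she decided to run for office."]:
    raw, norm = candidate_scores(ctx, ans)
    print(f"{ans!r}: raw={raw:.2f}, per-token={norm:.2f}")
```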
“…Semantic similarity. Niu et al. (2021) show that semantic similarity matching can make PLMs robust against irrelevant factors such as word frequencies. Specifically, we first use PLMs to generate plausible answers, then compute the similarity between each generated answer and each of the provided answer candidates, and finally select the candidate with the highest similarity score as the correct answer.…”
Section: Linguistic Reasoning
confidence: 99%
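A minimal sketch of this generate-then-match idea follows, assuming the transformers and sentence-transformers libraries; the sampling settings, encoder choice, and example question are illustrative assumptions rather than the exact SEQA configuration of Niu et al. (2021).

```python
# Sketch: generate free-form answers with a PLM, then pick the candidate
# most semantically similar to the generated answers (illustrative settings).
from transformers import pipeline
from sentence_transformers import SentenceTransformer, util

generator = pipeline("text-generation", model="gpt2")
encoder = SentenceTransformer("all-MiniLM-L6-v2")

question = "What do people typically do when they feel tired?"
candidates = ["go to sleep", "buy a car", "paint the fence"]

# Step 1: sample several plausible free-form answers from the PLM.
generations = generator(question, max_new_tokens=10, num_return_sequences=5,
                        do_sample=True, pad_token_id=50256)
answers = [g["generated_text"][len(question):].strip() for g in generations]

# Step 2: pick the candidate most similar (on average) to the samples.
cand_emb = encoder.encode(candidates, convert_to_tensor=True)
ans_emb = encoder.encode(answers, convert_to_tensor=True)
sim = util.cos_sim(cand_emb, ans_emb).mean(dim=1)  # one score per candidate
print(candidates[int(sim.argmax())])
```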