2020
DOI: 10.48550/arxiv.2004.14074
Preprint

Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning

Cited by 4 publications (5 citation statements)
References 0 publications
“…The pretraining model learns a context-dependent representation of each token of an input sentence using almost unlimited text, and it implicitly learns general syntactic and semantic knowledge. It can transfer knowledge learned from the open domain to downstream tasks to improve low-resource tasks, and it is also very helpful for low-resource language processing [13].…”
Section: Pretraining
confidence: 99%
“…This provides the background knowledge in a novel way, as an alternative to the use of knowledge graphs mentioned at the beginning of this section. In contrast, Tamborrino et al (2020) rephrase the causal reasoning task into an "A because B" format and calculate the likelihood of each token in the input by masking one at a time, leading to much higher performance when compared to the use of the same model for fine-tuning on the target task.…”
Section: Related Work
confidence: 99%
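
The scoring scheme described in the preceding statement can be sketched with a masked language model: phrase each premise-candidate pair as "A because B", mask one token at a time, and sum the log-probabilities the model assigns to the true tokens. The sketch below is illustrative only; the model name (roberta-base), the sentence templating, and the example candidates are assumptions, not the exact setup of Tamborrino et al. (2020).

```python
# Minimal sketch of token-by-token masked-LM scoring for "A because B" hypotheses.
import torch
from transformers import RobertaForMaskedLM, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")
model.eval()

def pseudo_log_likelihood(sentence: str) -> float:
    """Sum log P(token | rest of sentence), masking one position at a time."""
    input_ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    score = 0.0
    for pos in range(1, input_ids.size(0) - 1):  # skip <s> and </s>
        masked = input_ids.clone()
        true_id = masked[pos].item()
        masked[pos] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits
        score += torch.log_softmax(logits[0, pos], dim=-1)[true_id].item()
    return score

# Pick the candidate cause whose "A because B" sentence scores highest
# (hypothetical COPA-style example).
premise = "The man broke his toe"
candidates = ["he dropped a hammer on his foot.", "he got a hole in his sock."]
scores = [pseudo_log_likelihood(f"{premise} because {c}") for c in candidates]
print(candidates[scores.index(max(scores))])
```

Because the scoring uses only the pretrained masked-LM head, no task-specific fine-tuning is required, which is the contrast with fine-tuning that the citing authors highlight.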
“…Deep approaches aim to gain more information about a particular input, for example by using knowledge graphs to learn more about the entities and events mentioned in the input or applying formal logic to deterministically find the relation that is sought for, in order to classify that input correctly (Furbach, Gordon, and Schon 2015; Furbach and Schon 2016; Blass and Forbus 2017; Siebert, Schon, and Stolzenburg 2019; Goodwin and Demner-Fushman 2019). On the other hand, the approaches of the broad type attempt to cover a wider range of instances in their method, for example by determining the types of syntactic or semantic features that are generally used in expressing a particular relation (Gordon, Bejan, and Sagae 2011; Goodwin et al. 2012; Jabeen, Gao, and Andreae 2014; Rahimtoroghi, Hernandez, and Walker 2017; Tamborrino et al. 2020; Iter et al. 2020). This paper attempts to tackle the causal reasoning task with two approaches, one of the deep and one of the broad type.…”
Section: Introduction
confidence: 99%
“…Information Masking: Previous work [35,36,37,38] on question answering has shown that some models tend to learn from artificial or superficial patterns in the dataset, and they can still predict the correct answer after important clues (to humans) in the premise are masked. Therefore, we challenge our well-trained model, a RoBERTa with MLM fine-tuned on the original training set, by masking the context, the question, and both, respectively, during inference.…”
Section: Analyses
confidence: 99%
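
The information-masking check described in the preceding statement can be sketched as an inference-time ablation: replace the context, the question, or both with mask tokens of the same length and re-run the already trained model on each variant. The field names, mask handling, and example input below are assumptions for illustration, not the cited authors' exact procedure.

```python
# Minimal sketch of the information-masking ablation: build copies of an
# example with the context, the question, or both replaced by <mask> tokens,
# then feed every variant to the trained model at inference time.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")

def blank_out(text: str) -> str:
    """Replace every token with the mask token, preserving sequence length."""
    n = len(tokenizer.tokenize(text))
    return " ".join([tokenizer.mask_token] * n)

def masked_variants(context: str, question: str) -> dict:
    return {
        "original":      (context, question),
        "mask_context":  (blank_out(context), question),
        "mask_question": (context, blank_out(question)),
        "mask_both":     (blank_out(context), blank_out(question)),
    }

# Hypothetical example; each variant would be scored by the fine-tuned model.
variants = masked_variants("Tom poured water on the campfire.",
                           "What happened to the fire?")
for name, (ctx, q) in variants.items():
    print(f"{name}: {ctx} </s> {q}")
```

If accuracy stays high even when the context or question is fully masked, the model is likely exploiting superficial dataset patterns rather than the clues a human would need, which is the diagnostic the citing authors apply.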