“…As a rule of thumb for fine-tuning BERT on downstream tasks, Devlin et al (2019) suggested a minimal hyperparameter tuning strategy relying on a grid search over the following ranges: learning rate ∈ {2e−5, 3e−5, 4e−5, 5e−5}, number of training epochs ∈ {3, 4}, batch size ∈ {16, 32}, and a fixed dropout rate of 0.1. These weakly justified suggestions are followed largely uncritically in the literature (Alsentzer et al, 2019; Beltagy et al, 2019; Sung et al, 2019). Given the relatively small size of our datasets, we use batch sizes ∈ {4, 8, 16, 32}.…”