2019
DOI: 10.13053/cys-23-3-3271

Multi-Head Multi-Layer Attention to Deep Language Representations for Grammatical Error Detection

Abstract: It is known that a deep neural network model pre-trained with large-scale data greatly improves the accuracy of various tasks, especially when there are resource constraints. However, the information needed to solve a given task can vary, and simply using the output of the final layer is not necessarily sufficient. Moreover, to our knowledge, exploiting large language representation models to detect grammatical errors has not yet been studied. In this work, we investigate the effect of utilizing information no…
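The abstract argues that the final BERT layer alone may not carry all the information a task needs. The sketch below is our own illustration of that idea, not the authors' implementation: a simplified scalar attention over all hidden layers feeding a per-token error classifier. It assumes the HuggingFace transformers library; the class name, checkpoint (bert-base-cased), and example sentence are placeholders, and the paper's actual multi-head attention formulation is richer than this single-weight-per-layer variant.

```python
# Sketch only: weighted combination of all BERT layers for token-level GED.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast


class MultiLayerAttentionGED(nn.Module):
    def __init__(self, model_name="bert-base-cased", num_labels=2):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name, output_hidden_states=True)
        num_layers = self.bert.config.num_hidden_layers + 1  # embedding layer + 12 encoder layers
        # One learnable scalar per layer, normalized with softmax (a simplification of the paper's attention).
        self.layer_logits = nn.Parameter(torch.zeros(num_layers))
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        hidden = torch.stack(outputs.hidden_states, dim=0)        # (layers, batch, seq, hidden)
        weights = torch.softmax(self.layer_logits, dim=0).view(-1, 1, 1, 1)
        mixed = (weights * hidden).sum(dim=0)                      # weighted sum over layers
        return self.classifier(mixed)                              # per-token correct/incorrect logits


tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
model = MultiLayerAttentionGED()
batch = tokenizer(["He go to school yesterday ."], return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])        # shape: (1, seq_len, 2)
```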

Cited by 20 publications (23 citation statements)
References 16 publications
Citing publications: 2019–2023

Citation statements, ordered by relevance:
“…BERT can integrate information in raw corpora (BooksCorpus and English Wikipedia) while considering task-specific information contained in the target dataset. Kaneko and Komachi (2019) use BERT contextualized representations to achieve state-of-the-art results for word-based GED tasks. In addition to improving results in the GED task, BERT (Devlin et al., 2018) has been shown to be a powerful feature extractor for various other tasks.…”
Section: Sentence Embedding
confidence: 99%
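As a side note to the quoted "Sentence Embedding" discussion, here is a minimal, hedged sketch of using BERT as a frozen feature extractor and mean-pooling final-layer token vectors into a sentence embedding; the checkpoint name and the pooling choice are our assumptions, not taken from the cited papers.

```python
# Sketch only: frozen BERT as a feature extractor for a sentence embedding.
import torch
from transformers import BertModel, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased").eval()

sentence = "She have many books on her shelf ."
enc = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    out = bert(**enc)

mask = enc["attention_mask"].unsqueeze(-1)                          # mask out padding positions
embedding = (out.last_hidden_state * mask).sum(1) / mask.sum(1)     # mean-pooled vector, shape (1, 768)
```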
“…The goal of their study was to predict the token-level labels on a sentence level using the attention mechanism for zero-shot sequence labeling. Kaneko and Komachi (2019) proposed a model applying attention to each layer of BERT for GED and achieved state-of-the-art results in word-level GED tasks. Our BERT model predicts grammatical quality on a sentence level for re-ranking.…”
Section: Related Work
confidence: 99%
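The quoted passage describes scoring grammatical quality at the sentence level and using the score for re-ranking. A minimal sketch of that workflow, assuming a sequence classifier whose positive label means "grammatical", might look like the following; the checkpoint and label convention are placeholders rather than the cited authors' setup, and the classification head here is untrained.

```python
# Sketch only: rank candidate sentences by a sentence-level grammaticality score.
import torch
from transformers import BertForSequenceClassification, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
scorer = BertForSequenceClassification.from_pretrained("bert-base-cased", num_labels=2).eval()

candidates = [
    "He goes to school every day .",
    "He go to school every day .",
]

def grammaticality(sentence: str) -> float:
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        logits = scorer(**enc).logits
    # Assumption: label index 1 means "grammatical"; a freshly initialized head gives arbitrary scores.
    return torch.softmax(logits, dim=-1)[0, 1].item()

best = max(candidates, key=grammaticality)   # keep the highest-scoring candidate
```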
“…Bidirectional Encoder Representations from Transformers (BERT) (Devlin et al., 2019) can consider information from large-scale raw corpora as well as task-specific information by fine-tuning on the target task corpora. Moreover, BERT is known to be effective in distinguishing grammatical sentences from ungrammatical ones (Kaneko and Komachi, 2019). They proposed a grammatical error detection (GED) model based on BERT that achieved state-of-the-art results in word-level GED tasks.…”
Section: Introduction
confidence: 99%
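The quoted passage refers to fine-tuning BERT on target-task corpora for word-level GED. A minimal fine-tuning sketch, treating GED as binary token classification with subword label alignment, is shown below; the optimizer, learning rate, and label scheme are typical assumptions rather than the configuration reported in the paper.

```python
# Sketch only: one fine-tuning step for word-level GED as binary token classification.
import torch
from transformers import BertForTokenClassification, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
model = BertForTokenClassification.from_pretrained("bert-base-cased", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

words = ["He", "go", "to", "school", "yesterday", "."]
labels_per_word = [0, 1, 0, 0, 0, 0]   # 1 marks the erroneous token ("go")

enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
# Map word-level labels onto subword tokens; special tokens get -100 and are ignored by the loss.
labels = [labels_per_word[i] if i is not None else -100 for i in enc.word_ids(0)]
labels = torch.tensor([labels])

loss = model(**enc, labels=labels).loss
loss.backward()
optimizer.step()
```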