Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications 2019
DOI: 10.18653/v1/w19-4427

Neural Grammatical Error Correction Systems with Unsupervised Pre-training on Synthetic Data

Abstract: Considerable effort has been made to address the data sparsity problem in neural grammatical error correction. In this work, we propose a simple and surprisingly effective unsupervised synthetic error generation method based on confusion sets extracted from a spellchecker to increase the amount of training data. Synthetic data is used to pre-train a Transformer sequence-to-sequence model, which not only improves over a strong baseline trained on authentic error-annotated data, but also enables the development …
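
The method described in the abstract generates training pairs by corrupting clean sentences with confusion-set substitutions. Below is a minimal sketch of that idea, assuming hard-coded toy confusion sets (the paper extracts them from a spellchecker) and illustrative noise probabilities; it is not the authors' implementation.

```python
import random

# Toy confusion sets; the paper builds these from spellchecker suggestions
# for each word. The entries here are illustrative placeholders.
CONFUSION_SETS = {
    "their": ["there", "they're"],
    "affect": ["effect"],
    "piece": ["peace"],
}

def corrupt_sentence(tokens, sub_prob=0.1, del_prob=0.05, swap_prob=0.05):
    """Introduce synthetic errors into a clean token sequence.

    Applies confusion-set substitutions plus random deletions and
    adjacent swaps, yielding a noisy 'source' sentence to pair with
    the clean 'target' for pre-training a seq2seq model.
    """
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        r = random.random()
        if tok in CONFUSION_SETS and r < sub_prob:
            # Replace the word with one of its confusable alternatives.
            out.append(random.choice(CONFUSION_SETS[tok]))
        elif r < sub_prob + del_prob:
            # Drop the token to simulate a missing word.
            pass
        elif r < sub_prob + del_prob + swap_prob and i + 1 < len(tokens):
            # Swap with the next token to simulate a word-order error.
            out.extend([tokens[i + 1], tok])
            i += 1
        else:
            out.append(tok)
        i += 1
    return out

clean = "I know their answer will affect the peace talks".split()
noisy = corrupt_sentence(clean, sub_prob=0.5, del_prob=0.1, swap_prob=0.1)
print(" ".join(noisy), "->", " ".join(clean))
```

Running the corruption over a large monolingual corpus produces unlimited (noisy, clean) pairs for unsupervised pre-training, after which the model is fine-tuned on authentic error-annotated data.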

Cited by 146 publications (161 citation statements). References 37 publications.
“…We evaluate the performance of PRETLARGE on test sets and compare the scores with the current top models. Table 5 shows a remarkable result, that is, Grundkiewicz et al. (2019).…”
Section: Comparison With Current Top Models
confidence: 95%
“…The ensembles combine the four models from the preceding row.

(Grundkiewicz et al., 2019)   69.5   64.2   61.2
(Kiyono et al., 2019)         70.2   65.0   61.4
(Lichtarge et al., 2019)      -      60.4   63.3
(Xu et al., 2019)             66.6   63.2   62.6
(Omelianchuk et al., 2020)    73.7   66.5
this work - unscored          71.9   65.3   64.7
this work - scored            73.0   66.8   64.9

…ment, as seen in the example-level analysis in Section 7. Other methods for scoring individual examples should be explored.…”
Section: Future Work
confidence: 99%
“…A considerable disadvantage of this approach is that NMT-based systems require an enormous amount of training data to achieve good results, while the availability of parallel correction data is limited in many languages. The current leading methods for English GEC both rely on pre-training models with a large amount of artificially generated data (Grundkiewicz et al., 2019; Kiyono et al., 2019). In this work, we aim to avoid this issue by combining several different models that perform corrections in different ways.…”
Section: Introduction
confidence: 99%