Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications 2019
DOI: 10.18653/v1/w19-4427

Neural Grammatical Error Correction Systems with Unsupervised Pre-training on Synthetic Data

Abstract: Considerable effort has been made to address the data sparsity problem in neural grammatical error correction. In this work, we propose a simple and surprisingly effective unsupervised synthetic error generation method based on confusion sets extracted from a spellchecker to increase the amount of training data. Synthetic data is used to pre-train a Transformer sequence-to-sequence model, which not only improves over a strong baseline trained on authentic error-annotated data, but also enables the development …
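
The method described in the abstract generates training pairs by corrupting clean sentences with confusion-set substitutions. Below is a minimal sketch of that idea, assuming hard-coded toy confusion sets (the paper extracts them from a spellchecker) and illustrative noise probabilities; it is not the authors' implementation.

```python
import random

# Toy confusion sets; the paper builds these from spellchecker suggestions
# for each word. The entries here are illustrative placeholders.
CONFUSION_SETS = {
    "their": ["there", "they're"],
    "affect": ["effect"],
    "piece": ["peace"],
}

def corrupt_sentence(tokens, sub_prob=0.1, del_prob=0.05, swap_prob=0.05):
    """Introduce synthetic errors into a clean token sequence.

    Applies confusion-set substitutions plus random deletions and
    adjacent swaps, yielding a noisy 'source' sentence to pair with
    the clean 'target' for pre-training a seq2seq model.
    """
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        r = random.random()
        if tok in CONFUSION_SETS and r < sub_prob:
            # Replace the word with one of its confusable alternatives.
            out.append(random.choice(CONFUSION_SETS[tok]))
        elif r < sub_prob + del_prob:
            # Drop the token to simulate a missing word.
            pass
        elif r < sub_prob + del_prob + swap_prob and i + 1 < len(tokens):
            # Swap with the next token to simulate a word-order error.
            out.extend([tokens[i + 1], tok])
            i += 1
        else:
            out.append(tok)
        i += 1
    return out

clean = "I know their answer will affect the peace talks".split()
noisy = corrupt_sentence(clean, sub_prob=0.5, del_prob=0.1, swap_prob=0.1)
print(" ".join(noisy), "->", " ".join(clean))
```

Running the corruption over a large monolingual corpus produces unlimited (noisy, clean) pairs for unsupervised pre-training, after which the model is fine-tuned on authentic error-annotated data.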

Cited by 146 publications (161 citation statements). References 37 publications.
“…We evaluate the performance of PRETLARGE on test sets and compare the scores with the current top models. Table 5 shows a remarkable result, that is, Grundkiewicz et al. (2019).…”
Section: Comparison With Current Top Models
confidence: 95%
“…The ensembles combine the four models from the preceding row.

(Grundkiewicz et al., 2019)   69.5   64.2   61.2
(Kiyono et al., 2019)         70.2   65.0   61.4
(Lichtarge et al., 2019)      -      60.4   63.3
(Xu et al., 2019)             66.6   63.2   62.6
(Omelianchuk et al., 2020)    73.7   66.5
this work - unscored          71.9   65.3   64.7
this work - scored            73.0   66.8   64.9

…ment, as seen in the example-level analysis in Section 7. Other methods for scoring individual examples should be explored.…”
Section: Future Work
confidence: 99%
“…A considerable disadvantage of this approach is that NMT-based systems require an enormous amount of training data to achieve good results, while the availability of parallel correction data is limited in many languages. The current leading methods for English GEC both rely on pre-training models with a large amount of artificially generated data (Grundkiewicz et al., 2019; Kiyono et al., 2019). In this work, we aim to avoid this issue by combining several different models that perform corrections in different ways.…”
Section: Introduction
confidence: 99%