Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)
DOI: 10.18653/v1/2020.emnlp-main.581
Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction

Abstract: We propose a novel language-independent approach to improve the efficiency of Grammatical Error Correction (GEC) by dividing the task into two subtasks: Erroneous Span Detection (ESD) and Erroneous Span Correction (ESC). ESD identifies grammatically incorrect text spans with an efficient sequence tagging model. Then, ESC leverages a seq2seq model to take the sentence with annotated erroneous spans as input and output only the corrected text for these spans. Experiments show our approach performs comparably t…
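The abstract outlines a two-stage pipeline: a lightweight sequence-tagging model first marks erroneous spans (ESD), and a seq2seq model then rewrites only those spans (ESC) instead of regenerating the whole sentence. The sketch below illustrates that control flow only; detect_erroneous_spans and correct_spans are hypothetical toy stand-ins (a confusion list and a lookup table), not the paper's trained models.

```python
# Minimal sketch of the ESD -> ESC pipeline described in the abstract.
# Both stages are toy stand-ins so the control flow is runnable end to end;
# the paper uses an efficient neural sequence tagger and a seq2seq corrector.

from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices of an erroneous span


def detect_erroneous_spans(tokens: List[str]) -> List[Span]:
    """ESD step: flag suspicious tokens (toy confusion list, for illustration)."""
    suspicious = {"goed", "recieve"}  # assumption: tiny hand-made error list
    return [(i, i + 1) for i, tok in enumerate(tokens) if tok.lower() in suspicious]


def correct_spans(tokens: List[str], spans: List[Span]) -> List[str]:
    """ESC step: rewrite only the annotated spans, leaving other tokens untouched."""
    corrections = {"goed": "went", "recieve": "receive"}  # hypothetical corrector output
    out = list(tokens)
    for start, end in spans:
        out[start:end] = [corrections.get(t.lower(), t) for t in tokens[start:end]]
    return out


if __name__ == "__main__":
    sentence = "She goed to the store to recieve her package".split()
    spans = detect_erroneous_spans(sentence)    # ESD: locate erroneous spans
    corrected = correct_spans(sentence, spans)  # ESC: correct only those spans
    print(" ".join(corrected))  # -> "She went to the store to receive her package"
```

Because the expensive seq2seq decoding is restricted to the annotated spans rather than the full sentence, most tokens are copied unchanged, which is where the efficiency gain described in the abstract comes from.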

Cited by 32 publications (24 citation statements)
References 41 publications
“…For example, GECToR outperforms baseline models on English GEC datasets, but its performance severely degenerates on the Chinese GEC task, as shown in Table 3. There also exist methods that integrate seq2seq models with sequence tagging methods into a pipeline system to improve the performance or efficiency (Chen et al. 2020; Hinson, Huang, and Chen 2020), which cannot be optimized end-to-end.…”
Section: GEC by Generating Edits
confidence: 99%
“…We follow recent work in English GEC to conduct experiments in the restricted training setting of the BEA-2019 GEC shared task (Bryant et al., 2019): We use the Lang-8 Corpus of Learner English (Mizumoto et al., 2011), NUCLE (Dahlmeier et al., 2013), FCE (Yannakoudakis et al., 2011) and W&I+LOCNESS (Granger; Bryant et al., 2019) as our GEC training data. For facilitating fair comparison in the efficiency evaluation, we follow the previous studies (Omelianchuk et al., 2020; Chen et al., 2020) which conduct GEC efficiency evaluation to use the CoNLL-2014 (Ng et al., 2014) dataset that contains 1,312 sentences as our main test set, and evaluate the speedup as well as Max-Match (Dahlmeier and Ng, 2012) precision, recall and F0.5 using their official evaluation scripts. For validation, we use CoNLL-2013, which contains 1,381 sentences, as our validation set.…”
Section: Data and Model Configuration
confidence: 99%
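The Max-Match (M2) evaluation mentioned in the excerpt above scores proposed edits with precision, recall, and F0.5, where F0.5 weights precision more heavily than recall. A minimal sketch of that computation follows; the edit counts are made-up placeholders for illustration, not numbers from the paper or any cited work.

```python
# Minimal sketch of the F_beta score used by Max-Match scoring (beta = 0.5,
# which emphasizes precision over recall). The counts below are placeholders.

def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Return the F_beta score; 0.0 if both precision and recall are zero."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Hypothetical edit-level counts from comparing system edits to gold edits.
    tp, fp, fn = 40, 20, 35
    p = tp / (tp + fp)  # precision over proposed edits
    r = tp / (tp + fn)  # recall over gold edits
    print(f"P={p:.3f} R={r:.3f} F0.5={f_beta(p, r):.3f}")
```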
“…Table 5: The performance and online inference efficiency evaluation of efficient GEC models on CoNLL-14. For the models with , their performance and speedup numbers are from Chen et al. (2020), who evaluate the online efficiency in the same runtime setting (e.g., GPU and runtime libraries) as ours. The underlines indicate that the speedup numbers of the models are evaluated with TensorFlow based on their released code, which are not strictly comparable here.…”
Section: Evaluation for Aggressive Decoding
confidence: 99%