Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021
DOI: 10.18653/v1/2021.acl-long.385
Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction

Abstract: We investigate the problem of Chinese Grammatical Error Correction (CGEC) and present a new framework named Tail-to-Tail (TtT) non-autoregressive sequence prediction to address the deep issues hidden in CGEC. Considering that most tokens are correct and can be conveyed directly from source to target, and that error positions can be estimated and corrected based on bidirectional context information, we employ a BERT-initialized Transformer Encoder as the backbone model to conduct information modeling and…
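To make the framework sketched in the abstract concrete, here is a minimal, hypothetical sketch of a BERT-initialized Transformer encoder that predicts a character for every input position in parallel. It assumes PyTorch and the Hugging Face transformers library; module names and hyper-parameters are illustrative, not the authors' released implementation.

```python
# Minimal sketch: a BERT-initialized Transformer encoder that predicts a
# (possibly corrected) character for every input position in parallel.
# Assumptions: PyTorch + Hugging Face `transformers`; the classification head
# below is untrained, so outputs are illustrative only.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

class ParallelCorrector(nn.Module):
    def __init__(self, pretrained="bert-base-chinese"):
        super().__init__()
        self.encoder = BertModel.from_pretrained(pretrained)   # BERT-initialized backbone
        hidden = self.encoder.config.hidden_size
        vocab = self.encoder.config.vocab_size
        self.classifier = nn.Linear(hidden, vocab)              # per-position output distribution

    def forward(self, input_ids, attention_mask):
        # Bidirectional context for every token; no left-to-right decoding.
        states = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        return self.classifier(states)                           # (batch, seq_len, vocab) logits

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
model = ParallelCorrector()
batch = tokenizer(["今天天气怎么羊"], return_tensors="pt")       # "羊" is a typo for "样"
logits = model(batch["input_ids"], batch["attention_mask"])
corrected_ids = logits.argmax(dim=-1)                            # one prediction per source position
print(tokenizer.decode(corrected_ids[0], skip_special_tokens=True))
```

Because every position is predicted from bidirectional context in a single forward pass, inference cost does not grow with the number of decoding steps as it would for an autoregressive decoder.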

Cited by 14 publications (23 citation statements) | References 29 publications

Citation statements:
“…In recent years, BERT (Devlin et al., 2019) based models have dominated the research of Chinese spelling correction (Cheng et al., 2020; Zhang et al., 2020; Bao et al., 2020; Liu et al., 2021; Li and Shi, 2021; Huang et al., 2021), which follow the paradigm of non-autoregressive generation. Typically, these models generate corrected characters for all input characters in parallel, where the generated characters can be the same as the input characters.…”
Section: ID Text Correction (mentioning)
confidence: 99%
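The parallel decoding pattern this statement describes can be illustrated with an off-the-shelf masked language model; the rough sketch below assumes the transformers library and bert-base-chinese as a stand-in for a fine-tuned correction model, so it shows the decoding paradigm rather than any specific cited system.

```python
# Rough sketch of the parallel decoding paradigm: one forward pass produces a
# predicted character for every input position, and most predictions simply
# reproduce the input. A vanilla masked LM stands in for a fine-tuned
# correction model here, so the output is illustrative only.
import torch
from transformers import BertForMaskedLM, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
model = BertForMaskedLM.from_pretrained("bert-base-chinese")

inputs = tokenizer("我今天很高心", return_tensors="pt")   # "心" is a typo for "兴"
with torch.no_grad():
    logits = model(**inputs).logits                        # (1, seq_len, vocab), one pass
pred_ids = logits.argmax(dim=-1)[0]                        # every position predicted in parallel

changed = [
    (i, tokenizer.convert_ids_to_tokens(int(src)), tokenizer.convert_ids_to_tokens(int(tgt)))
    for i, (src, tgt) in enumerate(zip(inputs["input_ids"][0], pred_ids))
    if int(src) != int(tgt)
]
print(changed)   # positions where the prediction differs from the input character
```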
“…The inference efficiency is not only required for neural machine translation but is also indispensable for many other text generation tasks [81], [100], [101]. Existing works that introduce NAR techniques into text generation tasks focus on automatic speech recognition [102], [103], [104], text summarization [105], grammatical error correction [106], [107], and dialogue [108], [109]. Resembling the encountered challenge of NAT models in Section 2.…”
Section: Text Generation (mentioning)
confidence: 99%
“…Thus, NAR methods are more feasible for this task. Li et al. [107] focus on the variable-length correction scenario for Chinese GEC. They employ BERT to initialize the encoder and add a CRF layer on the initialized encoder, augmented by a focal loss penalty strategy to capture the target-side dependency.…”
Section: Text Generation (mentioning)
confidence: 99%
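The "focal loss penalty strategy" referred to here appears to be the standard focal loss applied to the per-position predictions; the sketch below assumes that reading, with an illustrative gamma value, and is not taken from the paper's code.

```python
# Sketch of a focal-loss penalty on per-position predictions: it down-weights
# the many easy (already-correct) tokens so training focuses on the hard,
# erroneous positions. gamma is an illustrative choice, not from the paper.
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, ignore_index=-100):
    # logits: (batch, seq_len, vocab); targets: (batch, seq_len) of token ids
    log_probs = F.log_softmax(logits, dim=-1)
    safe_targets = targets.clamp(min=0)                            # keep gather valid for ignored positions
    tgt_log_p = log_probs.gather(-1, safe_targets.unsqueeze(-1)).squeeze(-1)
    p_t = tgt_log_p.exp()
    loss = -((1.0 - p_t) ** gamma) * tgt_log_p                     # FL = -(1 - p)^gamma * log p
    mask = (targets != ignore_index).float()                       # drop padded / ignored positions
    return (loss * mask).sum() / mask.sum().clamp(min=1.0)
```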
“…Sequence labeling methods are widely used for CGED, such as feature-based statistical models (Chang et al., 2012) and neural models (Fu et al., 2018). Due to the effectiveness of BERT (Devlin et al., 2019) in many other NLP applications, recent studies adopt BERT as the basic architecture of CGED models (Fang et al., 2020; Wang et al., 2020b; Li and Shi, 2021). Wang et al. (2020b) propose a model that combines ResNet and BERT to achieve state-of-the-art results on the CGED-2020 task.…”
Section: Related Work (mentioning)
confidence: 99%
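The sequence-labeling setup these CGED systems share can be sketched as token classification on top of BERT. The label set below follows the commonly used CGED error categories (redundant, missing, selection, word-order) but is an assumption here, and the classification head is untrained, so this is a sketch rather than any cited system.

```python
# Sketch: grammatical error *detection* as token-level sequence labeling with a
# BERT backbone. The tag set mirrors the usual CGED scheme but is assumed here.
from transformers import BertForTokenClassification, BertTokenizerFast

LABELS = ["O", "R", "M", "S", "W"]   # O = correct; R/M/S/W = assumed CGED error types
tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
model = BertForTokenClassification.from_pretrained(
    "bert-base-chinese",
    num_labels=len(LABELS),
    id2label=dict(enumerate(LABELS)),
)

inputs = tokenizer("他是很喜欢打篮球。", return_tensors="pt")
pred = model(**inputs).logits.argmax(dim=-1)[0]      # one error tag per character
print([LABELS[i] for i in pred])                     # untrained head, illustrative output only
```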
“…Wang et al. (2020b) propose a model that combines ResNet and BERT to achieve state-of-the-art results on the CGED-2020 task. Li and Shi (2021) apply a CRF layer on BERT to introduce the dependency of tokens. However, neural models usually require a large amount of training data, and manually annotating a large corpus is expensive and time-consuming.…”
Section: Related Work (mentioning)
confidence: 99%
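Placing a CRF layer on top of BERT emissions, as Li and Shi (2021) are described doing, generically looks like the sketch below; it assumes the third-party pytorch-crf package and illustrative tag and layer sizes rather than the cited authors' code.

```python
# Sketch: CRF layer on top of BERT token emissions so that neighboring label
# decisions depend on each other. Assumes the third-party `pytorch-crf` package.
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF

class BertCrfTagger(nn.Module):
    def __init__(self, num_tags, pretrained="bert-base-chinese"):
        super().__init__()
        self.bert = BertModel.from_pretrained(pretrained)
        self.emit = nn.Linear(self.bert.config.hidden_size, num_tags)   # per-token emission scores
        self.crf = CRF(num_tags, batch_first=True)                      # models tag-to-tag dependencies

    def forward(self, input_ids, attention_mask, tags=None):
        states = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        emissions = self.emit(states)
        mask = attention_mask.bool()
        if tags is not None:
            return -self.crf(emissions, tags, mask=mask)    # negative log-likelihood for training
        return self.crf.decode(emissions, mask=mask)        # Viterbi-decoded tag sequences
```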