Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021
DOI: 10.18653/v1/2021.acl-long.385
Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction

Abstract: We investigate the problem of Chinese Grammatical Error Correction (CGEC) and present a new framework named Tail-to-Tail (TtT) non-autoregressive sequence prediction to address the deep issues hidden in CGEC. Considering that most tokens are correct and can be conveyed directly from source to target, and that error positions can be estimated and corrected based on bidirectional context information, we employ a BERT-initialized Transformer Encoder as the backbone model to conduct information modeling and…
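To make the framework sketched in the abstract concrete, here is a minimal, hypothetical sketch of a BERT-initialized Transformer encoder that predicts a character for every input position in parallel. It assumes PyTorch and the Hugging Face transformers library; module names and hyper-parameters are illustrative, not the authors' released implementation.

```python
# Minimal sketch: a BERT-initialized Transformer encoder that predicts a
# (possibly corrected) character for every input position in parallel.
# Assumptions: PyTorch + Hugging Face `transformers`; the classification head
# below is untrained, so outputs are illustrative only.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

class ParallelCorrector(nn.Module):
    def __init__(self, pretrained="bert-base-chinese"):
        super().__init__()
        self.encoder = BertModel.from_pretrained(pretrained)   # BERT-initialized backbone
        hidden = self.encoder.config.hidden_size
        vocab = self.encoder.config.vocab_size
        self.classifier = nn.Linear(hidden, vocab)              # per-position output distribution

    def forward(self, input_ids, attention_mask):
        # Bidirectional context for every token; no left-to-right decoding.
        states = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        return self.classifier(states)                           # (batch, seq_len, vocab) logits

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
model = ParallelCorrector()
batch = tokenizer(["今天天气怎么羊"], return_tensors="pt")       # "羊" is a typo for "样"
logits = model(batch["input_ids"], batch["attention_mask"])
corrected_ids = logits.argmax(dim=-1)                            # one prediction per source position
print(tokenizer.decode(corrected_ids[0], skip_special_tokens=True))
```

Because every position is predicted from bidirectional context in a single forward pass, inference cost does not grow with the number of decoding steps as it would for an autoregressive decoder.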

Cited by 14 publications (23 citation statements) | References 29 publications

Citation statements:
“…In recent years, BERT (Devlin et al., 2019) based models have dominated the research of Chinese spelling correction (Cheng et al., 2020; Zhang et al., 2020; Bao et al., 2020; Liu et al., 2021; Li and Shi, 2021; Huang et al., 2021), which follow the paradigm of non-autoregressive generation. Typically, these models generate corrected characters for all input characters in parallel, where the generated characters can be the same as the input characters.…”
Section: ID Text Correction (mentioning)
confidence: 99%
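The parallel decoding pattern this statement describes can be illustrated with an off-the-shelf masked language model; the rough sketch below assumes the transformers library and bert-base-chinese as a stand-in for a fine-tuned correction model, so it shows the decoding paradigm rather than any specific cited system.

```python
# Rough sketch of the parallel decoding paradigm: one forward pass produces a
# predicted character for every input position, and most predictions simply
# reproduce the input. A vanilla masked LM stands in for a fine-tuned
# correction model here, so the output is illustrative only.
import torch
from transformers import BertForMaskedLM, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
model = BertForMaskedLM.from_pretrained("bert-base-chinese")

inputs = tokenizer("我今天很高心", return_tensors="pt")   # "心" is a typo for "兴"
with torch.no_grad():
    logits = model(**inputs).logits                        # (1, seq_len, vocab), one pass
pred_ids = logits.argmax(dim=-1)[0]                        # every position predicted in parallel

changed = [
    (i, tokenizer.convert_ids_to_tokens(int(src)), tokenizer.convert_ids_to_tokens(int(tgt)))
    for i, (src, tgt) in enumerate(zip(inputs["input_ids"][0], pred_ids))
    if int(src) != int(tgt)
]
print(changed)   # positions where the prediction differs from the input character
```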
“…The inference efficiency is not only required for neural machine translation but is also indispensable for many other text generation tasks [81], [100], [101]. Existing works that introduce NAR techniques into text generation tasks focus on automatic speech recognition [102], [103], [104], text summarization [105], grammatical error correction [106], [107], and dialogue [108], [109]. Resembling the encountered challenge of NAT models in Section 2.…”
Section: Text Generation (mentioning)
confidence: 99%
“…Thus, NAR methods are more feasible for this task. Li et al. [107] focus on the variable-length correction scenario for Chinese GEC. They employ BERT to initialize the encoder and add a CRF layer on the initialized encoder, augmented by a focal loss penalty strategy to capture the target-side dependency.…”
Section: Text Generation (mentioning)
confidence: 99%
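The "focal loss penalty strategy" referred to here appears to be the standard focal loss applied to the per-position predictions; the sketch below assumes that reading, with an illustrative gamma value, and is not taken from the paper's code.

```python
# Sketch of a focal-loss penalty on per-position predictions: it down-weights
# the many easy (already-correct) tokens so training focuses on the hard,
# erroneous positions. gamma is an illustrative choice, not from the paper.
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, ignore_index=-100):
    # logits: (batch, seq_len, vocab); targets: (batch, seq_len) of token ids
    log_probs = F.log_softmax(logits, dim=-1)
    safe_targets = targets.clamp(min=0)                            # keep gather valid for ignored positions
    tgt_log_p = log_probs.gather(-1, safe_targets.unsqueeze(-1)).squeeze(-1)
    p_t = tgt_log_p.exp()
    loss = -((1.0 - p_t) ** gamma) * tgt_log_p                     # FL = -(1 - p)^gamma * log p
    mask = (targets != ignore_index).float()                       # drop padded / ignored positions
    return (loss * mask).sum() / mask.sum().clamp(min=1.0)
```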
“…Sequence labeling methods are widely used for CGED, such as feature-based statistical models (Chang et al., 2012) and neural models (Fu et al., 2018). Due to the effectiveness of BERT (Devlin et al., 2019) in many other NLP applications, recent studies adopt BERT as the basic architecture of CGED models (Fang et al., 2020; Wang et al., 2020b; Li and Shi, 2021). Wang et al. (2020b) propose a model that combines ResNet and BERT to achieve state-of-the-art results on the CGED-2020 task.…”
Section: Related Work (mentioning)
confidence: 99%
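The sequence-labeling setup these CGED systems share can be sketched as token classification on top of BERT. The label set below follows the commonly used CGED error categories (redundant, missing, selection, word-order) but is an assumption here, and the classification head is untrained, so this is a sketch rather than any cited system.

```python
# Sketch: grammatical error *detection* as token-level sequence labeling with a
# BERT backbone. The tag set mirrors the usual CGED scheme but is assumed here.
from transformers import BertForTokenClassification, BertTokenizerFast

LABELS = ["O", "R", "M", "S", "W"]   # O = correct; R/M/S/W = assumed CGED error types
tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
model = BertForTokenClassification.from_pretrained(
    "bert-base-chinese",
    num_labels=len(LABELS),
    id2label=dict(enumerate(LABELS)),
)

inputs = tokenizer("他是很喜欢打篮球。", return_tensors="pt")
pred = model(**inputs).logits.argmax(dim=-1)[0]      # one error tag per character
print([LABELS[i] for i in pred])                     # untrained head, illustrative output only
```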
“…Wang et al. (2020b) propose a model that combines ResNet and BERT to achieve state-of-the-art results on the CGED-2020 task. Li and Shi (2021) apply a CRF layer on BERT to introduce the dependency of tokens. However, neural models usually require a large amount of training data, and manually annotating a large corpus is expensive and time-consuming.…”
Section: Related Work (mentioning)
confidence: 99%
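Placing a CRF layer on top of BERT emissions, as Li and Shi (2021) are described doing, generically looks like the sketch below; it assumes the third-party pytorch-crf package and illustrative tag and layer sizes rather than the cited authors' code.

```python
# Sketch: CRF layer on top of BERT token emissions so that neighboring label
# decisions depend on each other. Assumes the third-party `pytorch-crf` package.
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF

class BertCrfTagger(nn.Module):
    def __init__(self, num_tags, pretrained="bert-base-chinese"):
        super().__init__()
        self.bert = BertModel.from_pretrained(pretrained)
        self.emit = nn.Linear(self.bert.config.hidden_size, num_tags)   # per-token emission scores
        self.crf = CRF(num_tags, batch_first=True)                      # models tag-to-tag dependencies

    def forward(self, input_ids, attention_mask, tags=None):
        states = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        emissions = self.emit(states)
        mask = attention_mask.bool()
        if tags is not None:
            return -self.crf(emissions, tags, mask=mask)    # negative log-likelihood for training
        return self.crf.decode(emissions, mask=mask)        # Viterbi-decoded tag sequences
```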