Findings of the Association for Computational Linguistics: EMNLP 2021, 2021
DOI: 10.18653/v1/2021.findings-emnlp.419

Grammatical Error Correction with Contrastive Learning in Low Error Density Domains

Abstract: Although grammatical error correction (GEC) has achieved good performance on texts written by learners of English as a second language, performance on low error density domains where texts are written by English speakers of varying levels of proficiency can still be improved. In this paper, we propose a contrastive learning approach to encourage the GEC model to assign a higher probability to a correct sentence while reducing the probability of incorrect sentences that the model tends to generate, so as to imp…
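The abstract is truncated and does not state the exact training objective, so the following is only a minimal sketch of one common way to express the idea it describes: raise the probability of the correct sentence relative to incorrect candidates. It assumes a softmax-style contrastive loss over sequence log-probabilities. The function name `contrastive_loss`, the `temperature` parameter, and the example scores are hypothetical and are not taken from the paper.

```python
import math


def contrastive_loss(pos_logprob, neg_logprobs, temperature=1.0):
    """Softmax-style contrastive loss over candidate sentences (illustrative only).

    pos_logprob:  model log-probability of the correct sentence.
    neg_logprobs: model log-probabilities of incorrect candidate sentences.
    temperature:  hypothetical scaling factor, not taken from the paper.

    Returns -log( exp(s_pos) / sum_i exp(s_i) ), where the sum runs over the
    correct sentence and all incorrect candidates.
    """
    scores = [pos_logprob / temperature]
    scores += [s / temperature for s in neg_logprobs]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[0]


# Made-up log-probabilities: one correct sentence and two incorrect candidates.
loss_before = contrastive_loss(-2.0, [-1.5, -3.0])
# Raising the correct sentence's log-probability lowers the loss.
loss_after = contrastive_loss(-1.0, [-1.5, -3.0])
print(loss_before > loss_after)
```

Under this formulation the loss falls both when the correct sentence becomes more probable and when the incorrect candidates become less probable, which matches the behaviour the abstract describes. How the incorrect candidates are obtained is left outside the sketch.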

Cited by 9 publications (6 citation statements)
References 16 publications
“…Our structure realizes the best F1 scores at the detection level and identification level by a balanced precision and recall among all teams participating in the CGED 2020 task. At the detection level, we improved the F1 value by 0.47% over the state-of-the-art [27][28], and this is because we added syntactic information of the sentences, which is much richer than the POS Score and PMI Score used by the state-of-the-art method. At the identification level, we improved the F1 value by 1.23% over the state-of-the-art [23], and we think this is because the state-of-the-art method only adds ResNet on top of BERT, but we not only add rich information: syntactic information, contextual embeddings and lexical information, but also add CRF layer to improve the performance, so we can get better F1 value.…”
Section: Testing Results (mentioning)
confidence: 99%
“…We evaluate the performance of our Chinese GEC system on the NLPCC-2018 test set with the MaxMatch scorer. Following Cao et al (2021), we use the one-tailed sign test with bootstrap resampling to carry out statistical significance tests.…”
Section: Data and Model Configuration (mentioning)
confidence: 99%
“…They propose additional training stages that make the model consider edit type interdependence when predicting the corrections. Cao, Yang, and Ng (2021) aim to enhance model performance in low-error density domains. The augmented sentences are generated by beam search to capture wrong corrections that the model tends to make.…”
Section: Augmenting Official Datasets (mentioning)
confidence: 99%
“…Other systems include Katsumata and Komachi (2020) and Rothe et al (2021), who respectively explored the effectiveness of using pre-trained BART (Lewis et al 2020) and T5 (Raffel et al 2020) as the base model for GEC; Cao, Yang, and Ng (2021) subsequently extended Katsumata and Komachi (2020) using contrastive learning (Section 5.2). Chen et al (2020a) and meanwhile both combined detection with error correction by respectively constraining the output of a GEC system based on a separate GED system and jointly training GED as an auxiliary task (Section 4.3).…”
Section: Table (mentioning)
confidence: 99%