Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.418

Seq2Edits: Sequence Transduction Using Span-level Edit Operations

Abstract: We propose Seq2Edits, an open-vocabulary approach to sequence editing for natural language processing (NLP) tasks with a high degree of overlap between input and output texts. In this approach, each sequence-to-sequence transduction is represented as a sequence of edit operations, where each operation either replaces an entire source span with target tokens or keeps it unchanged. We evaluate our method on five NLP tasks (text normalization, sentence fusion, sentence splitting & rephrasing, text simplification, …)
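As a rough illustration of the span-level edit representation described in the abstract, here is a minimal Python sketch. The Edit type, the apply_edits helper, and the exact operation layout are illustrative assumptions, not the paper's actual data format.

    # Illustrative sketch: each operation either keeps a source span
    # unchanged or replaces it with new target tokens.
    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class Edit:
        src_start: int                     # start of source span (inclusive)
        src_end: int                       # end of source span (exclusive)
        replacement: Optional[List[str]]   # None = keep span, else new tokens

    def apply_edits(source: List[str], edits: List[Edit]) -> List[str]:
        """Reconstruct the target sequence by applying edits left to right."""
        target: List[str] = []
        for e in edits:
            if e.replacement is None:
                target.extend(source[e.src_start:e.src_end])  # keep span
            else:
                target.extend(e.replacement)                  # replace span
        return target

    source = "she go to school yesterday".split()
    edits = [
        Edit(0, 1, None),        # keep "she"
        Edit(1, 2, ["went"]),    # replace "go" -> "went"
        Edit(2, 5, None),        # keep "to school yesterday"
    ]
    print(" ".join(apply_edits(source, edits)))
    # she went to school yesterday

Because most spans are kept unchanged on tasks with high input/output overlap, the edit sequence is typically much shorter than the full target sequence.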

Cited by 45 publications (46 citation statements) | References 41 publications
“…The only model that shows advantages over our 9+3 model is GECToR, which is built on powerful pretrained models (e.g., RoBERTa and XLNet (Yang et al., 2019)) with its multi-stage training strategy. Following GECToR's recipe, we leverage the pretrained model BART to initialize a 12+2 model, which has been shown to work well in NMT (Li et al., 2021) despite having more parameters, and apply the multi-stage fine-tuning strategy used in Stahlberg and Kumar (2020). The final single model with aggressive decoding achieves the state-of-the-art result of 66.4 F0.5 on the CoNLL-14 test set with a 9.6× speedup over the Transformer-big baseline.…”
Section: Results
confidence: 99%
“…The Transformer (Vaswani et al., 2017) has become the most popular model for Grammatical Error Correction (GEC). In practice, however, the sequence-to-sequence (seq2seq) approach has recently been criticized (Chen et al., 2020; Stahlberg and Kumar, 2020; Omelianchuk et al., 2020) for its poor inference efficiency in modern writing-assistance applications (e.g., Microsoft Office Word, Google Docs, and Grammarly), where a GEC model usually performs online inference, instead of batch inference, to proactively and incrementally check a user's latest completed sentence and offer instantaneous feedback.…”
Section: Introduction
confidence: 99%
“…summarization, grammatical error correction, sentence splitting, etc.) as a text editing task (Malmi et al., 2019; Panthaplackel et al., 2020; Stahlberg and Kumar, 2020) where target texts are reconstructed from inputs using several edit operations.…”
Section: Related Work and Discussion
confidence: 99%
“…Our work also relates to recent work on sentence-level transduction tasks, like grammatical error correction (GEC), which allows for directly predicting certain span-level edits (Stahlberg and Kumar, 2020). These edits are different from our insertion operations, requiring token-level operations except when copying from the source sentence, and are obtained, following a long line of work in GEC (Swanson and Yamangil, 2012; Xue and Hwa, 2014; Felice et al., 2016; Bryant et al., 2017), by heuristically merging token-level alignments obtained with a Damerau-Levenshtein-style algorithm (Brill and Moore, 2000).…”
Section: Related Work
confidence: 98%
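The citation statement above contrasts Seq2Edits-style span edits with span edits obtained by heuristically merging token-level alignments. Below is a minimal sketch of that general idea, with Python's difflib.SequenceMatcher standing in for a Damerau-Levenshtein-style aligner; the span_edits function and this particular merging rule are illustrative assumptions, not the cited papers' exact procedure.

    # Rough sketch: turn a token-level alignment into span-level edits by
    # grouping contiguous alignment opcodes into keep/replace spans.
    import difflib
    from typing import List, Tuple

    def span_edits(src: List[str], tgt: List[str]) -> List[Tuple[int, int, List[str]]]:
        """Return (src_start, src_end, replacement_tokens) spans.
        'equal' blocks become keep operations carrying the source tokens."""
        ops = []
        sm = difflib.SequenceMatcher(a=src, b=tgt, autojunk=False)
        for tag, i1, i2, j1, j2 in sm.get_opcodes():
            if tag == "equal":
                ops.append((i1, i2, src[i1:i2]))   # keep source span
            else:
                ops.append((i1, i2, tgt[j1:j2]))   # replace / insert / delete
        return ops

    src = "she go to school yesterday".split()
    tgt = "she went to school yesterday".split()
    for start, end, repl in span_edits(src, tgt):
        print(start, end, repl)
    # 0 1 ['she']
    # 1 2 ['went']
    # 2 5 ['to', 'school', 'yesterday']

A learned model like Seq2Edits predicts such spans directly, whereas the alignment-based route derives them after the fact from a source/target pair.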