Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.420
Blank Language Models

Abstract: We propose Blank Language Model (BLM), a model that generates sequences by dynamically creating and filling in blanks. The blanks control which part of the sequence to expand, making BLM ideal for a variety of text editing and rewriting tasks. The model can start from a single blank or partially completed text with blanks at specified locations. It iteratively determines which word to place in a blank and whether to insert new blanks, and stops generating when no blanks are left to fill. BLM can be efficiently…
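To make the generation procedure concrete, the following is a minimal Python sketch of the blank-filling loop described in the abstract. It is a conceptual illustration under stated assumptions, not the authors' implementation: the `propose` function, the `BLANK` placeholder token, and the toy example are stand-ins, and the actual BLM learns which blank to fill, which word to place, and whether to create new blanks rather than choosing a blank at random.

```python
# Minimal sketch of BLM-style generation as described in the abstract
# (a conceptual illustration, not the authors' implementation).
import random
from typing import Callable, List, Tuple

BLANK = "__"  # placeholder standing in for the model's blank symbol

# A proposal function stands in for the trained model: given the current
# canvas and the index of a blank, it returns the word to place there and
# whether to open a new blank to its left and/or right.
Proposal = Callable[[List[str], int], Tuple[str, bool, bool]]

def fill_blanks(canvas: List[str], propose: Proposal, max_steps: int = 50) -> List[str]:
    """Iteratively rewrite `canvas` until no blanks are left to fill."""
    for _ in range(max_steps):
        blanks = [i for i, tok in enumerate(canvas) if tok == BLANK]
        if not blanks:                 # stopping criterion from the abstract
            break
        i = random.choice(blanks)      # the real model also scores which blank to expand
        word, open_left, open_right = propose(canvas, i)
        replacement = ([BLANK] if open_left else []) + [word] + ([BLANK] if open_right else [])
        canvas = canvas[:i] + replacement + canvas[i + 1:]
    return canvas

if __name__ == "__main__":
    # Toy proposal: fill every blank with a fixed word and never open new blanks.
    start = ["they", BLANK, "to", BLANK, "tonight"]
    print(fill_blanks(start, lambda c, i: ("word", False, False)))
```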

Cited by 48 publications (50 citation statements)
References 39 publications

Citation statements:
“…Moreover, our padded MLM determines the number of tokens to insert without having to pre-specify it. In that sense, it is similar to the recently proposed Blank Language Model (Shen et al., 2020).…”
Section: Related Work (supporting)
Confidence: 64%
“…model, which yields an accuracy of 98.4% on the development set (slightly higher than the CNN classifier used by Shen et al. (2020), which has an accuracy of 97.7%). The Exact scores reported in the paper were computed after lowercasing the predictions and the targets.…”
Section: B Hyperparameter Settings (mentioning)
Confidence: 86%
“…Rudinger et al. (2015) frame narrative cloze as a generation task and employ language models, but they only consider one infill of a fixed length. Zhu et al. (2019); Shen et al. (2020) infill multiple variable-length sequences, but these approaches require the masked context to be iteratively updated and reprocessed to fill in blanks one at a time. In contrast, our approach appends infilled text to the context and does not require reprocessing the entire input sequence for each blank.…”
Section: Related Work (mentioning)
Confidence: 99%
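The contrast drawn in the statement above can be made concrete with a toy example. This is a hedged sketch: the sentence and the special tokens ([blank], [sep], [answer]) are assumptions for exposition, not the exact formats used by the cited works.

```python
# Toy contrast between the two infilling strategies mentioned in the quote.
# The sentence and special tokens ([blank], [sep], [answer]) are illustrative
# assumptions, not the exact vocabularies of the cited papers.

masked = ["She", "ate", "[blank]", "for", "[blank]"]

# Canvas-rewriting view: each fill yields a new canvas, and the model must
# reprocess the updated canvas before filling the next blank.
canvas_after_first_fill = ["She", "ate", "cereal", "for", "[blank]"]
canvas_after_second_fill = ["She", "ate", "cereal", "for", "breakfast"]

# Append view (as described by the citing statement): the masked context is
# processed once and the infills are emitted left-to-right after a separator,
# so the input sequence is never reprocessed for each blank.
appended = masked + ["[sep]", "cereal", "[answer]", "breakfast", "[answer]"]

print(canvas_after_second_fill)
print(appended)
```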
“…Previous approaches have proposed alternatives to autoregressive decoding (Gu et al., 2018; Lee et al., 2018; Wang and Cho, 2019). Instead of left-to-right autoregressive decoding, Insertion Transformer and BLM (Shen et al., 2020) generate the output sequence through insertion operations, whereas LEVT (Gu et al., 2019) additionally incorporates a deletion operation. These methods produce the output iteratively, while FELIX requires only two steps: tagging and insertion.…”
Section: Related Work (mentioning)
Confidence: 99%