2019
DOI: 10.48550/arxiv.1902.03249
Preprint

Insertion Transformer: Flexible Sequence Generation via Insertion Operations

Mitchell Stern, William Chan, Jamie Kiros, et al.
Cited by 18 publications (39 citation statements)
References 0 publications
“…Future work might consider selecting more than one element from N(x_t) to expand at each step, instead of a single i_t. This would amortize the encoding cost across multiple non-terminal expansions, similar to Welleck et al. [37] and Stern et al. [32].…”
Section: Practical Considerations (mentioning)
confidence: 98%
“…However, for a general-purpose programming language such a method is prohibitive. Recently, sequence generation approaches that go beyond the left-to-right paradigm have been proposed [37,32,11,10,17,30], usually by treating generation as an iterative refinement procedure that changes or extends a sequence in every iteration. These models often aim at speeding up inference or at letting the model find a better order for generating a full sentence (of terminal tokens).…”
Section: Related Work (mentioning)
confidence: 99%
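For readers unfamiliar with the insertion-based decoding these statements refer to, below is a minimal illustrative sketch (not taken from the cited paper) of a greedy parallel insertion decoding loop. The `predict_insertions` callable and the `EOS` slot marker are hypothetical stand-ins for a trained model's per-slot predictions.

```python
# Illustrative sketch of parallel insertion decoding: the output is built by
# repeatedly inserting tokens into slots of a partial hypothesis rather than
# appending tokens left to right. `predict_insertions` is a hypothetical
# stand-in for a trained model that, given a partial sequence of length n,
# returns one (token, score) prediction for each of its n + 1 slots.

from typing import Callable, List, Tuple

EOS = "<eos>"  # hypothetical end-of-slot marker meaning "insert nothing here"

def insertion_decode(
    predict_insertions: Callable[[List[str]], List[Tuple[str, float]]],
    max_iters: int = 50,
) -> List[str]:
    """Greedy parallel insertion decoding: at every iteration, insert the
    predicted token into each slot whose prediction is not EOS."""
    hypothesis: List[str] = []
    for _ in range(max_iters):
        slot_predictions = predict_insertions(hypothesis)
        new_hypothesis: List[str] = []
        inserted = False
        # Iterate over the n tokens plus a trailing sentinel for the final slot.
        for i, token in enumerate(hypothesis + [None]):
            pred_token, _score = slot_predictions[i]
            if pred_token != EOS:
                new_hypothesis.append(pred_token)  # fill slot i
                inserted = True
            if token is not None:
                new_hypothesis.append(token)  # keep the existing token
        hypothesis = new_hypothesis
        if not inserted:  # every slot predicted EOS: generation is finished
            break
    return hypothesis
```

With a model trained toward the balanced binary-tree ordering described in the paper, such a loop can reportedly emit a length-n sequence in roughly log2(n) parallel insertion steps rather than n left-to-right steps.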
“…There has been extensive interest in non-autoregressive/parallel generation approaches, which aim to produce a sequence in parallel in sub-linear time with respect to sequence length [13,52,26,65,53,14,11,12,48,15,28,16,49,55,30,41,64,62]. Existing approaches can be broadly classified as latent-variable based [13,26,65,28,41], refinement-based [25,48,14,15,11,30,12,62], or a combination of both [41].…”
Section: Related Work (mentioning)
confidence: 99%