There has been significant prior work on non-autoregressive iterative methods for machine translation (Gu et al., 2018), including iterative refinement (Lee et al., 2018), insertion-based methods (Chan et al., 2019a; Li and Chan, 2019), and conditional masked language models (Ghazvininejad et al., 2019, 2020b). Like insertion-based models (Chan et al., 2019c), our work does not commit to a fixed target length; however, insertion-based models can dynamically grow the canvas size, whereas our work, which relies on a latent alignment, can only generate a target sequence up to a fixed, predetermined maximum length. Compared to conditional masked language models (Ghazvininejad et al., 2019, 2020b), the key differences are: 1) our models do not require target length prediction, and 2) we eschew the encoder-decoder formulation in favor of a single, simple decoder architecture.