Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.26
Incremental Processing in the Age of Non-Incremental Encoders: An Empirical Assessment of Bidirectional Models for Incremental NLU

Abstract: While humans process language incrementally, the best language encoders currently used in NLP do not. Both bidirectional LSTMs and Transformers assume that the sequence that is to be encoded is available in full, to be processed either forwards and backwards (BiLSTMs) or as a whole (Transformers). We investigate how they behave under incremental interfaces, when partial output must be provided based on partial input seen up to a certain time step, which may happen in interactive systems. We test five models on…

Cited by 10 publications (23 citation statements)
References 45 publications (44 reference statements)
“…Recent live incremental systems fall short of the accuracies achievable on pre-segmented transcripts, so there is a natural interest in taking the best non-incremental sequence models and adapting them for incrementality. Madureira and Schlangen (2020) take up this effort on several other sequence tagging and classification tasks, showing how bidirectional encoders and Transformers can be modified to work incrementally. To reduce the impact of the partiality of the input, the models predict future content and wait for more rightward context.…”
Section: Related Work
“…Prophecy-based decoding: For our other decoding strategies, we use a 'prophecy'-based approach to predicting future word sequences, following the task of open-ended language generation, which, given an input text passage as context, is to produce text that constitutes a cohesive continuation (Holtzman et al., 2019). Inspired by Madureira and Schlangen (2020), we use the GPT-2 language model (Radford et al., 2019) to extend each prefix into a continuation that runs until the end of the utterance, creating a hypothetical complete context that satisfies the requirements of the models' non-incremental structure.…”
Section: Modifying the Decoding Procedures
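The prophecy idea lends itself to a compact illustration. Below is a minimal sketch, assuming the Hugging Face transformers library and the publicly released "gpt2" checkpoint; the function name complete_prefix and the example utterance are illustrative and not taken from either paper.

```python
# Sketch of prophecy-based input completion: a language model extends the
# current prefix so that a non-incremental encoder can be run on a
# hypothetical full input. Assumes the Hugging Face transformers library.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

def complete_prefix(prefix_words, max_new_tokens=20):
    """Extend a partial utterance with a GPT-2 'prophecy'."""
    inputs = tokenizer(" ".join(prefix_words), return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,                      # greedy continuation; sampling also possible
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# At each time step the prefix grows by one word and a fresh prophecy is produced;
# the completed string is what gets passed to the bidirectional model.
utterance = ["the", "flight", "leaves", "from", "boston"]
for t in range(1, len(utterance) + 1):
    hypothetical_full_input = complete_prefix(utterance[:t])
```

Because the prophecy is regenerated for every new prefix, the hypothetical right context can change from step to step, which is one source of output instability in this setup.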
“…Zhang et al. (2021) introduced an average embedding layer to avoid recalculation when using an incremental encoder, while exploiting right context through knowledge distillation. An investigation of the use of non-incremental encoders for incremental NLU in interactive systems was conducted by Madureira and Schlangen (2020). The authors employed BERT (Devlin et al., 2019) for sequence tagging and classification using restart-incrementality, a procedure with high computational cost.…”
Section: Related Work
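Restart-incrementality can be summarized in a few lines: whenever a new token arrives, the whole prefix is re-encoded from scratch and all labels are recomputed, which is where the computational cost comes from. A minimal sketch, assuming the Hugging Face transformers library; "my-bert-tagger" stands in for a fine-tuned token-classification checkpoint and is purely illustrative.

```python
# Sketch of restart-incremental sequence tagging with a BERT-style model.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("my-bert-tagger")  # hypothetical checkpoint
model = AutoModelForTokenClassification.from_pretrained("my-bert-tagger")
model.eval()

def tag_prefix(words):
    """Re-encode the whole prefix from scratch and return one label per word."""
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits[0]
    pred_ids = logits.argmax(-1).tolist()
    labels, word_ids = [], enc.word_ids()
    for i, wid in enumerate(word_ids):
        # keep only the prediction for the first subword of each word
        if wid is not None and (i == 0 or word_ids[i - 1] != wid):
            labels.append(model.config.id2label[pred_ids[i]])
    return labels

utterance = ["book", "a", "flight", "to", "boston"]
for t in range(1, len(utterance) + 1):
    # every new word triggers a full re-computation; earlier labels may be revised
    print(tag_prefix(utterance[:t]))
```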
“…LT+R+CM+D: similar to (4), but, during training, the output for the input token x_t is obtained at time t + d, where d ∈ {1, 2} is the delay, following the approach in Turek et al. (2020). There is evidence that additional right context improves the models' incremental performance (Baumann et al., 2011; Ma et al., 2019; Madureira and Schlangen, 2020), which results in a trade-off between providing timely output and waiting for more context to deliver more stable output.…”
Section: Models
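The delay can be made concrete with a short sketch: at time step t, only labels for tokens that already have d tokens of right context are committed. This is a hedged illustration built on the hypothetical tag_prefix function from the sketch above, not the cited implementation.

```python
# Sketch of delayed output: the label for the token seen at time t is only
# committed at time t + d, trading timeliness for stability.
def incremental_tags_with_delay(words, d=1):
    committed = []
    for t in range(1, len(words) + 1):
        labels = tag_prefix(words[:t])          # labels for the current prefix
        # commit only labels whose token already has d tokens of right context
        while len(committed) < t - d:
            committed.append(labels[len(committed)])
    # flush the remaining labels once the utterance is complete
    final = tag_prefix(words)
    committed.extend(final[len(committed):])
    return committed
```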