Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1115

Self-Attentional Models for Lattice Inputs

Abstract: Lattices are an efficient and effective method to encode ambiguity of upstream systems in natural language processing tasks, for example to compactly capture multiple speech recognition hypotheses, or to represent multiple linguistic analyses. Previous work has extended recurrent neural networks to model lattice inputs and achieved improvements in various tasks, but these models suffer from very slow computation speeds. This paper extends the recently proposed paradigm of self-attention to handle lattice input…
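For orientation, the following is a minimal sketch of what self-attention over a lattice can look like: node embeddings attend to one another, with attention logits masked by a lattice reachability matrix. The mask-based formulation, tensor shapes, and module name are illustrative assumptions and may differ from the paper's exact model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatticeSelfAttention(nn.Module):
    """Single-head self-attention whose attention matrix is masked by
    lattice reachability, so each node only attends to nodes it can
    co-occur with on some path through the lattice (a sketch, not the
    paper's exact formulation)."""
    def __init__(self, d_model: int, d_k: int = 64):
        super().__init__()
        self.q = nn.Linear(d_model, d_k, bias=False)
        self.k = nn.Linear(d_model, d_k, bias=False)
        self.v = nn.Linear(d_model, d_k, bias=False)
        self.scale = d_k ** -0.5

    def forward(self, x: torch.Tensor, reach: torch.Tensor) -> torch.Tensor:
        # x: (n_nodes, d_model) node embeddings
        # reach: (n_nodes, n_nodes) boolean mask, True where attention is allowed
        # (assumes each node can at least attend to itself)
        scores = self.q(x) @ self.k(x).T * self.scale       # raw attention logits
        scores = scores.masked_fill(~reach, float("-inf"))  # forbid lattice-incompatible pairs
        return F.softmax(scores, dim=-1) @ self.v(x)        # contextualized node states
```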

Cited by 37 publications (29 citation statements)
References 27 publications
“…While the lattice RNNs perform similar to the GNN-based FTM, we found that they are often inconvenient due to increased training time as compared to the GNNs (∼8min/epoch for RNN training v/s ∼1.5min/epoch for GNN training in our experiments). As observed in [13], unique connections in lattices inhibit efficient batching of training examples during RNN training. On the other hand, in GNNs, recurrent computations are replaced by graph convolution operations and multiple lattices of different sizes and structures can be efficiently batched together using zero padding resulting in substantial training speed-up.…”
Section: Results and Analysis (citation type: mentioning; confidence: 99%)
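The batching advantage described in the statement above can be made concrete with a small sketch: lattices of different sizes are zero-padded to a common node count and processed with a single batched graph-convolution step instead of per-lattice recurrences. The shapes and the simple A·X·W propagation rule are illustrative assumptions, not the cited models.

```python
import torch

def pad_and_batch(adjs, feats, max_nodes):
    """adjs: list of (n_i, n_i) adjacency tensors; feats: list of (n_i, d) node features.
    Zero-pads every lattice to max_nodes so they share one batch."""
    b, d = len(adjs), feats[0].shape[1]
    A = torch.zeros(b, max_nodes, max_nodes)
    X = torch.zeros(b, max_nodes, d)
    for i, (a, x) in enumerate(zip(adjs, feats)):
        n = a.shape[0]
        A[i, :n, :n] = a
        X[i, :n, :] = x
    return A, X

def graph_conv_step(A, X, W):
    # One GCN-style propagation over the whole batch at once. Because the
    # padded rows and columns of A are zero, padded positions neither send
    # nor receive messages, so real nodes are unaffected by the padding.
    return torch.relu(torch.bmm(A, X) @ W)   # (b, max_nodes, d_out)
```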
“…Self-attention graph neural networks (SAGNN) [14,13,11,12] model the relationship between the nodes of a graph using the self-attention mechanism instead of the predefined edges in the graph. Instead of using a fixed adjacency matrix as in GCNs, this approach uses the inner-product of the feature vectors of the lattice arcs to compute their relevance to each other.…”
Section: Self-attention Based Graph Neural Network (citation type: mentioning; confidence: 99%)
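A compact sketch of the idea described above, in which a learned inner-product between arc feature vectors stands in for a fixed adjacency matrix; the single-head formulation and function names are illustrative simplifications of the cited SAGNN variants.

```python
import torch
import torch.nn.functional as F

def attention_adjacency(arc_feats, w_q, w_k):
    """arc_feats: (n_arcs, d); w_q, w_k: (d, d_k) projections.
    Returns a dense, data-dependent 'adjacency' from scaled inner-products."""
    logits = (arc_feats @ w_q) @ (arc_feats @ w_k).T / (w_q.shape[1] ** 0.5)
    return F.softmax(logits, dim=-1)               # (n_arcs, n_arcs), rows sum to 1

def sagnn_layer(arc_feats, w_q, w_k, w_v):
    A = attention_adjacency(arc_feats, w_q, w_k)   # replaces the fixed graph edges
    return torch.relu(A @ (arc_feats @ w_v))       # aggregate neighbors and transform
```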
“…Mihaylov and Frank (2019) proposed a discourse-aware self-attention encoder for reading comprehension on narrative texts, where event chains, discourse relations and coreference relations are used for connecting sentences. Self-attention can be also extended to 2d-dimensions for image processing (Parmar et al., 2018) and lattice inputs (Sperber et al., 2019).…”
Section: Self-attention Mechanism (citation type: mentioning; confidence: 99%)
“…In the cascade approach, an ASR system transcribes the input speech signal, and this is fed to a downstream MT system that carries out the translation. The provided input to the MT step can be the 1-best hypothesis, but also n-best lists (Ng et al., 2016) or even lattices (Matusov and Ney, 2011; Sperber et al., 2019). Additional techniques can also be used to improve the performance of the pipeline by better adapting the MT system to the expected input, such as training with transcribed text (Peitz et al., 2012) or chunking (Sperber et al., 2017).…”
Section: Introduction (citation type: mentioning; confidence: 99%)
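As a rough illustration of the cascade couplings mentioned in the statement above, the sketch below passes either a 1-best hypothesis, an n-best list, or a lattice from a hypothetical ASR component to a hypothetical MT component; all method names (best_hypothesis, nbest, lattice, translate, translate_lattice, score) are placeholders, not a real toolkit API.

```python
def cascade_translate(asr, mt, audio, mode="1best"):
    """Run a speech-translation cascade under three coupling schemes."""
    if mode == "1best":
        # Tightest coupling: only the single best transcript reaches MT.
        return mt.translate(asr.best_hypothesis(audio))
    if mode == "nbest":
        # Translate several hypotheses and keep the highest-scoring translation.
        candidates = [mt.translate(h) for h in asr.nbest(audio, n=5)]
        return max(candidates, key=lambda c: c.score)
    if mode == "lattice":
        # A lattice-aware MT model (e.g. with lattice self-attention) consumes
        # the full recognition lattice, preserving upstream ambiguity.
        return mt.translate_lattice(asr.lattice(audio))
    raise ValueError(f"unknown mode: {mode}")
```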