Proceedings of the 55th Annual Meeting of the Association For Computational Linguistics (Volume 2: Short Papers) 2017
DOI: 10.18653/v1/p17-2059
A Deep Network with Visual Text Composition Behavior

Abstract: While natural languages are compositional, how state-of-the-art neural models achieve compositionality is still unclear. We propose a deep network, which not only achieves competitive accuracy for text classification, but also exhibits compositional behavior. That is, while creating hierarchical representations of a piece of text, such as a sentence, the lower layers of the network distribute their layer-specific attention weights to individual words. In contrast, the higher layers compose meaningful phrases a…

Cited by 3 publications (5 citation statements)
References 16 publications (20 reference statements)
“…As discussed in §1, our method is inspired by the approach of Schwartz et al. (2020) and Xin et al. (2020a), where they preempt computation if the softmax value of any early classifier is above a predefined threshold. Unlike our approach, however, their model is not guaranteed to be accurate, even after softmax calibration (Guo, 2017). Several approaches to early exiting also include fine-tuning stages to improve efficiency (Liu et al., 2020; Geng et al., 2021; …).…”
Section: Related Work
confidence: 99%
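The preempt-on-threshold rule this statement describes reduces to a short loop over per-layer classifier outputs. Below is a minimal sketch, assuming the layer-wise logits are already computed; the function names and the 0.9 default threshold are illustrative assumptions, not the cited authors' code.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax for a single logit vector."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def preempt_predict(per_layer_logits, threshold=0.9):
    """Exit at the first early classifier whose maximum softmax value
    clears `threshold`; otherwise fall back to the final classifier.
    Returns (predicted_class, exit_layer). The threshold is an
    illustrative assumption, not a value from the cited papers."""
    for k, logits in enumerate(per_layer_logits):
        probs = softmax(logits)
        if probs.max() >= threshold:
            return int(probs.argmax()), k
    return int(probs.argmax()), len(per_layer_logits) - 1
```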
“…Following Schwartz et al. (2020), we exit on the first layer where $p_k^{\max} \geq 1 - \epsilon$, where $p_k^{\max}$ denotes the maximum softmax response of our early classifier. Softmax values are calibrated using temperature scaling (Guo, 2017) on another held-out data split, $D_{\text{scale}}$.…”
Section: Baselines
confidence: 99%
“…Following Schwartz et al. (2020), we exit on the first layer where $p_k^{\max} \geq 1 - \epsilon$, where $p_k^{\max}$ denotes the maximum softmax response of our early classifier. Softmax values are calibrated using temperature scaling (Guo, 2017) on another held-out (labeled) data split, $D_{\text{scale}}$.…”
Section: Baselines
confidence: 99%
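Concretely, the exit criterion quoted above compares each early classifier's temperature-scaled maximum softmax response against $1 - \epsilon$. A minimal sketch, assuming a temperature already fitted on a held-out split such as $D_{\text{scale}}$; the epsilon default is illustrative, as the quoted statements do not give the papers' values:

```python
import numpy as np

def calibrated_exit_layer(per_layer_logits, temperature, epsilon=0.1):
    """Return the first layer k whose temperature-scaled maximum softmax
    response p_k^max reaches 1 - epsilon, else the last layer. The
    temperature is assumed to be pre-fitted on a held-out split
    (D_scale in the quoted statements); epsilon=0.1 is illustrative."""
    for k, logits in enumerate(per_layer_logits):
        z = np.asarray(logits, dtype=float) / temperature
        z = z - z.max()  # numerical stability
        probs = np.exp(z) / np.exp(z).sum()
        if probs.max() >= 1 - epsilon:
            return k
    return len(per_layer_logits) - 1
```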
“…As (1) involves population quantities, we usually adopt empirical approximations (Guo, 2017) to estimate the calibration error. Specifically, we partition all data points into M bins of equal size according to their prediction confidences.…”
Section: Preliminaries
confidence: 99%
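The binning estimate this statement describes can be computed directly: sort predictions by confidence, split them into M equal-size bins, and average the gap between per-bin accuracy and per-bin confidence. A minimal sketch with equal-mass bins, as the statement specifies; the function name is illustrative:

```python
import numpy as np

def ece_equal_size_bins(confidences, correct, num_bins=10):
    """Empirical calibration error with M equal-size bins: sort examples
    by confidence, split into bins holding the same number of points,
    and average |accuracy - confidence| weighted by bin mass."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    order = np.argsort(confidences)
    n = len(confidences)
    error = 0.0
    for idx in np.array_split(order, num_bins):
        if len(idx) == 0:
            continue
        gap = abs(correct[idx].mean() - confidences[idx].mean())
        error += (len(idx) / n) * gap
    return error
```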
“…• Temperature Scaling (TS) (Guo, 2017) is a postprocessing calibration method that learns a single parameter to rescale the logits on the development set after the model is fine-tuned.…”
Section: Baselines
confidence: 99%
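Temperature scaling as described fits one scalar on development-set logits by minimizing negative log-likelihood. A minimal sketch, assuming NumPy/SciPy; the bounded search interval is an illustrative choice, not a detail from the quoted statement:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def fit_temperature(dev_logits, dev_labels):
    """Fit the single temperature parameter on development-set logits by
    minimizing negative log-likelihood, then return it. The search
    bounds are an illustrative assumption."""
    logits = np.asarray(dev_logits, dtype=float)
    labels = np.asarray(dev_labels, dtype=int)

    def nll(temperature):
        z = logits / temperature
        z = z - z.max(axis=1, keepdims=True)  # numerical stability
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(labels)), labels].mean()

    return minimize_scalar(nll, bounds=(0.05, 10.0), method="bounded").x
```

At inference time, logits are divided by the fitted temperature before the softmax; this rescales confidences but leaves the argmax prediction unchanged.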