2021
DOI: 10.48550/arxiv.2111.12933
Preprint
ML-Decoder: Scalable and Versatile Classification Head

Abstract: In this paper, we introduce ML-Decoder, a new attention-based classification head. ML-Decoder predicts the existence of class labels via queries, and enables better utilization of spatial data compared to global average pooling. By redesigning the decoder architecture, and using a novel group-decoding scheme, ML-Decoder is highly efficient, and can scale well to thousands of classes. Compared to using a larger backbone, ML-Decoder consistently provides a better speed-accuracy trade-off. ML-Decoder is also versa…
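The abstract describes a query-based head: learnable per-class queries cross-attend to the backbone's spatial features instead of collapsing them with global average pooling. A minimal sketch of that idea, assuming standard PyTorch cross-attention (class `QueryDecoderHead`, and the dimensions shown, are illustrative and not the paper's exact architecture or group-decoding scheme):

```python
import torch
import torch.nn as nn

class QueryDecoderHead(nn.Module):
    """Hypothetical sketch of an attention-based classification head:
    learnable class queries cross-attend to spatial backbone features,
    and each attended query is projected to a single class logit."""
    def __init__(self, num_classes: int, dim: int = 512, num_heads: int = 8):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_classes, dim))
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.proj = nn.Linear(dim, 1)  # one logit per class query

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, HW, dim) — flattened spatial feature map from a backbone
        q = self.queries.unsqueeze(0).expand(feats.size(0), -1, -1)
        attended, _ = self.cross_attn(q, feats, feats)  # (B, num_classes, dim)
        return self.proj(attended).squeeze(-1)          # (B, num_classes)

head = QueryDecoderHead(num_classes=80, dim=512)
x = torch.randn(2, 49, 512)  # e.g. a 7x7 feature map, flattened
logits = head(x)
print(logits.shape)  # torch.Size([2, 80])
```

Because the queries attend over all spatial positions, each class can pull evidence from a different region of the image, which is what the abstract contrasts with global average pooling.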

Cited by 4 publications (7 citation statements)
References 25 publications
“…When more public ChIP-seq data is available in future, the pre-training model's decoder is able to characterize the dependence among additional TFs. From the technical perspective, we are able to redesign the decoder structure [50] to efficiently scale to thousands of labels (a.k.a. TF binding profiles) without worrying the computational burden.…”
Section: Discussion
confidence: 99%
“…When more public ChIP-seq data is available in future, the pre-training model's decoder is able to characterize the dependence among additional TFs. From the technical perspective, we are able to redesign the decoder structure [49] to efficiently scale to thousands of labels (a.k.a. TF binding profiles) without much computational burden.…”
Section: Discussion
confidence: 99%
“…Having a model trained to predict labels that include nouns, attributes and verbs should provide a set of rich keywords. For this reason, the ML-Decoder [35] model which follows a Transformer-based encoder-decoder pipeline was used. A pre-trained ML-Decoder was adopted to detect the 80 COCO object labels (ML-objs) to provide comparison with the Faster R-CNN object detector results.…”
Section: B Detected Abstract
confidence: 99%