Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.191
RankNAS: Efficient Neural Architecture Search by Pairwise Ranking

Abstract: This paper addresses the efficiency challenge of Neural Architecture Search (NAS) by formulating the task as a ranking problem. Previous methods require numerous training examples to estimate the accurate performance of architectures, although the actual goal is to find the distinction between "good" and "bad" candidates. Here we do not resort to performance predictors. Instead, we propose a performance ranking method (RankNAS) via pairwise ranking. It enables efficient architecture search using much fewer tra…
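As a rough illustration of the pairwise-ranking idea in the abstract, the sketch below trains a scorer on pairs of candidate architectures so that the better architecture of each pair receives the higher score. The feature encoding, the linear scorer, and the logistic pairwise loss are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of pairwise architecture ranking, assuming (hypothetically)
# that each candidate architecture is encoded as a fixed-length feature vector
# and that a binary label says which architecture of a pair performs better.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PairwiseRanker(nn.Module):
    def __init__(self, feat_dim: int):
        super().__init__()
        # Simple linear scorer over architecture features (hypothetical choice).
        self.scorer = nn.Linear(feat_dim, 1)

    def forward(self, feats_a: torch.Tensor, feats_b: torch.Tensor) -> torch.Tensor:
        # Score difference s(a) - s(b) for each pair in the batch.
        return self.scorer(feats_a).squeeze(-1) - self.scorer(feats_b).squeeze(-1)

def pairwise_ranking_loss(score_diff: torch.Tensor, a_is_better: torch.Tensor) -> torch.Tensor:
    # Logistic pairwise loss: push s(a) > s(b) when "a" is the better
    # architecture, and s(b) > s(a) otherwise.
    sign = a_is_better.float() * 2.0 - 1.0
    return F.softplus(-sign * score_diff).mean()

# Toy usage: 8 pairs of architectures, each described by 16 features.
ranker = PairwiseRanker(feat_dim=16)
feats_a, feats_b = torch.randn(8, 16), torch.randn(8, 16)
labels = torch.rand(8) > 0.5  # True where architecture "a" is the better one
loss = pairwise_ranking_loss(ranker(feats_a, feats_b), labels)
loss.backward()
```

Because the ranker only needs to order candidates rather than predict their exact accuracy, it can be trained from far fewer evaluated architectures than a full performance predictor, which is the efficiency argument the abstract makes.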

Cited by 6 publications (18 citation statements) | References: 27 publications
“…To automate the network design, we take advantage of neural architecture search (NAS) [11,23]. It has been widely employed in searching standard Transformer architectures in Natural Language Processing [14,16,29,36,38] and Computer Vision [5,6,10,13]. These studies mainly focus on refining search space and/or improving search algorithms.…”
Section: Softmax Attention or Linear Attention
Mentioning confidence: 99%
“…However, these methods suffer from long training and large search costs because all the candidates need to be optimized, evaluated, and ranked. For the purpose of lowering these costs, we utilize RankNAS [16], a new efficient NAS framework for searching the standard Transformer [35]. It can significantly speed up the search procedure through pairwise ranking, search space pruning, and the hardware-aware constraint.…”
Section: Softmax Attention or Linear Attention
Mentioning confidence: 99%
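As a rough sketch of what the hardware-aware constraint mentioned in this excerpt can look like, the snippet below filters candidate architectures against a latency budget before any ranking is done. The helper names (measure_latency_ms, budget_ms) and the toy latency model are assumptions for illustration, not part of the RankNAS release.

```python
# Minimal sketch of a hardware-aware constraint: only candidates that fit a
# latency budget on the target device are kept for ranking.
from typing import Callable, Dict, List

def filter_by_latency(
    candidates: List[Dict],
    measure_latency_ms: Callable[[Dict], float],  # hypothetical latency probe
    budget_ms: float,
) -> List[Dict]:
    # Keep only architectures whose measured latency fits the budget, so the
    # ranker never compares candidates that are too slow to deploy.
    return [arch for arch in candidates if measure_latency_ms(arch) <= budget_ms]

# Toy usage with a fake latency model in which latency grows with layer count.
candidates = [{"layers": n, "heads": 8} for n in range(1, 13)]
fake_latency = lambda arch: 5.0 * arch["layers"]
fast_enough = filter_by_latency(candidates, fake_latency, budget_ms=30.0)
print(len(fast_enough), "of", len(candidates), "candidates fit the budget")
```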
“…Hence, we accelerate the decoding by reducing the number of decoder layers and removing the multi-head mechanism. Inspired by Hu et al. (2021), we design the lightweight Transformer student model with one decoder layer. We further remove the multi-head mechanism in the decoder's attention modules.…”
Section: Lightweight Transformer Student Models
Mentioning confidence: 99%
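For a concrete picture of the lightweight student decoder described in this excerpt, the sketch below builds a one-layer decoder with single-head attention from stock PyTorch modules. The model dimensions are illustrative assumptions, and this is a stand-in, not the authors' actual student model.

```python
# Minimal sketch of a lightweight student decoder: one decoder layer with
# single-head (nhead=1) attention instead of multi-head attention.
import torch
import torch.nn as nn

d_model = 512  # illustrative hidden size
decoder_layer = nn.TransformerDecoderLayer(
    d_model=d_model,
    nhead=1,                 # single-head attention in place of multi-head
    dim_feedforward=2048,
    batch_first=True,
)
student_decoder = nn.TransformerDecoder(decoder_layer, num_layers=1)  # one decoder layer

# Toy forward pass: batch of 2, target length 7, source (memory) length 11.
tgt = torch.randn(2, 7, d_model)
memory = torch.randn(2, 11, d_model)
out = student_decoder(tgt, memory)
print(out.shape)  # torch.Size([2, 7, 512])
```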
“…Large and deep Transformer models have dominated machine translation (MT) tasks in recent years (Vaswani et al., 2017; Edunov et al., 2018; Raffel et al., 2020). Despite their high accuracy, these models are inefficient and difficult to deploy (Wang et al., 2020a; Hu et al., 2021). Many efforts have been made to improve the translation efficiency, including efficient architectures (Li et al., 2021a,b), quantization (Bhandare et al., 2019), and knowledge distillation (Lin et al., 2021a).…”
Section: Introduction
Mentioning confidence: 99%