2020
DOI: 10.48550/arxiv.2006.05624
Preprint

Adjoined Networks: A Training Paradigm with Applications to Network Compression

Cited by 1 publication (1 citation statement)
References 17 publications
“…It would take them more than 3000 GPU days to achieve state-of-the-art performance on the ImageNet dataset. Most recent studies [2,29,31,53,55] encode architectures as a weight-sharing super-net and optimize the weights using gradient descent. Architectures found by NAS exhibit two significant advantages.…”
Section: Introduction
Confidence: 99%