2020
DOI: 10.48550/arxiv.2002.04116
Preprint

Co-Exploration of Neural Architectures and Heterogeneous ASIC Accelerator Designs Targeting Multiple Tasks

Abstract: Neural Architecture Search (NAS) has demonstrated its power on various AI accelerating platforms such as Field Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs). However, it remains an open problem how to integrate NAS with Application-Specific Integrated Circuits (ASICs), despite them being the most powerful AI accelerating platforms. The major bottleneck comes from the large design freedom associated with ASIC designs. Moreover, with the consideration that multiple DNNs will run in paralle…
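The abstract, together with the related-work statements quoted below, frames this line of work as a joint search over network structures and ASIC accelerator design parameters, often driven by a reinforcement-learning controller. As a rough illustration of that general idea only, and not of the paper's actual method, the sketch below uses a REINFORCE controller over a toy joint design space; the choice lists, accuracy/latency surrogates, and every name in it (ARCH_CHOICES, toy_accuracy, toy_latency, LATENCY_BUDGET) are hypothetical placeholders.

```python
# Minimal sketch (not the paper's implementation) of RL-based co-exploration:
# a REINFORCE controller jointly samples a DNN architecture choice and an
# ASIC accelerator configuration, and is rewarded for accuracy under a
# hardware-cost constraint. All models and constants below are toy stand-ins.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical joint design space (placeholder values, not from the paper).
ARCH_CHOICES = [8, 16, 32, 64]      # e.g. channel width of one DNN block
HW_CHOICES = [64, 128, 256, 512]    # e.g. number of MACs in one ASIC tile

# One softmax policy per decision; REINFORCE updates their logits.
arch_logits = np.zeros(len(ARCH_CHOICES))
hw_logits = np.zeros(len(HW_CHOICES))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def toy_accuracy(width):
    # Stand-in for trained-model accuracy: wider models score a bit higher.
    return 0.70 + 0.06 * np.log2(width / 8 + 1)

def toy_latency(width, macs):
    # Stand-in for an analytical ASIC latency model: work / parallelism.
    return (width ** 2) / macs

LATENCY_BUDGET = 4.0
LR = 0.1

for step in range(200):
    pa, ph = softmax(arch_logits), softmax(hw_logits)
    a = rng.choice(len(ARCH_CHOICES), p=pa)
    h = rng.choice(len(HW_CHOICES), p=ph)
    width, macs = ARCH_CHOICES[a], HW_CHOICES[h]

    acc = toy_accuracy(width)
    lat = toy_latency(width, macs)
    # Reward accuracy, penalize designs that miss the latency budget.
    reward = acc - max(0.0, lat - LATENCY_BUDGET)

    # REINFORCE: raise the log-probability of sampled choices, scaled by reward.
    grad_a = -pa; grad_a[a] += 1.0
    grad_h = -ph; grad_h[h] += 1.0
    arch_logits += LR * reward * grad_a
    hw_logits += LR * reward * grad_h

best = (ARCH_CHOICES[int(np.argmax(arch_logits))],
        HW_CHOICES[int(np.argmax(hw_logits))])
print("preferred (width, MACs):", best)
```

In a real co-exploration flow the accuracy term would come from training (or an accuracy predictor) and the latency/energy terms from an accelerator cost model; the controller-plus-reward structure is the only part this sketch is meant to convey.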

Cited by 8 publications (16 citation statements)
References 23 publications
“…Neural Architecture Search (NAS) has achieved state-of-the-art performance in various perceptual tasks, such as image classifications [22,23], inference security [2] and image segmentation [20].…”
Section: Neural Architecture Search (mentioning)
confidence: 99%
“…al, [59] uses Bayesian optimization, RELEASE [7] uses RL, ATLAS [84] uses black box optimizations, some compiler design [12], [50] use profile-guided optimization to perform target-independent front-end compiler optimizations on DNNs or linear algebra computations. Some recent works use RL on HW/SW co-exploration to explore both DNN and its mapping over hardware [6], [32], [44], [88]. The problem of mapping the DNN computation graph over multiple devices (CPU/GPU/TPU [34]) has also been explored through manual heuristics [8], [72], [91] and RL [24], [46], [51].…”
Section: Related Work (mentioning)
confidence: 99%
“…DNN Algorithm and Accelerator Co-exploration. Exploring the networks and the corresponding accelerators in a joint manner [1,31,39,40,50,92] has shown great potential towards maximizing both accuracy and efficiency. Recent works have extended NAS to jointly search DNN accelerators in addition to DNN structures.…”
Section: Related Work (mentioning)
confidence: 99%
“…Recent works have extended NAS to jointly search DNN accelerators in addition to DNN structures. In particular, [1,31,40,92] conducted RL-based searches to co-explore the network structures and design parameters of an FPGA-/ASIC-based accelerator, but their RL-based methods can suffer from large search costs, limiting their scalability to handle large joint spaces. Recently, [19,50] extended differentiable NAS to network and accelerator co-search.…”
Section: Related Work (mentioning)
confidence: 99%
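The last statement contrasts RL-based co-search with differentiable network/accelerator co-search. As a hedged sketch of the differentiable idea only, not of any specific cited method, the snippet below relaxes both the architecture choice and a hypothetical accelerator parameter into softmax distributions and follows the analytic gradient of an expected accuracy-minus-latency objective; all surrogate models and constants are made up for illustration.

```python
# Minimal sketch (illustrative only) of differentiable network/accelerator
# co-search: both the architecture choice and the accelerator choice are
# relaxed into softmax distributions, and the expected
# (accuracy - lambda * latency) objective is ascended on the logits.
import numpy as np

ARCH_CHOICES = np.array([8, 16, 32, 64], dtype=float)    # e.g. channel widths
HW_CHOICES = np.array([64, 128, 256, 512], dtype=float)  # e.g. MACs per tile

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy surrogate models (placeholders, not from the paper).
acc = 0.70 + 0.06 * np.log2(ARCH_CHOICES / 8 + 1)          # per-architecture accuracy
lat = ARCH_CHOICES[:, None] ** 2 / HW_CHOICES[None, :]     # per (arch, hw) latency

arch_logits = np.zeros(len(ARCH_CHOICES))
hw_logits = np.zeros(len(HW_CHOICES))
LAM, LR = 0.05, 0.5

for step in range(300):
    p, q = softmax(arch_logits), softmax(hw_logits)
    exp_acc = p @ acc              # E[accuracy] under the relaxation
    exp_lat = p @ lat @ q          # E[latency] under the relaxation
    # Analytic gradient of E[f] = sum_i p_i f_i w.r.t. the logits:
    #   dE/dtheta_i = p_i * (f_i - E[f])
    g_arch = p * ((acc - LAM * (lat @ q)) - (exp_acc - LAM * exp_lat))
    g_hw = q * ((-LAM * (p @ lat)) - (-LAM * exp_lat))
    arch_logits += LR * g_arch     # ascend the relaxed objective
    hw_logits += LR * g_hw

print("chosen width:", ARCH_CHOICES[int(np.argmax(arch_logits))],
      "chosen MACs:", HW_CHOICES[int(np.argmax(hw_logits))])
```

Compared with the RL sketch above, this formulation trades sampling and reward estimation for gradient steps on a relaxed objective, which is the scalability argument the quoted statement attributes to differentiable co-search.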