“…To meet this rapidly increasing demand for AI capabilities on embedded systems, such as autonomous vehicles, drones, and medical devices, prior research has focused on various techniques for reducing the power and energy consumption of NNs deployed on hardware accelerators. These techniques include network compression [Lebedev et al., 2015, Ullrich et al., 2017, Chmiel et al., 2020, Baskin et al., 2021a], pruning [Han et al., 2015], neural architecture search [Liu et al., 2019, Wu et al., 2019, Cai et al., 2019], and quantization [Zhou et al., 2016, Hubara et al., 2018].…”