“…A thorough review of the field is provided in (Gou et al, 2021). Knowledge distillation has been applied to various computer vision problems, including image classification (Yalniz et al, 2019; Touvron et al, 2020; Xie et al, 2020), object detection (Li et al, 2017; Shmelkov et al, 2017; Deng et al, 2019), metric learning (Park et al, 2019; Peng et al, 2019), action recognition (Garcia et al, 2018; Thoker and Gall, 2019; Stroud et al, 2020), video classification (Zhang and Peng, 2018; Bhardwaj et al, 2019), video captioning (Pan et al, 2020; Zhang et al, 2020), and representation learning (Tavakolian et al, 2019; Piergiovanni et al, 2020).…”