“…Extensive efforts have been made to improve the inference efficiency of deep CNNs in recent years. Popular approaches include efficient architecture design [30,28,15,40], network pruning [8,23,26], weight quantization [4,8,17] and adaptive inference [7,14,2,35,6,34]. Among them, adaptive inference is gaining increasing attention recently, due to its remarkable advantages.…”

* First two authors contributed equally. † Corresponding author.