Convolutional Neural Networks (CNNs) are becoming a fundamental tool for machine learning. High performance and energy efficiency are of great importance for deployments of CNNs in many embedded applications. Energy consumption during CNN processing is dominated by memory access and since large networks do not fit in on-chip storage, they require expensive DRAM access. This paper introduces an universal Output Stream Manager (OSM) which can be used to compress and format data coming from a CNN accelerator and reduce external memory access. The OSM exploits the sparsity of data and implements two Zero-Run Length encoding algorithms and can be easily reconfigured to optimize usage for different CNN layers.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.