In this paper, we investigate lossy compression of deep neural networks (DNNs) by weight quantization and lossless source coding for memory-efficient deployment. Whereas previous work addressed non-universal scalar quantization and entropy coding of DNN weights, we introduce, for the first time, universal DNN compression based on universal vector quantization and universal source coding. In particular, we examine universal randomized lattice quantization of DNNs, which randomizes DNN weights by uniform random dithering before lattice quantization and can perform near-optimally on any source without relying on knowledge of its probability distribution. Moreover, we present a method of fine-tuning vector-quantized DNNs to recover the performance loss after quantization. Our experimental results show that the proposed universal DNN compression scheme compresses the 32-layer ResNet (trained on CIFAR-10) and AlexNet (trained on ImageNet) with compression ratios of 47.1 and 42.5, respectively.
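The core randomization step, subtractive uniform dithering before uniform (one-dimensional lattice) quantization, can be sketched as follows. This is a minimal illustration of the general technique, not the paper's implementation; `dithered_quantize` is a hypothetical helper and the step size is arbitrary.

```python
import numpy as np

def dithered_quantize(w, step, rng):
    """Randomized (dithered) uniform quantization sketch: add uniform
    dither in [-step/2, step/2) before rounding to the nearest lattice
    point, then subtract the same dither after reconstruction. With
    subtractive dithering, the quantization error is uniform and
    independent of the weight distribution."""
    u = rng.uniform(-step / 2.0, step / 2.0, size=w.shape)  # shared dither
    q = step * np.round((w + u) / step)                     # lattice points
    return q - u                                            # subtract dither

rng = np.random.default_rng(0)
weights = rng.normal(size=1000)
recon = dithered_quantize(weights, step=0.1, rng=np.random.default_rng(1))
# the reconstruction error is bounded by half the step size
assert np.max(np.abs(recon - weights)) <= 0.05 + 1e-12
```

In a full pipeline, the lattice indices `q / step` would then be losslessly compressed with a universal source coder; the dither can be regenerated at decode time from a shared seed rather than stored.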
A practical rate-matching system for constructing rate-compatible polar codes is proposed. The proposed polar code circular-buffer rate matching is suitable for transmissions on communication channels that support hybrid automatic repeat request (HARQ) communications, as well as for flexible resource-element rate matching on single transmission channels. Our proposed circular-buffer rate-matching scheme also incorporates a bit-mapping scheme for transmission on bit-interleaved coded modulation (BICM) channels using higher-order modulations. An interleaver is derived from a puncturing order obtained with a low-complexity progressive puncturing search algorithm on a base code of short length, and has the flexibility to achieve any desired rate at the desired code length, through puncturing or repetition. The rate-matching scheme is implied by a two-stage polarization, for transmission at any desired code length, code rate, and modulation order, and is shown to achieve the symmetric capacity of BICM channels. Numerical results on AWGN and fast fading channels show that the rate-matched polar codes have a competitive performance when compared to the spatially-coupled quasi-cyclic LDPC codes or LTE turbo codes, while having similar rate-dematching storage and computational complexities.
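The circular-buffer mechanics described above can be sketched in a few lines. This is an assumed, simplified form of circular-buffer rate matching in general, not the paper's specific interleaver or puncturing order; `rate_match` and the example order are hypothetical.

```python
def rate_match(coded_bits, interleave_order, num_tx_bits):
    """Circular-buffer rate-matching sketch: write the mother-code bits
    into a buffer in an interleaved order derived from the puncturing
    order, then read the requested number of bits circularly. Reading
    fewer bits than the mother code length punctures; reading more
    wraps around and repeats."""
    buf = [coded_bits[i] for i in interleave_order]
    return [buf[i % len(buf)] for i in range(num_tx_bits)]

# Mother code of length 8 with a hypothetical interleaving order:
order = [0, 4, 2, 6, 1, 5, 3, 7]
bits = [1, 0, 1, 1, 0, 0, 1, 0]
punctured = rate_match(bits, order, 6)   # higher rate: drop last 2 positions
repeated = rate_match(bits, order, 10)   # lower rate: repeat first 2 positions
```

The same buffer serves HARQ retransmissions: each retransmission simply continues reading from where the previous one stopped, so no per-rate tables are needed.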
In this paper, we propose deep learning solutions for estimating the real-world depth of elements in a scene captured by two cameras with different fields of view. We consider a realistic smartphone scenario, where the first field of view (FOV) is a wide FOV with 1× optical zoom, and the second FOV is contained in the first and captured by a tele zoom lens with 2× optical zoom. We refer to the problem of estimating the depth for all elements in the union of the FOVs, which corresponds to the Wide FOV, as 'tele-wide stereo matching'. Traditional approaches can only estimate the disparity or depth in the overlapped FOV, corresponding to the Tele FOV, using stereo matching algorithms. To benchmark this novel problem, we introduce a single-image inverse-depth estimation (SIDE) solution to estimate the disparity from the image corresponding to the union Wide FOV only. We also design a novel multitask tele-wide stereo matching deep neural network (MT-TW-SMNet), which is the first to combine the stereo matching and single-image depth tasks in one network. Moreover, we propose multiple methods for fusing the above networks. For example, input feature fusion uses the disparity estimated by stereo matching as an additional input feature for SIDE. We also design networks for decision fusion, which deploy a stacked hourglass (SHG) network for fusion and refinement of the disparity maps from both the SIDE network and the MT-TW-SMNet. These fusion schemes significantly improve the accuracy. Experimental results on the KITTI and SceneFlow datasets demonstrate that our proposed approaches provide a reasonable solution to the tele-wide stereo matching problem. We also demonstrate the effectiveness of our solutions in generating the Bokeh effect on the full Wide FOV.
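The input-feature-fusion idea can be illustrated at the data level: the stereo disparity, valid only inside the overlapped Tele FOV, is placed as an extra input channel of the Wide image, zero-padded elsewhere. This is an illustrative sketch, not the paper's network; `input_feature_fusion` is a hypothetical helper, and centering the Tele FOV inside the Wide FOV is an assumption.

```python
import numpy as np

def input_feature_fusion(wide_rgb, tele_disparity):
    """Sketch of input feature fusion: stack the stereo-matching
    disparity estimate (defined on the Tele FOV only) as a fourth
    channel of the Wide RGB image. Pixels outside the overlapped
    Tele FOV get a zero disparity channel."""
    H, W, _ = wide_rgb.shape
    disp = np.zeros((H, W, 1), dtype=wide_rgb.dtype)
    h, w = tele_disparity.shape
    top, left = (H - h) // 2, (W - w) // 2  # assume Tele FOV is centered
    disp[top:top + h, left:left + w, 0] = tele_disparity
    return np.concatenate([wide_rgb, disp], axis=-1)
```

The fused 4-channel tensor is then what a SIDE-style network would consume, letting it propagate reliable overlapped-region disparity outward to the rest of the Wide FOV.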