Deep learning has been applied successfully to computer vision tasks, but its high computational cost limits its adoption on lightweight devices such as camera sensors. For this reason, many low-latency vision systems offload inference to a local server, which requires fast compression and decompression of the source images. Texture compression is a compelling alternative to existing compression schemes such as JPEG or HEVC because of its low decoding overhead, straightforward parallelization, robustness, and fixed compression ratio. In this paper, we study the impact of two lightweight bounding box-based texture compression algorithms, BC1 and YCoCg-BC3, on the accuracy of two computer vision tasks: object detection and semantic segmentation. While JPEG achieves a lower per-pixel error, the YCoCg-BC3 encoding provides comparable vision accuracy. The BC1 encoding, by contrast, significantly degrades vision performance. However, by retraining the FasterSeg teacher network on a BC1-compressed dataset, we reduced its segmentation mIoU loss from 2.7 to 0.5 percent. Thus, both the BC1 and YCoCg-BC3 encoders are suitable for low-latency vision systems, since both encode significantly faster than JPEG and their decoding overhead is negligible.
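To make the fixed compression ratio and the low decoding cost concrete, the following is a minimal Python sketch of a bounding-box BC1 block codec. It is an illustration under stated assumptions, not the encoder evaluated in this work: the function names (`encode_bc1_block`, `decode_bc1_block`, and the helpers) are hypothetical, the endpoints are simply the per-channel minimum and maximum of the block with no refinement, and the rounding of interpolated palette colors is one plausible choice, since hardware decoders may round differently. The block layout (two little-endian RGB565 endpoints followed by sixteen 2-bit indices) follows the standard BC1 format.

```python
import struct

def to_565(r, g, b):
    """Quantize an 8-bit RGB triple to a packed 16-bit RGB565 value."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

def from_565(c):
    """Expand RGB565 to 8-bit RGB using bit replication."""
    r, g, b = (c >> 11) & 31, (c >> 5) & 63, c & 31
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))

def bc1_palette(c0, c1):
    """Build the 4-entry palette implied by two RGB565 endpoints."""
    p0, p1 = from_565(c0), from_565(c1)
    if c0 > c1:  # four-color mode
        # Rounding here is an assumption; implementations differ slightly.
        p2 = tuple((2 * a + b + 1) // 3 for a, b in zip(p0, p1))
        p3 = tuple((a + 2 * b + 1) // 3 for a, b in zip(p0, p1))
    else:        # three-color mode; index 3 is transparent black
        p2 = tuple((a + b) // 2 for a, b in zip(p0, p1))
        p3 = (0, 0, 0)
    return [p0, p1, p2, p3]

def encode_bc1_block(pixels):
    """Encode 16 (r, g, b) tuples (row-major 4x4 block) into 8 bytes."""
    assert len(pixels) == 16
    # Bounding box of the block in RGB space: per-channel min and max.
    lo = tuple(min(p[i] for p in pixels) for i in range(3))
    hi = tuple(max(p[i] for p in pixels) for i in range(3))
    c0, c1 = to_565(*hi), to_565(*lo)  # c0 >= c1 because hi >= lo per channel
    palette = bc1_palette(c0, c1)
    bits = 0
    for i, px in enumerate(pixels):
        # Pick the nearest palette entry by squared RGB distance.
        dists = [sum((a - b) ** 2 for a, b in zip(px, q)) for q in palette]
        bits |= dists.index(min(dists)) << (2 * i)
    return struct.pack("<HHI", c0, c1, bits)

def decode_bc1_block(block):
    """Decode 8 bytes back into 16 (r, g, b) tuples (alpha is ignored)."""
    c0, c1, bits = struct.unpack("<HHI", block)
    palette = bc1_palette(c0, c1)
    return [palette[(bits >> (2 * i)) & 3] for i in range(16)]
```

Every 4x4 block of 24-bit RGB pixels (48 bytes) maps to exactly 8 bytes, so the ratio is always 6:1 regardless of image content. Decoding is two endpoint expansions plus a table lookup per pixel, and each block is independent of its neighbors, which is what makes both directions straightforward to parallelize.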