Xunquan Chen scite author profile

Deep learning now plays an indispensable role in many fields, including computer vision, natural language processing, and speech recognition. Convolutional Neural Networks (CNNs) have demonstrated excellent performance in computer vision tasks thanks to their powerful feature extraction capability. However, because larger models have shown higher accuracy, recent state-of-the-art CNN models consume increasingly more resources. This paper investigates a conceptual approach to reducing the memory consumption of CNN inference. Our method processes the input image as a sequence of carefully designed tiles within the lower subnetwork of the CNN, so as to minimize peak memory consumption while keeping the end-to-end computation unchanged. This method introduces a trade-off between memory consumption and computation, and is particularly suitable for high-resolution inputs. Our experimental results show that the proposed method reduces MobileNetV2 memory consumption by up to 5.3 times. For ResNet50, one of the most commonly used CNN models in computer vision tasks, memory consumption is reduced by up to 2.3 times.
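The tiling idea in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: it assumes a 1D signal, stride-1 convolutions without padding, and hypothetical helper names (`conv1d_valid`, `run_layers`, `receptive_field`, `run_layers_tiled`). It only shows that, when each input tile is extended by the receptive-field overlap, the tiled result matches the untiled one while each pass touches a much shorter input slice, at the cost of recomputing the overlapping region.

```python
def conv1d_valid(x, kernel):
    """Valid (no padding), stride-1 1D cross-correlation of list x with kernel."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]


def run_layers(x, kernels):
    """Apply a stack of valid convolutions in sequence."""
    for kernel in kernels:
        x = conv1d_valid(x, kernel)
    return x


def receptive_field(kernels):
    """Input elements seen by one output element (stride 1, no padding)."""
    return 1 + sum(len(k) - 1 for k in kernels)


def run_layers_tiled(x, kernels, tile_out):
    """Compute the same output as run_layers, one output tile at a time.

    An output tile of n elements needs an input slice of
    n + receptive_field - 1 elements, so neighbouring input tiles overlap
    and the overlap is recomputed (the memory/computation trade-off).
    """
    rf = receptive_field(kernels)
    n_out = len(x) - rf + 1
    out = []
    peak_tile_len = 0
    for start in range(0, n_out, tile_out):
        stop = min(start + tile_out, n_out)
        tile = x[start:stop + rf - 1]  # input slice including the overlap
        peak_tile_len = max(peak_tile_len, len(tile))
        out.extend(run_layers(tile, kernels))
    return out, peak_tile_len


# Illustrative values only; they are not taken from the paper.
kernels = [[1, 0, -1], [1, 2, 1], [0.5, 0.5]]
signal = [float((i * 7) % 11) for i in range(64)]

full = run_layers(signal, kernels)
tiled, peak = run_layers_tiled(signal, kernels, tile_out=8)
print(tiled == full, len(signal), peak)
```

In this sketch the largest slice processed at once is `tile_out + receptive_field - 1` elements instead of the whole input. A real 2D CNN would also have to account for strides, padding, and the layers above the tiled subnetwork, which this example ignores.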

Binary Attribute Embeddings for Zero-Shot Sound Event Classification

Lin, Chen, Takashima, et al. 2022

Speaker-Independent Emotional Voice Conversion via Disentangled Representations

Chen et al. 2023, IEEE Trans. Multimedia

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.


Xunquan Chen

Zero-Shot Sound Event Classification Using a Sound Attribute Vector with Global and Local Feature Learning

Convolutional Neural Networks Inference Memory Optimization with Receptive Field-Based Input Tiling

Binary Attribute Embeddings for Zero-Shot Sound Event Classification

Speaker-Independent Emotional Voice Conversion via Disentangled Representations
