We propose NNStreamer, a software system that handles neural networks as filters of stream pipelines, applying the stream processing paradigm to deep neural network applications. A new trend accompanying the spread of deep neural network applications is on-device AI: processing neural networks on mobile devices or edge/IoT devices instead of cloud servers. Emerging privacy issues, data transmission costs, and operational costs underscore the need for on-device AI, especially when a massive number of devices is deployed. NNStreamer efficiently handles neural networks with complex data stream pipelines on devices, significantly improving the overall performance with minimal effort. In addition, NNStreamer simplifies implementations and allows off-the-shelf media filters to be reused directly, which significantly reduces development costs. We are already deploying NNStreamer for a wide range of products and platforms, including the Galaxy series and various consumer electronic devices. The experimental results suggest that pipeline architectures and NNStreamer reduce development costs and enhance performance. NNStreamer is an open-source project incubated by Linux Foundation AI & Data, available to the public and applicable to various hardware and software platforms.
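The idea of treating a neural network as one more filter in a stream pipeline can be illustrated with a minimal Python sketch. This is not NNStreamer's API: the names `scale`, `nn_filter`, `pipeline`, and `toy_model` are hypothetical, and the "model" is a stand-in function rather than a real network. The sketch only shows that once inference is wrapped as a filter, it composes with ordinary media-style filters in the same way.

```python
from typing import Callable, Iterable, Iterator, List

Frame = List[float]
Filter = Callable[[Iterator[Frame]], Iterator[Frame]]


def scale(factor: float) -> Filter:
    """Ordinary media-style filter: multiply every sample by a constant."""
    def run(stream: Iterator[Frame]) -> Iterator[Frame]:
        for frame in stream:
            yield [x * factor for x in frame]
    return run


def nn_filter(model: Callable[[Frame], Frame]) -> Filter:
    """Wrap an inference function so it behaves like any other filter.

    Hypothetical helper for illustration; not part of NNStreamer.
    """
    def run(stream: Iterator[Frame]) -> Iterator[Frame]:
        for frame in stream:
            yield model(frame)
    return run


def pipeline(source: Iterable[Frame], *filters: Filter) -> Iterator[Frame]:
    """Chain filters lazily so frames flow through one at a time."""
    stream = iter(source)
    for f in filters:
        stream = f(stream)
    return stream


def toy_model(frame: Frame) -> Frame:
    # Stand-in for a neural network: sum the inputs and apply ReLU.
    return [max(0.0, sum(frame))]


frames = [[1.0, 2.0], [-3.0, 1.0]]
outputs = list(pipeline(frames, scale(0.5), nn_filter(toy_model)))
```

Because every stage has the same stream-in, stream-out shape, a pre-processing filter can be swapped or reused without touching the inference stage, which is the kind of reuse the abstract attributes to the pipeline approach.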
Recent efforts toward mobile cloud computing propose offloading mobile applications to cloud servers to improve the performance and battery life of mobile devices. However, existing schemes completely ignore the cost of cloud resources by assuming that idle servers are always available free of charge. Under these unrealistic assumptions, each server runs only a small load in order to guarantee high offload performance. Therefore, these schemes cannot be applied to real-world commercial clouds, which aim to minimize operation costs by maximizing server throughput and then charge users for their resource usage. In this paper, we propose CMcloud, a novel cost-effective mobile-to-cloud offloading platform that works well in real-world cloud environments. CMcloud minimizes both the server costs and the user service fee by offloading as many mobile applications to a single server as possible while satisfying the target performance of all applications. To achieve these goals, CMcloud exploits novel architecture performance modeling and server migration techniques. Our implementation shows that CMcloud can improve the datacenter throughput by 84% over a conventional static light-load scheme (or a 2.7x higher per-socket throughput). Alternatively, CMcloud reduces the number of service failures by 83% over a static high-load scheme, while even improving the throughput by 31%.
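The consolidation goal, packing as many applications onto each server as possible without violating their performance targets, can be sketched as a simple placement problem. The abstract does not describe CMcloud's actual algorithm, so the following swaps in a plain first-fit-decreasing heuristic under a strong assumption: each application's requirement is reduced to a single integer load figure, and a server meets all targets as long as its total load stays within a fixed capacity. The function `consolidate`, the application names, and the numbers are all hypothetical.

```python
from typing import Dict, List


def consolidate(demands: Dict[str, int], capacity: int) -> List[List[str]]:
    """Place apps on as few servers as a first-fit-decreasing pass allows.

    Assumption (not from the paper): an app's performance target is met
    whenever the summed load on its server does not exceed `capacity`.
    """
    servers: List[List[str]] = []  # app names hosted on each server
    loads: List[int] = []          # current load of each server

    # Consider the heaviest apps first; they are the hardest to place.
    for app, demand in sorted(demands.items(), key=lambda kv: kv[1], reverse=True):
        if demand > capacity:
            raise ValueError(f"{app} cannot fit on any server")
        for i, load in enumerate(loads):
            if load + demand <= capacity:
                servers[i].append(app)
                loads[i] += demand
                break
        else:
            # No existing server has room, so open a new one.
            servers.append([app])
            loads.append(demand)
    return servers


# Hypothetical loads, in percent of one server's capacity.
demands = {"game": 60, "ocr": 30, "chess": 50, "browser": 40}
placement = consolidate(demands, capacity=100)
```

A static light-load scheme would give each of these four applications its own server; the heuristic above uses two. A real system would replace the fixed capacity check with a performance model and would migrate applications when a prediction turns out to be wrong.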