Video analytics systems based on deep learning models are often opaque and brittle and require explanation systems to help users debug. Current model explanation systems are very good at giving literal explanations of behavior in terms of pixel contributions but cannot integrate information about the physical or systems processes that might influence a prediction. This paper introduces the idea that a simple form of causal reasoning, called a regression discontinuity design, can be used to associate changes in multiple key performance indicators with physical real-world phenomena to give users a more actionable set of video analytics explanations. We give an overview of the system architecture and describe a vision of the impact that such a system might have.
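As a rough illustration of the regression discontinuity idea mentioned above, the sketch below fits a separate straight line to a key performance indicator on each side of a known event time and reports the gap between the two fits at that time. The function names, the bandwidth parameter, and the toy accuracy series are assumptions made for this example; the abstract does not describe the system's actual estimator.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def rdd_jump(times, kpi, cutoff, bandwidth):
    """Estimate the jump in `kpi` at `cutoff` (hypothetical helper).

    Times are re-centered on the cutoff so that each fitted intercept
    is the predicted KPI value exactly at the cutoff.
    """
    left = [(t - cutoff, y) for t, y in zip(times, kpi)
            if cutoff - bandwidth <= t < cutoff]
    right = [(t - cutoff, y) for t, y in zip(times, kpi)
             if cutoff <= t <= cutoff + bandwidth]
    if len(left) < 2 or len(right) < 2:
        raise ValueError("need at least two points on each side of the cutoff")
    a_left, _ = fit_line(*zip(*left))
    a_right, _ = fit_line(*zip(*right))
    return a_right - a_left

# Toy data (made up): accuracy drifts slowly, then drops when the
# lights are switched off at t = 10.
times = list(range(20))
accuracy = [(0.9 if t < 10 else 0.6) - 0.001 * t for t in times]
jump = rdd_jump(times, accuracy, cutoff=10, bandwidth=10)
```

A large estimated jump that coincides with a logged physical event (here, the hypothetical lighting change) is the kind of association the abstract proposes surfacing to users; a real analysis would also need to consider bandwidth choice and noise.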
Activity recognition using video data is widely adopted for elder care, monitoring for safety and security, and home automation. Unfortunately, using video data as the basis for activity recognition can be brittle, since models trained on video are often not robust to certain environmental changes, such as camera angle and lighting changes. There has been a proliferation of network-connected devices in home environments. Interactions with these smart devices are associated with network activity, making network data a potential source for recognizing these device interactions. This paper advocates for the synthesis of video and network data for robust interaction recognition in connected environments. We consider machine learning-based approaches for activity recognition, where each labeled activity is associated with both a video capture and an accompanying network traffic trace. We develop a simple but effective framework, AMIR (Active Multimodal Interaction Recognition), that trains independent models for video and network activity recognition, and subsequently combines the predictions from these models using a meta-learning framework. Whether in the lab or at home, this approach reduces the number of "paired" demonstrations, in which network and video data are collected simultaneously, needed to perform accurate activity recognition. Specifically, the method we have developed requires up to 70.83% fewer samples to achieve an 85% F1 score than random data collection, and improves accuracy by 17.76% given the same number of samples.
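The idea of combining independently trained video and network models can be sketched with generic stacking: a small logistic-regression meta-learner takes the two models' predicted probabilities as inputs and learns how much to trust each. This is only an assumed stand-in, since the abstract does not specify AMIR's meta-learning procedure; the function names and the six paired samples are invented for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_meta(samples, lr=1.0, epochs=5000):
    """Fit a logistic-regression meta-learner with batch gradient descent.

    Each sample is (p_video, p_network, label): the probability each
    base model assigns to the activity, plus the true 0/1 label.
    """
    w_video = w_net = bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        g_video = g_net = g_bias = 0.0
        for p_video, p_net, label in samples:
            err = sigmoid(w_video * p_video + w_net * p_net + bias) - label
            g_video += err * p_video
            g_net += err * p_net
            g_bias += err
        w_video -= lr * g_video / n
        w_net -= lr * g_net / n
        bias -= lr * g_bias / n
    return w_video, w_net, bias

def combine(weights, p_video, p_net):
    """Meta-learner's probability that the activity occurred."""
    w_video, w_net, bias = weights
    return sigmoid(w_video * p_video + w_net * p_net + bias)

# Made-up paired demonstrations: the video model is unreliable here
# (for example, under poor lighting), the network model is informative.
paired = [
    (0.9, 0.8, 1), (0.4, 0.9, 1), (0.7, 0.7, 1),
    (0.6, 0.2, 0), (0.1, 0.3, 0), (0.5, 0.1, 0),
]
weights = train_meta(paired)
predictions = [int(combine(weights, v, n) > 0.5) for v, n, _ in paired]
```

Because only the small meta-learner is fit on paired data, the base models can be trained separately on whatever single-modality data is available, which is the property the abstract highlights.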
Recent advances in computer architecture and networking have ushered in a new age of edge computing, where computation is placed close to the point of data collection to facilitate low-latency decision making. As the complexity of such deployments grows into networks of interconnected edge devices, getting the necessary data to be in "the right place at the right time" can become a challenge. We envision a future of edge analytics where data flows between edge nodes are declaratively configured through high-level constraints. Using machine learning model-serving as a prototypical task, we illustrate how the heterogeneity and specialization of edge devices can lead to complex, task-specific communication patterns even in relatively simple situations. Without a declarative framework, managing this complexity will be challenging for developers and will lead to brittle systems. We conclude with a research vision for the database community that brings our perspective to the emergent area of edge computing.
Due to latency and privacy concerns, we are witnessing the rise of edge computing, where computation is placed close to the point of data collection to facilitate low-latency decision making. However, we believe that a very important class of sensor fusion applications, in which data generated in a disaggregated way has to be combined to make a decision, is not well understood in the context of edge computing. The necessary data needs to be in "the right place at the right time", making intra-edge communication a significant bottleneck. In prior work, we proposed an edge-based model serving system, called EdgeServe, that not only manages a machine learning inference service, but also orchestrates data movement between nodes on an edge network. In this paper, we evaluate trade-offs in temporal synchronization between data sources, and present initial experiments that study how different knobs can affect the performance of sensor fusion applications.
CCS Concepts: Computer systems organization → Distributed architectures; Sensor networks.
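One of the synchronization knobs discussed above can be illustrated with a small alignment routine: each reading from a primary stream is paired with the nearest-in-time reading from a secondary stream, and dropped if the gap exceeds a maximum skew. This is a generic sketch, not EdgeServe's API; `align_streams`, `max_skew`, and the millisecond timestamps are assumptions made for the example.

```python
from bisect import bisect_left

def align_streams(primary, secondary, max_skew):
    """Pair each primary reading with the closest secondary reading.

    Both inputs are lists of (timestamp, value) sorted by timestamp.
    Readings with no secondary partner within `max_skew` are dropped.
    """
    sec_times = [t for t, _ in secondary]
    pairs = []
    for t, value in primary:
        i = bisect_left(sec_times, t)
        # Only the neighbors on either side of t can be the nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(secondary)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(sec_times[k] - t))
        if abs(sec_times[j] - t) <= max_skew:
            pairs.append((t, value, secondary[j][1]))
    return pairs

# Made-up streams with timestamps in milliseconds.
camera = [(0, "frame0"), (100, "frame1"), (200, "frame2")]
network = [(20, "pkt0"), (180, "pkt1"), (500, "pkt2")]
fused = align_streams(camera, network, max_skew=50)
```

A tighter `max_skew` yields better-aligned inputs for the fusion model but discards more readings, while a looser one keeps more data at the cost of staleness; this is the kind of trade-off the abstract says the paper evaluates.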