Wearables and mobile devices see the world through the lens of half a dozen low-power sensors, such as barometers, accelerometers, microphones, and proximity detectors. But differences between sensors -- in sampling rates, in discrete versus continuous output, and even in the data type itself -- make principled approaches to integrating these streams challenging. How, for example, is barometric pressure best combined with an audio sample to infer whether a user is in a car, on a plane, or on a bike? Critically for applications, how successfully sensor devices are able to exploit the information contained across these multi-modal sensor streams often dictates the fidelity at which they can track user behaviors and context changes. This paper studies the benefits of adopting deep learning algorithms for interpreting user activity and context as captured by multi-sensor systems. Specifically, we focus on four variations of deep neural networks based either on fully-connected Deep Neural Networks (DNNs) or Convolutional Neural Networks (CNNs). Two of these architectures follow conventional deep models by performing feature representation learning on a concatenation of sensor types. This classic approach is contrasted with a promising deep model variant characterized by modality-specific partitions of the architecture that maximize intra-modality learning. Our exploration represents the first time these architectures have been evaluated for multimodal deep learning on wearable data -- and for convolutional layers within this architecture, it represents an entirely novel architecture. Experiments show these generic multimodal neural network models compete well with a rich variety of conventional hand-designed shallow methods (including feature extraction and classifier construction) and task-specific modeling pipelines, across a wide range of sensor types and inference tasks (four different datasets). Although the training and inference overhead of these multimodal deep approaches is in some cases appreciable, we also demonstrate that on-device mobile and wearable execution is feasible and not a barrier to adoption. This study is carefully constructed to focus on the multimodal aspects of wearable data modeling for deep learning and provides a wide range of empirical observations, which we expect to be of considerable value to the community. We summarize our observations into a series of practitioner rules-of-thumb and lessons learned that can guide the use of multimodal deep learning for activity and context detection.
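To make the modality-specific variant concrete, here is a minimal PyTorch sketch of such an architecture: each sensor modality gets its own subnetwork for intra-modality representation learning before a shared fusion stage. The two example modalities, layer sizes, and class count are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class ModalityPartitionedDNN(nn.Module):
    """Sketch: per-modality branches followed by cross-modality fusion layers."""
    def __init__(self, accel_dim=64, audio_dim=128, n_classes=5):
        super().__init__()
        # intra-modality representation learning, one branch per sensor type
        self.accel_branch = nn.Sequential(
            nn.Linear(accel_dim, 32), nn.ReLU(), nn.Linear(32, 16), nn.ReLU())
        self.audio_branch = nn.Sequential(
            nn.Linear(audio_dim, 32), nn.ReLU(), nn.Linear(32, 16), nn.ReLU())
        # fusion layers operate only on the concatenated branch outputs
        self.fusion = nn.Sequential(
            nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, n_classes))

    def forward(self, accel, audio):
        z = torch.cat([self.accel_branch(accel), self.audio_branch(audio)], dim=1)
        return self.fusion(z)
```

By contrast, the conventional variant described above would apply a single stack of layers to the concatenated raw feature vectors of all modalities.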
The environmental context of a mobile device determines how it is used and how the device can optimize its operations for greater efficiency and usability. We consider the problem of detecting whether a device is indoors or outdoors. To this end, we present a general method that employs semi-supervised machine learning and uses only the lightweight sensors on a smartphone. We find that a particular semi-supervised learning method called co-training, when suitably engineered, is most effective. It is able to automatically learn the characteristics of new environments and devices, and thereby provides a detection accuracy exceeding 90% even in unfamiliar circumstances. It can learn and adapt online, in real time, at modest computational cost, making the method suitable for on-device learning. An implementation of the indoor-outdoor detection service based on our method is lightweight in energy use -- it can sleep when not in use and does not need to track the device state continuously. It is shown to outperform existing indoor-outdoor detection techniques that rely on static algorithms or GPS, in terms of both accuracy and energy efficiency.
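For readers unfamiliar with co-training, the following is a minimal sketch of the general scheme, not the paper's engineered version: two classifiers are trained on two different "views" of each sample (hypothetically, light/cell-signal features versus magnetometer features), and each round the confident predictions on unlabeled data are fed back as pseudo-labels.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def co_train(X1_l, X2_l, y_l, X1_u, X2_u, rounds=10, k=5, conf=0.9):
    """Each round, confidently pseudo-labeled unlabeled samples are moved
    into the labeled pool shared by both views' classifiers."""
    c1, c2 = RandomForestClassifier(), RandomForestClassifier()
    for _ in range(rounds):
        c1.fit(X1_l, y_l)
        c2.fit(X2_l, y_l)
        if len(X1_u) == 0:
            break
        p1, p2 = c1.predict_proba(X1_u), c2.predict_proba(X2_u)
        # select samples where either view's classifier is confident
        idx = np.where((p1.max(axis=1) > conf) | (p2.max(axis=1) > conf))[0][:k]
        if len(idx) == 0:
            break
        # take the label from whichever classifier is more confident
        pseudo = np.where(p1[idx].max(axis=1) >= p2[idx].max(axis=1),
                          p1[idx].argmax(axis=1), p2[idx].argmax(axis=1))
        X1_l = np.vstack([X1_l, X1_u[idx]])
        X2_l = np.vstack([X2_l, X2_u[idx]])
        y_l = np.concatenate([y_l, pseudo])
        keep = np.setdiff1d(np.arange(len(X1_u)), idx)
        X1_u, X2_u = X1_u[keep], X2_u[keep]
    return c1, c2
```

The key property exploited here is that the two views give partially independent evidence, so each classifier can teach the other on samples it finds easy.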
Convolutional Neural Networks (CNNs) are extremely computationally demanding, presenting a large barrier to their deployment on resource-constrained devices. Since such devices are where some of their most useful applications lie (e.g., obstacle detection for mobile robots, vision-based medical assistive technology), significant bodies of work from both the machine learning and systems communities have attempted to provide optimisations that make CNNs available to edge devices. In this paper we unify the two viewpoints in a Deep Learning Inference Stack and take an across-stack approach by implementing and evaluating the most common neural network compression techniques (weight pruning, channel pruning, and quantisation) and optimising their parallel execution with a range of programming approaches (OpenMP, OpenCL) and hardware architectures (CPU, GPU). We provide comprehensive Pareto curves to instruct trade-offs under constraints of accuracy, execution time, and memory space.
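As an illustration of one of the compression techniques named above, here is a minimal PyTorch sketch of unstructured magnitude-based weight pruning; the function name and fixed per-layer sparsity are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

def prune_by_magnitude(model: nn.Module, sparsity: float = 0.5) -> nn.Module:
    """Zero out the smallest-magnitude fraction of weights in each conv/linear layer."""
    for module in model.modules():
        if isinstance(module, (nn.Conv2d, nn.Linear)):
            w = module.weight.data
            # threshold at the k-th smallest absolute weight
            k = max(1, int(sparsity * w.numel()))
            threshold = w.abs().flatten().kthvalue(k).values
            mask = (w.abs() > threshold).float()
            module.weight.data.mul_(mask)  # pruned weights become exact zeros
    return model
```

Channel pruning removes entire filters rather than individual weights, which shrinks the actual tensor shapes, while quantisation reduces the precision of the remaining weights; the paper's contribution is evaluating how these interact with the parallel execution layer below them.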
We consider the crowdsourcing-based mobile cellular network measurement paradigm that is becoming increasingly popular. In particular, we aim to study the impact of the user's indoor/outdoor environment context at the time of measurement. Focusing on signal strength as the measurement metric and using a large real crowdsourced measurement dataset for the central London area along with estimated environment state (indoor or outdoor), we show that indoor-outdoor context has a significant impact, suggesting that conflating indoor and outdoor measurements can lead to unreliable results. We validate these observations using a set of diverse and controlled measurements with indoor/outdoor ground-truth information. We also discuss some opportunities for future work (e.g., accurate and efficient context detection) relevant to crowdsourced mobile network measurement systems.
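The conflation problem can be made concrete with a short pandas sketch (file and column names are hypothetical): a pooled signal-strength average mixes two populations with different statistics and describes neither well.

```python
import pandas as pd

# assumed schema: one row per measurement, with columns rss_dbm and context
df = pd.read_csv("measurements.csv")

pooled_mean = df["rss_dbm"].mean()
by_context = df.groupby("context")["rss_dbm"].agg(["mean", "std", "count"])

print(f"pooled mean RSS: {pooled_mean:.1f} dBm")
print(by_context)  # if indoor and outdoor means differ by several dB,
                   # the pooled figure misrepresents both environments
```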
Recent advancements in energy-efficient hardware technology are driving the exponential growth we are experiencing in the Internet of Things (IoT) space, with more pervasive computations being performed near data-generation sources. A range of intelligent devices and applications performing local detection is emerging (activity recognition, fitness monitoring, etc.), bringing with them obvious advantages such as reduced detection latency for improved interaction with devices and the safeguarding of user data, which never leaves the device. Video processing holds utility for many emerging applications and for data labelling in the IoT space. However, performing this video processing with deep neural networks at the edge of the Internet is not trivial. In this paper we show that pedestrian location estimation using deep neural networks is achievable on fixed cameras with limited compute resources. Our approach uses pose estimation from key body-point detection to extrapolate the pedestrian's skeleton when the whole body is not in the image (occluded by obstacles or partially outside the frame), which achieves better location estimation performance (inference time and memory footprint) than fitting a bounding box over the pedestrian and scaling. We collect a sizable dataset comprising over 2,100 video frames from one and two surveillance cameras viewing the scene from different angles, and annotate each frame with the exact position of the person in the image, across 42 different scenarios of activity and occlusion. We compare our pose-estimation-based location detection with a popular detection algorithm, YOLOv2, for overlapping bounding-box generation; our solution achieves faster inference (a 15x speedup) at half the memory footprint, within the resource capabilities of embedded devices, demonstrating that CamLoc is an efficient solution for location estimation in videos on smart cameras.
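The skeleton-extension idea can be sketched as follows; this is a hypothetical illustration, not CamLoc's code, and the keypoint names and body-proportion ratios are assumptions. When the lower body is occluded, the ground-contact point is extrapolated from visible torso keypoints using average body proportions.

```python
import numpy as np

TORSO_RATIO = 0.35  # assumed: shoulder-to-hip distance is ~35% of body height
LEG_RATIO = 0.50    # assumed: hip-to-ankle distance is ~50% of body height

def estimate_ground_point(keypoints):
    """keypoints: name -> (x, y, confidence) in pixels; returns estimated feet (x, y)."""
    if keypoints.get("ankle", (0, 0, 0.0))[2] > 0.5:
        x, y, _ = keypoints["ankle"]        # lower body visible: use it directly
        return (x, y)
    sx, sy, _ = keypoints["shoulder"]       # otherwise extrapolate from the torso
    hx, hy, _ = keypoints["hip"]
    torso = np.hypot(hx - sx, hy - sy)
    leg = torso * LEG_RATIO / TORSO_RATIO   # infer leg length from torso length
    ux, uy = (hx - sx) / torso, (hy - sy) / torso  # unit vector, shoulder -> hip
    return (hx + ux * leg, hy + uy * leg)   # extend past the hip to the ground
```

Because the extrapolation runs on a handful of keypoints rather than a full detector pass, it is cheap enough for the embedded setting the paper targets.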
WiFi in indoor environments exhibits spatiotemporal variations in coverage and interference in typical WLAN deployments with multiple APs, motivating the need for automated monitoring that helps network administrators adapt the WLAN deployment to match user expectations. We develop Pazl, a mobile crowdsensing-based indoor WiFi monitoring system enabled by a novel hybrid localization mechanism that locates individual measurements taken from participant phones. The localization mechanism in Pazl integrates the best aspects of two well-known localization techniques, pedestrian dead reckoning and WiFi fingerprinting; it also relies on crowdsourcing to construct the WiFi fingerprint database. Compared to existing WiFi monitoring systems based on static sniffers, Pazl is low cost and provides a user-side perspective. Pazl is significantly more automated than wireless site-survey tools such as the Ekahau Mobile Survey tool, drastically reducing manual point-and-click determination of measurement locations. We implement Pazl as a combination of an Android mobile app and a cloud backend application on Google App Engine. Experimental evaluation of Pazl with a trial set of users shows that it yields results similar to manual site surveys but without the tedium.
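A minimal sketch of the hybrid idea (not Pazl's actual mechanism; function names and the fusion weight are assumptions): pedestrian dead reckoning propagates position from step events, while each WiFi fingerprint match corrects the accumulated drift.

```python
import numpy as np

def pdr_step(pos, heading_rad, step_len=0.7):
    """Advance position by one detected step along the current heading."""
    return pos + step_len * np.array([np.cos(heading_rad), np.sin(heading_rad)])

def fingerprint_match(scan, db):
    """Nearest neighbour in signal space: db maps position -> {AP: RSS} dict."""
    def dist(fp):
        aps = set(scan) & set(fp)
        return np.mean([(scan[a] - fp[a]) ** 2 for a in aps]) if aps else np.inf
    return min(db, key=lambda pos: dist(db[pos]))

def fuse(pdr_pos, wifi_pos, alpha=0.6):
    """Weighted correction pulling the drifting PDR estimate toward the WiFi fix."""
    return alpha * np.asarray(wifi_pos) + (1 - alpha) * np.asarray(pdr_pos)
```

Dead reckoning is smooth but drifts over time, while fingerprinting is drift-free but noisy and only available where the database has coverage; combining the two is what lets crowdsourced measurements be located without manual input.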