Figure 1: Using whatever mobile devices a user has with them, IMUPoser estimates full-body pose. In the best case, a user carries a smartphone, smartwatch, and earbuds (pose from 3 devices). Of course, the number of devices will vary over time, e.g., earbud use is intermittent and not everyone wears a smartwatch. This means IMUPoser must track which devices are present, where they are located, and use whatever IMU data is available.
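A minimal sketch of the availability-handling idea in the caption, not IMUPoser's actual implementation: device slots that are missing at a given moment are zero-filled, so a single model can consume whatever subset of devices is present. The slot count, feature sizes, and function names are assumptions for illustration.

```python
# Hypothetical sketch: packing whatever IMU streams are currently available
# into a fixed-width model input. Absent devices are zero-filled so one
# pose model can handle any phone/watch/earbud combination.
import torch

NUM_SLOTS = 3        # assumed slots: 0=phone, 1=watch, 2=earbuds
FEATS_PER_SLOT = 12  # assumed per-IMU features, e.g. 3x3 rotation + 3 accel

def build_input(available: dict) -> torch.Tensor:
    """available maps slot index -> (T, FEATS_PER_SLOT) tensor of IMU features."""
    T = next(iter(available.values())).shape[0]
    x = torch.zeros(T, NUM_SLOTS * FEATS_PER_SLOT)
    for slot, feats in available.items():
        x[:, slot * FEATS_PER_SLOT:(slot + 1) * FEATS_PER_SLOT] = feats
    return x

# e.g., only a smartwatch (slot 1) is worn during a 100-frame window:
window = build_input({1: torch.randn(100, FEATS_PER_SLOT)})
```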
Despite advances in audio- and motion-based human activity recognition (HAR) systems, a practical, power-efficient, and privacy-sensitive activity recognition system has remained elusive. State-of-the-art systems often require power-hungry and privacy-invasive audio data, which is especially challenging for resource-constrained wearables such as smartwatches. To avoid the need for an always-on audio-based activity classifier, we first use power- and compute-optimized IMUs sampled at 50 Hz as a trigger for detecting activity events. Once an event is detected, we use a multimodal deep learning model that augments the motion data with audio captured on a smartwatch. We subsample this audio to rates ≤ 1 kHz, rendering spoken content unintelligible while also reducing power consumption on mobile devices. Our multimodal deep learning model achieves a recognition accuracy of 92.2% across 26 daily activities in four indoor environments. Our findings show that subsampling audio from 16 kHz down to 1 kHz, in concert with motion data, does not result in a significant drop in inference accuracy. We also analyze the speech intelligibility and power requirements of audio sampled below 1 kHz, and demonstrate that our proposed approach can improve the practicality of human activity recognition systems.
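The subsampling step described above is standard signal decimation; the sketch below assumes SciPy is available and is not the paper's code. Reducing the rate by a factor of 16 limits the signal to a 500 Hz Nyquist band, which is why speech content becomes unintelligible while coarse acoustic events survive.

```python
# Sketch: subsample 16 kHz smartwatch audio to 1 kHz (factor of 16).
# SciPy recommends chaining decimation stages for factors above ~13,
# so we decimate by 4 twice; each stage anti-alias filters first.
import numpy as np
from scipy.signal import decimate

def subsample_to_1khz(audio_16k: np.ndarray) -> np.ndarray:
    return decimate(decimate(audio_16k, 4), 4)

one_second = np.random.randn(16_000)        # dummy 16 kHz signal
print(subsample_to_1khz(one_second).shape)  # (1000,)
```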
Users often need training and guidance while performing daily life procedures, e.g., cooking, setting up a new appliance, or taking a COVID test. Watch-based human activity recognition (HAR) can track users' actions during these procedures. Out of the box, however, state-of-the-art HAR struggles with noisy data and the less-expressive actions that often make up daily tasks. This paper proposes PrISM-Tracker, a procedure-tracking framework that augments existing HAR models with (1) a graph-based procedure representation and (2) a user-interaction module to handle model uncertainty. Specifically, PrISM-Tracker extends the Viterbi algorithm to update state probabilities based on time-series HAR outputs, leveraging a graph representation that embeds timing information as a prior. Moreover, the model identifies moments or classes of uncertainty and asks the user for guidance to improve tracking accuracy. We tested PrISM-Tracker on two procedures: latte making in an engineering lab study and wound care for skin cancer patients at a clinic. The results show the effectiveness of the proposed algorithm, which utilizes transition graphs to track steps, and the efficacy of using simulated human input to enhance performance. This work is a first step toward human-in-the-loop intelligent systems for guiding users through new and complicated procedural tasks.
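As a rough illustration of the extended-Viterbi idea (a simplification of my own, not PrISM-Tracker's code), the sketch below combines per-frame HAR class probabilities with transition probabilities taken from a procedure graph, so the tracked step can only advance along plausible edges. Incorporating the paper's timing priors would amount to making `trans` time-dependent.

```python
# Sketch: Viterbi decoding over HAR outputs with graph-based transitions.
import numpy as np

def viterbi(obs_probs, trans, prior):
    """obs_probs: (T, S) per-frame HAR class probabilities.
    trans: (S, S) step-transition probabilities from the procedure graph.
    prior: (S,) initial step distribution. Returns the most likely step path."""
    eps = 1e-12  # avoid log(0) on forbidden graph edges
    T, S = obs_probs.shape
    log_delta = np.log(prior + eps) + np.log(obs_probs[0] + eps)
    backptr = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = log_delta[:, None] + np.log(trans + eps)  # (S, S)
        backptr[t] = scores.argmax(axis=0)
        log_delta = scores.max(axis=0) + np.log(obs_probs[t] + eps)
    path = [int(log_delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]
```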
Gestural interaction with free hands, or while grasping an everyday object, enables always-available input. Sensing such gestures calls for minimal instrumentation of the user's hand. However, choosing an effective yet minimal inertial measurement unit (IMU) layout remains challenging due to the complexity of the multi-factorial space spanning diverse finger gestures, objects, and grasps. We present SparseIMU, a rapid method for selecting minimal inertial sensor layouts for effective gesture recognition. We also contribute a computational tool that guides designers toward optimal sensor placement. Our approach builds on an extensive microgesture dataset that we collected with a dense network of 17 IMUs. We performed a series of analyses, including an evaluation of the entire combinatorial space for freehand and grasping microgestures (393K layouts), and quantified performance across layout choices, revealing new gesture detection opportunities with IMUs. Finally, we demonstrate the versatility of our method with four scenarios.
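The exhaustive layout evaluation can be pictured as a brute-force subset search. The sketch below is an assumed, simplified version using placeholder data and a generic classifier, not the authors' pipeline; the feature counts and class count are illustrative.

```python
# Sketch: score candidate IMU layouts by cross-validated gesture accuracy.
from itertools import combinations
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

N_IMUS, FEATS = 17, 6                     # 17 IMUs; 6 features each (assumed)
X = np.random.randn(500, N_IMUS * FEATS)  # placeholder feature matrix
y = np.random.randint(0, 13, 500)         # placeholder gesture labels

def layout_score(sensors):
    # keep only the feature columns belonging to the chosen sensors
    cols = [s * FEATS + f for s in sensors for f in range(FEATS)]
    clf = RandomForestClassifier(n_estimators=50, random_state=0)
    return cross_val_score(clf, X[:, cols], y, cv=3).mean()

# e.g., find the best single-IMU layout among the 17 candidates
best = max(combinations(range(N_IMUS), 1), key=layout_score)
print("best single-IMU layout:", best)
```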