Supervised training of human activity recognition (HAR) systems based on body-worn inertial measurement units (IMUs) is often constrained by the typically small amounts of labeled sample data. Systems such as IMUTube employ cross-modality transfer approaches to convert videos of activities of interest into virtual IMU data. We demonstrate for the first time how such large-scale virtual IMU datasets can be used to train HAR systems that are substantially more complex than the state of the art, where complexity is measured by the number of model parameters that can be trained robustly. Our models contain components dedicated to capturing the essentials of IMU data as they are relevant for activity recognition, which increases the number of trainable parameters by a factor of 1100 compared to state-of-the-art model architectures. We evaluate the new model architecture on the challenging task of analyzing free-weight gym exercises, specifically classifying 13 dumbbell exercises. We collected around 41 h of virtual IMU data using IMUTube from exercise videos available on YouTube. The proposed model is trained with this large amount of virtual IMU data and calibrated with a mere 36 min of real IMU data. The trained model was evaluated on a real IMU dataset, demonstrating a substantial performance improvement of 20% absolute F1 score over state-of-the-art convolutional models in HAR.
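The training recipe described above, pretraining on abundant virtual IMU data and then calibrating on a small real IMU set, can be sketched as a two-stage schedule. The following is a minimal illustration with a linear softmax classifier on synthetic features; the paper's actual model is a far larger deep architecture, and all data, shapes, and function names here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train(W, X, y, lr, epochs):
    """Plain gradient descent on cross-entropy for a linear classifier."""
    Y = np.eye(W.shape[1])[y]
    for _ in range(epochs):
        P = softmax(X @ W)
        W -= lr * X.T @ (P - Y) / len(X)
    return W

n_feat, n_cls = 16, 13  # e.g. 13 dumbbell exercise classes
W_true = rng.normal(size=(n_feat, n_cls))  # synthetic labeling rule

# Stage 1: pretrain on abundant "virtual IMU" features.
Xv = rng.normal(size=(5000, n_feat))
yv = (Xv @ W_true).argmax(axis=1)
W = train(np.zeros((n_feat, n_cls)), Xv, yv, lr=0.5, epochs=200)

# Stage 2: calibrate on a small "real IMU" set with a lower learning rate,
# mirroring the 41 h virtual / 36 min real split in the abstract.
Xr = rng.normal(size=(100, n_feat))
yr = (Xr @ W_true).argmax(axis=1)
W = train(W, Xr, yr, lr=0.05, epochs=50)

acc = (softmax(Xr @ W).argmax(axis=1) == yr).mean()
```

The design point illustrated is only the schedule: most gradient steps come from the cheap virtual data, and the scarce real data is used with a smaller learning rate to adapt the pretrained weights rather than train from scratch.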
The lack of large-scale, labeled datasets impedes progress in developing robust and generalized predictive models for on-body sensor-based human activity recognition (HAR). Labeled data in human activity recognition is scarce and hard to come by, as sensor data collection is expensive and annotation is time-consuming and error-prone. To address this problem, we introduce IMUTube, an automated processing pipeline that integrates existing computer vision and signal processing techniques to convert videos of human activity into virtual streams of IMU data. These virtual IMU streams represent accelerometry at a wide variety of locations on the human body. We show how the virtually generated IMU data improves the performance of a variety of models on known HAR datasets. Our initial results are very promising, but the greater promise of this work lies in a collective approach by the computer vision, signal processing, and activity recognition communities to extend this work in ways that we outline. This should lead to on-body, sensor-based HAR becoming yet another success story in large-dataset breakthroughs in recognition.
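The core signal-processing idea behind a pipeline like this, deriving virtual accelerometry from tracked 3D joint positions, can be sketched with a second-order finite difference. This is a simplified illustration only; the actual IMUTube pipeline additionally handles camera motion, sensor orientation, and noise characteristics, and the function name here is hypothetical:

```python
import numpy as np

def virtual_accelerometry(positions, fs):
    """Approximate linear acceleration (m/s^2) at a body location from a
    (T, 3) array of 3D joint positions sampled at fs Hz, using a
    second-order central finite difference: a[t] ~ (p[t+1] - 2p[t] + p[t-1]) / dt^2."""
    dt = 1.0 / fs
    return (positions[2:] - 2 * positions[1:-1] + positions[:-2]) / dt**2

# Toy check: constant acceleration along z, p(t) = 0.5 * a * t^2.
fs = 100.0
t = np.arange(0, 1, 1 / fs)
a_true = np.array([0.0, 0.0, 9.81])
pos = 0.5 * a_true * t[:, None] ** 2
acc = virtual_accelerometry(pos, fs)  # shape (T-2, 3), recovers a_true exactly
```

For an exactly quadratic trajectory the central difference recovers the true acceleration without error, which makes it a convenient sanity check before applying the same operator to noisy pose tracks.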
Freezing of gait (FOG) is a poorly understood heterogeneous gait disorder seen in patients with parkinsonism, which contributes to significant morbidity and social isolation. FOG is currently measured with scales that are typically administered by movement disorders specialists (i.e., MDS-UPDRS), or through patient-completed questionnaires (N-FOG-Q), both of which are inadequate in addressing the heterogeneous nature of the disorder and are unsuitable for use in clinical trials. The purpose of this study was to devise a method to measure FOG objectively, hence improving our ability to identify it and accurately evaluate new therapies. A major innovation of our study is that it is the first of its kind to use the largest sample size (>30 h, N = 57) to apply explainable, multi-task deep learning models for quantifying FOG over the course of the medication cycle and at varying levels of parkinsonism severity. We trained interpretable deep learning models with multi-task learning to simultaneously score FOG (cross-validated F1 score 97.6%), identify medication state (OFF vs. ON levodopa; cross-validated F1 score 96.8%), and measure total PD severity (MDS-UPDRS-III score prediction error ≤ 2.7 points) using kinematic data of a well-characterized sample of N = 57 patients during levodopa challenge tests. The proposed model was able to explain how kinematic movements are associated with each FOG severity level, in a manner highly consistent with the features that movement disorders specialists are trained to identify as characteristic of freezing. Overall, we demonstrate that deep learning models’ capability to capture complex movement patterns in kinematic data can automatically and objectively score FOG with high accuracy. These models have the potential to discover novel kinematic biomarkers for FOG that can be used for hypothesis generation and potentially as clinical trial outcome measures.
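The multi-task setup described above, a shared representation feeding separate heads for FOG scoring, medication state, and MDS-UPDRS-III prediction, can be sketched as one forward pass with a combined loss. This is a minimal numpy illustration, not the study's interpretable deep architecture; the batch size, feature dimensions, and class counts below are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def cross_entropy(logits, y):
    """Mean negative log-likelihood of integer labels y under softmax(logits)."""
    z = logits - logits.max(axis=1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(y)), y].mean()

# Toy kinematic feature windows: batch of 8, feature dim 32 (assumed).
x = rng.normal(size=(8, 32))

# Shared representation (one hidden ReLU layer).
W_shared = rng.normal(size=(32, 64)) * 0.1
h = np.maximum(x @ W_shared, 0.0)

# Three task heads on the shared features.
W_fog = rng.normal(size=(64, 4)) * 0.1    # FOG severity classes (assumed 4)
W_med = rng.normal(size=(64, 2)) * 0.1    # OFF vs. ON levodopa
W_updrs = rng.normal(size=(64, 1)) * 0.1  # MDS-UPDRS-III regression

y_fog = rng.integers(0, 4, size=8)
y_med = rng.integers(0, 2, size=8)
y_updrs = rng.normal(loc=40, scale=10, size=(8, 1))

# Multi-task objective: sum of the per-head losses (equal weights assumed).
loss = (cross_entropy(h @ W_fog, y_fog)
        + cross_entropy(h @ W_med, y_med)
        + np.mean((h @ W_updrs - y_updrs) ** 2))
```

The point of sharing the trunk is that gradients from all three clinically related tasks shape the same kinematic representation, which is what allows one model to score FOG, detect medication state, and estimate severity simultaneously.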
A comprehensive understanding of collocated social interactions can help campuses and organizations better support their community. Universities could determine new ways to conduct classes and design programs by studying how students have collocated in the past. However, this needs data that describe large groups over a long period. Harnessing user devices to infer collocation, while tempting, is challenged by privacy concerns, power consumption, and maintenance issues. Alternatively, embedding new sensors across the entire campus is expensive. Instead, we investigate an easily accessible data source that can retroactively depict multiple users on campus over a semester: a managed WiFi network. Despite the coarse approximations of collocation provided by WiFi network logs, we demonstrate that leveraging such data can express meaningful outcomes of collocated social interaction. Since a known outcome of collocating with peers is improved performance, we inspected whether automatically inferred collocation behaviors can indicate the individual performance of project group members on a campus. We studied 163 students (in 54 project groups) over 14 weeks. After describing how we determine collocation with the WiFi logs, we present a study to analyze how collocation within groups relates to a student’s final score. We found that modeling collocation behaviors showed a significant correlation (Pearson's r = 0.24) with performance (better than models of peer feedback or individual behaviors). These findings emphasize that it is feasible and valuable to characterize collocated social interactions with archived WiFi network logs. We conclude the paper with a discussion of applications for repurposing WiFi logs to describe collocation, along with privacy considerations, and directions for future work.
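The headline statistic above, a Pearson correlation between a collocation-derived feature and final scores, is computed as in the sketch below. The numbers are synthetic and purely illustrative; the study's actual features are modeled from archived WiFi logs:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    x = x - x.mean()
    y = y - y.mean()
    return float((x @ y) / np.sqrt((x @ x) * (y @ y)))

# Synthetic example sized like the study: 163 students, one
# collocation feature weakly related to the final score.
rng = np.random.default_rng(7)
collocation = rng.normal(size=163)
final_score = 0.25 * collocation + rng.normal(size=163)
r = pearson_r(collocation, final_score)  # a small positive r, by construction
```

A value of r = 0.24, as reported, is a modest but meaningful effect for behavioral data of this kind, which is why the comparison against peer-feedback and individual-behavior baselines matters.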