Supervised training of human activity recognition (HAR) systems based on body-worn inertial measurement units (IMUs) is often constrained by the typically small amounts of labeled sample data. Systems such as IMUTube employ cross-modality transfer approaches to convert videos of activities of interest into virtual IMU data. We demonstrate for the first time how such large-scale virtual IMU datasets can be used to train HAR systems that are substantially more complex than the state of the art, where complexity is measured by the number of model parameters that can be trained robustly. Our models contain components dedicated to capturing the essentials of IMU data as they are relevant for activity recognition, which increases the number of trainable parameters by a factor of 1100 compared to state-of-the-art model architectures. We evaluate the new model architecture on the challenging task of analyzing free-weight gym exercises, specifically classifying 13 dumbbell exercises. We collected around 41 h of virtual IMU data using IMUTube from exercise videos available on YouTube. The proposed model is trained with this large amount of virtual IMU data and calibrated with a mere 36 min of real IMU data. The trained model was evaluated on a real IMU dataset, demonstrating a substantial performance improvement of 20% absolute F1 score over state-of-the-art convolutional models in HAR.
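The training recipe described above, pretraining on abundant virtual IMU data and then calibrating on a small real IMU set, can be sketched as a two-stage schedule. The following is a minimal illustration with a linear softmax classifier on synthetic features; the paper's actual model is a far larger deep architecture, and all data, shapes, and function names here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train(W, X, y, lr, epochs):
    """Plain gradient descent on cross-entropy for a linear classifier."""
    Y = np.eye(W.shape[1])[y]
    for _ in range(epochs):
        P = softmax(X @ W)
        W -= lr * X.T @ (P - Y) / len(X)
    return W

n_feat, n_cls = 16, 13  # e.g. 13 dumbbell exercise classes
W_true = rng.normal(size=(n_feat, n_cls))  # synthetic labeling rule

# Stage 1: pretrain on abundant "virtual IMU" features.
Xv = rng.normal(size=(5000, n_feat))
yv = (Xv @ W_true).argmax(axis=1)
W = train(np.zeros((n_feat, n_cls)), Xv, yv, lr=0.5, epochs=200)

# Stage 2: calibrate on a small "real IMU" set with a lower learning rate,
# mirroring the 41 h virtual / 36 min real split in the abstract.
Xr = rng.normal(size=(100, n_feat))
yr = (Xr @ W_true).argmax(axis=1)
W = train(W, Xr, yr, lr=0.05, epochs=50)

acc = (softmax(Xr @ W).argmax(axis=1) == yr).mean()
```

The design point illustrated is only the schedule: most gradient steps come from the cheap virtual data, and the scarce real data is used with a smaller learning rate to adapt the pretrained weights rather than train from scratch.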
The lack of large-scale, labeled datasets impedes progress in developing robust and generalized predictive models for on-body sensor-based human activity recognition (HAR). Labeled data in human activity recognition is scarce and hard to come by, as sensor data collection is expensive and annotation is time-consuming and error-prone. To address this problem, we introduce IMUTube, an automated processing pipeline that integrates existing computer vision and signal processing techniques to convert videos of human activity into virtual streams of IMU data. These virtual IMU streams represent accelerometry at a wide variety of locations on the human body. We show how the virtually generated IMU data improves the performance of a variety of models on known HAR datasets. Our initial results are very promising, but the greater promise of this work lies in a collective approach by the computer vision, signal processing, and activity recognition communities to extend this work in ways that we outline. This should lead to on-body, sensor-based HAR becoming yet another success story in large-dataset breakthroughs in recognition.
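The core signal-processing idea behind a pipeline like this, deriving virtual accelerometry from tracked 3D joint positions, can be sketched with a second-order finite difference. This is a simplified illustration only; the actual IMUTube pipeline additionally handles camera motion, sensor orientation, and noise characteristics, and the function name here is hypothetical:

```python
import numpy as np

def virtual_accelerometry(positions, fs):
    """Approximate linear acceleration (m/s^2) at a body location from a
    (T, 3) array of 3D joint positions sampled at fs Hz, using a
    second-order central finite difference: a[t] ~ (p[t+1] - 2p[t] + p[t-1]) / dt^2."""
    dt = 1.0 / fs
    return (positions[2:] - 2 * positions[1:-1] + positions[:-2]) / dt**2

# Toy check: constant acceleration along z, p(t) = 0.5 * a * t^2.
fs = 100.0
t = np.arange(0, 1, 1 / fs)
a_true = np.array([0.0, 0.0, 9.81])
pos = 0.5 * a_true * t[:, None] ** 2
acc = virtual_accelerometry(pos, fs)  # shape (T-2, 3), recovers a_true exactly
```

For an exactly quadratic trajectory the central difference recovers the true acceleration without error, which makes it a convenient sanity check before applying the same operator to noisy pose tracks.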
Freezing of gait (FOG) is a poorly understood heterogeneous gait disorder seen in patients with parkinsonism, which contributes to significant morbidity and social isolation. FOG is currently measured with scales that are typically administered by movement disorders specialists (i.e., MDS-UPDRS), or through patient-completed questionnaires (N-FOG-Q), both of which are inadequate in addressing the heterogeneous nature of the disorder and are unsuitable for use in clinical trials. The purpose of this study was to devise a method to measure FOG objectively, hence improving our ability to identify it and accurately evaluate new therapies. A major innovation of our study is that it is the first of its kind to use the largest sample size (>30 h, N = 57) to apply explainable, multi-task deep learning models for quantifying FOG over the course of the medication cycle and at varying levels of parkinsonism severity. We trained interpretable deep learning models with multi-task learning to simultaneously score FOG (cross-validated F1 score 97.6%), identify medication state (OFF vs. ON levodopa; cross-validated F1 score 96.8%), and measure total PD severity (MDS-UPDRS-III score prediction error ≤ 2.7 points) using kinematic data of a well-characterized sample of N = 57 patients during levodopa challenge tests. The proposed model was able to explain how kinematic movements are associated with each FOG severity level, in a manner highly consistent with the features that movement disorders specialists are trained to identify as characteristic of freezing. Overall, we demonstrate that deep learning models’ capability to capture complex movement patterns in kinematic data can automatically and objectively score FOG with high accuracy. These models have the potential to discover novel kinematic biomarkers for FOG that can be used for hypothesis generation and potentially as clinical trial outcome measures.
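The multi-task setup described above, a shared representation feeding separate heads for FOG scoring, medication state, and MDS-UPDRS-III prediction, can be sketched as one forward pass with a combined loss. This is a minimal numpy illustration, not the study's interpretable deep architecture; the batch size, feature dimensions, and class counts below are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def cross_entropy(logits, y):
    """Mean negative log-likelihood of integer labels y under softmax(logits)."""
    z = logits - logits.max(axis=1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(y)), y].mean()

# Toy kinematic feature windows: batch of 8, feature dim 32 (assumed).
x = rng.normal(size=(8, 32))

# Shared representation (one hidden ReLU layer).
W_shared = rng.normal(size=(32, 64)) * 0.1
h = np.maximum(x @ W_shared, 0.0)

# Three task heads on the shared features.
W_fog = rng.normal(size=(64, 4)) * 0.1    # FOG severity classes (assumed 4)
W_med = rng.normal(size=(64, 2)) * 0.1    # OFF vs. ON levodopa
W_updrs = rng.normal(size=(64, 1)) * 0.1  # MDS-UPDRS-III regression

y_fog = rng.integers(0, 4, size=8)
y_med = rng.integers(0, 2, size=8)
y_updrs = rng.normal(loc=40, scale=10, size=(8, 1))

# Multi-task objective: sum of the per-head losses (equal weights assumed).
loss = (cross_entropy(h @ W_fog, y_fog)
        + cross_entropy(h @ W_med, y_med)
        + np.mean((h @ W_updrs - y_updrs) ** 2))
```

The point of sharing the trunk is that gradients from all three clinically related tasks shape the same kinematic representation, which is what allows one model to score FOG, detect medication state, and estimate severity simultaneously.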
A comprehensive understanding of collocated social interactions can help campuses and organizations better support their community. Universities could determine new ways to conduct classes and design programs by studying how students have collocated in the past. However, this needs data that describe large groups over a long period. Harnessing user devices to infer collocation, while tempting, is challenged by privacy concerns, power consumption, and maintenance issues. Alternatively, embedding new sensors across the entire campus is expensive. Instead, we investigate an easily accessible data source that can retroactively depict multiple users on campus over a semester: a managed WiFi network. Despite the coarse approximations of collocation provided by WiFi network logs, we demonstrate that leveraging such data can express meaningful outcomes of collocated social interaction. Since a known outcome of collocating with peers is improved performance, we inspected whether automatically inferred collocation behaviors can indicate the individual performance of project group members on a campus. We studied 163 students (in 54 project groups) over 14 weeks. After describing how we determine collocation with the WiFi logs, we present a study to analyze how collocation within groups relates to a student’s final score. We found that modeling collocation behaviors showed a significant correlation (Pearson's r = 0.24) with performance (better than models of peer feedback or individual behaviors). These findings emphasize that it is feasible and valuable to characterize collocated social interactions with archived WiFi network logs. We conclude the paper with a discussion of applications for repurposing WiFi logs to describe collocation, along with privacy considerations, and directions for future work.
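The headline statistic above, a Pearson correlation between a collocation-derived feature and final scores, is computed as in the sketch below. The numbers are synthetic and purely illustrative; the study's actual features are modeled from archived WiFi logs:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    x = x - x.mean()
    y = y - y.mean()
    return float((x @ y) / np.sqrt((x @ x) * (y @ y)))

# Synthetic example sized like the study: 163 students, one
# collocation feature weakly related to the final score.
rng = np.random.default_rng(7)
collocation = rng.normal(size=163)
final_score = 0.25 * collocation + rng.normal(size=163)
r = pearson_r(collocation, final_score)  # a small positive r, by construction
```

A value of r = 0.24, as reported, is a modest but meaningful effect for behavioral data of this kind, which is why the comparison against peer-feedback and individual-behavior baselines matters.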