Abstract: Despite the increasing awareness of the importance of sleep, the number of people suffering from insufficient sleep has increased every year. The gold-standard sleep assessment uses polysomnography (PSG) with various sensors to identify sleep patterns and disorders. However, due to the high cost of PSG and limited availability, many people with sleep disorders are left undiagnosed. Recent wearable sensors and electronics enable portable, continuous monitoring of sleep at home, overcoming the limitations of PSG…
“…The results of these experiments clearly show that, compared to using just raw signal, converting the signal to multi-taper spectrogram as the input data provides not only comparable or higher classification performance within the public dataset but also superior transferability of the trained model for the classification of another dataset. The average inter-scorer agreement on standard PSG data was usually reported between 82% and 89% [3]. In agreement with this reported value, the average agreement between the two expert scorers of the ISRUC dataset was calculated to be 82.00%, with a Cohen's kappa value of 0.766.…”
Section: Performance Comparison With Other Work (mentioning)
confidence: 55%
“…To evaluate the performance of the sleep stage classification, there are multiple performance metrics being used in the field, including sensitivity, specificity, and F-measure. Among these metrics, the accuracy rate and Cohen's kappa coefficient are the most commonly used metrics [3], so these metrics are presented and used for comparison in this table. Most of the existing works focused on analyzing public sleep datasets, except for a few cases.…”
Section: Performance Comparison With Other Work (mentioning)
confidence: 99%
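The two metrics named in the statement above, accuracy and Cohen's kappa, can be computed from a pair of epoch-by-epoch label sequences. The following is a minimal sketch in plain Python, not the cited authors' code; the function names and the ten-epoch label sequences are hypothetical and serve only as an illustration.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of epochs on which the two label sequences agree."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement; p_e is the agreement expected by
    chance from each sequence's own label frequencies.
    """
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Degenerate case: both sequences contain one identical label.
        # Kappa is formally undefined here; returning 1.0 is a choice
        # made for this sketch only.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels for ten epochs (illustration only, not real data).
scorer_a = ["W", "N1", "N2", "N2", "N3", "REM", "N2", "W", "REM", "N3"]
scorer_b = ["W", "N2", "N2", "N2", "N3", "REM", "N1", "W", "REM", "N3"]

print(f"accuracy = {accuracy(scorer_a, scorer_b):.3f}")
print(f"kappa    = {cohen_kappa(scorer_a, scorer_b):.3f}")
```

Kappa is lower than raw accuracy because it discounts the agreement two scorers would reach by chance given how often each stage occurs, which is why the two figures are usually reported together.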
“…An accurate sleep stage classification [1][2][3] Compared to the manual scoring of sleep stages, automatic sleep stage classification serves as a more efficient way to evaluate a large amount of sleep data. Machine learning algorithms have been adopted in automatic sleep stage classification to increase classification efficiency and performance in recent years.…”
Sleep stage classification is an essential process of diagnosing sleep disorders and related diseases. Automatic sleep stage classification using machine learning has been widely studied due to its higher efficiency compared with manual scoring. Typically, a few polysomnography data are selected as input signals, and human experts label the corresponding sleep stages manually. However, the manual process includes human error and inconsistency in the scoring and stage classification. Here, we present a convolutional neural network (CNN)-based classification method that offers highly accurate, automatic sleep stage detection, validated by a public dataset and new data measured by wearable nanomembrane dry electrodes. First, our study makes a training and validation model using a public dataset with two brain signal and two eye signal channels. Then, we validate this model with a new dataset measured by a set of nanomembrane electrodes. The result of the automatic sleep stage classification shows that our CNN model with multi-taper spectrogram pre-processing achieved 88.85% training accuracy on the validation dataset and 81.52% prediction accuracy on our laboratory dataset. These results validate the reliability of our classification method on the standard polysomnography dataset and the transferability of our CNN model for other datasets measured with the wearable electrodes.
“…There is a vast body of literature on sleep tracking solutions (surveyed in [41,56,63,67,84]). Our core contribution is the development and evaluation of an unobtrusive all-textile sleep monitoring solution that can capture all signals of interest to sleep.…”
Section: State-of-the-art In Sleep Tracking (mentioning)
Clinical-grade wearable sleep monitoring is a challenging problem since it requires concurrently monitoring brain activity, eye movement, muscle activity, cardio-respiratory features, and gross body movements. This requires multiple sensors to be worn at different locations as well as uncomfortable adhesives and discrete electronic components to be placed on the head. As a result, existing wearables either compromise comfort or compromise accuracy in tracking sleep variables. We propose PhyMask, an all-textile sleep monitoring solution that is practical and comfortable for continuous use and that acquires all signals of interest to sleep solely using comfortable textile sensors placed on the head. We show that PhyMask can be used to accurately measure all the signals required for precise sleep stage tracking and to extract advanced sleep markers such as spindles and K-complexes robustly in the real-world setting. We validate PhyMask against polysomnography and show that it significantly outperforms two commercially-available sleep tracking wearables – Fitbit and Oura Ring.
“…[25] Earlier consumer wearables had relatively poor and variable performance, but newer devices have shown greater accuracy and consistency of sleep assessment which continue to improve. [2, 5, 26, 27] To guide their own usage of collected data, most users, clinicians, and researchers, are interested in improvements of two major areas, the detection of sleep and wake, from which bedtime, waketime, wake after sleep onset, total sleep time and sleep efficiency are derived, and secondly, the accuracy and consistency of sleep staging, particularly of slow wave sleep and REM. Researchers intending to use consumer wearables are additionally concerned about whether their collected data can benefit from continued refinements to data collection or processing that may help “future-proof” legacy data.…”
Purpose
To evaluate the benefits of applying an improved sleep detection and staging algorithm on minimally processed multi-sensor wearable data collected from older generation hardware.
Patients and Methods
58 healthy, East Asian adults aged 23–69 years (M = 37.10, SD = 13.03, 32 males), each underwent 3 nights of PSG at home, wearing 2nd Generation Oura Rings equipped with additional memory to store raw data from accelerometer, infra-red photoplethysmography and temperature sensors. 2-stage and 4-stage sleep classifications using a new machine-learning algorithm (Gen3) trained on a diverse and independent dataset were compared to the existing consumer algorithm (Gen2) for whole-night and epoch-by-epoch metrics.
Results
Gen 3 outperformed its predecessor with a mean (SD) accuracy of 92.6% (0.04), sensitivity of 94.9% (0.03), and specificity of 78.5% (0.11); corresponding to a 3%, 2.8% and 6.2% improvement from Gen2 across the three nights, with Cohen’s d values >0.39, t values >2.69, and p values <0.01. Notably, Gen 3 showed robust performance comparable to PSG in its assessment of sleep latency, light sleep, rapid eye movement (REM), and wake after sleep onset (WASO) duration. Participants <40 years of age benefited more from the upgrade with less measurement bias for total sleep time (TST), WASO, light sleep and sleep efficiency compared to those ≥40 years. Males showed greater improvements on TST and REM sleep measurement bias compared to females, while females benefitted more for deep sleep measures compared to males.
Conclusion
These results affirm the benefits of applying machine learning and a diverse training dataset to improve sleep measurement of a consumer wearable device. Importantly, collecting raw data with appropriate hardware allows for future advancements in algorithm development or sleep physiology to be retrospectively applied to enhance the value of longitudinal sleep studies.
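The epoch-by-epoch accuracy, sensitivity, and specificity reported in the abstract above can be illustrated for 2-stage (sleep/wake) classification as follows. This is a sketch under stated assumptions, not the study's evaluation code: it assumes "sleep" is treated as the positive class, which the excerpt does not state, and the function name and the ten-epoch sequences are hypothetical.

```python
def sleep_wake_metrics(reference, predicted, positive="sleep"):
    """Epoch-by-epoch accuracy, sensitivity, and specificity.

    `reference` holds the PSG-derived labels and `predicted` the device
    labels. Treating "sleep" as the positive class is an assumption of
    this sketch.
    """
    if len(reference) != len(predicted) or not reference:
        raise ValueError("label sequences must be non-empty and of equal length")

    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        if ref == positive and pred == positive:
            tp += 1  # sleep epoch correctly detected as sleep
        elif ref == positive:
            fn += 1  # sleep epoch missed (scored as wake)
        elif pred == positive:
            fp += 1  # wake epoch wrongly scored as sleep
        else:
            tn += 1  # wake epoch correctly detected as wake

    def ratio(num, den):
        # Undefined ratios (no epochs of that class) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical ten-epoch example (illustration only, not study data).
psg = ["wake", "sleep", "sleep", "sleep", "wake",
       "sleep", "sleep", "wake", "sleep", "sleep"]
ring = ["wake", "sleep", "sleep", "wake", "sleep",
        "sleep", "sleep", "wake", "sleep", "sleep"]

metrics = sleep_wake_metrics(psg, ring)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because most epochs in a night are sleep, specificity (wake detection) is computed from far fewer epochs than sensitivity, which is one reason it tends to be the lower and more variable of the two figures.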