2022
DOI: 10.5664/jcsm.9538
Interrater reliability of sleep stage scoring: a meta-analysis

Cited by 74 publications (66 citation statements)
References 41 publications
“…When interpreting the performance of automated sleep staging algorithms, it is important to keep in mind that manual scoring by humans is highly subjective [27]. Inter-rater agreement for PSG labeled by human scorers is reported as a κ of 0.76 (95% confidence interval, 0.71–0.81) [28] for 5-class sleep staging. Common mistakes between human scorers during PSG include confusion between wake and light sleep, and between light and deep sleep [29].…”
Section: Discussion
confidence: 99%
“…An agreement of 85% between sleep stage scorers is acceptable; one should start to worry if the agreement drops below 70%. As a side note, these numbers are a simplified estimate, because quantitative reliability studies use many different methods, such as Cohen’s κ, Fleiss’ κ, the Pearson product-moment correlation coefficient, and the intraclass correlation coefficient (ICC), which reflect different statistical properties of the disagreements [ 3 , 6 ]. Making use of algorithms and software approaches as used for cancer diagnosis (e.g.…”
Section: The Problem
confidence: 99%
“…Currently, rules for sleep staging are laid out in the AASM manual version 2.6, which provides definitions for sleep stages and for the most commonly observed events related to the most prevalent sleep disorders [ 1 ]. We recognize the high variability in sleep scoring results achieved by expert sleep scorers [ 2 , 3 ]. Visual sleep scoring nevertheless remains a valuable task, because we may observe unexpected events during sleep, which teaches us much about the abnormalities observed during sleep.…”
Section: Introduction
confidence: 99%
“…Interestingly, while this is often cited as an area of concern when using wearable technology, it is rarely discussed as a concern when using statistical analysis software or other proprietary software systems commonly used in research that could influence the interpretation of results. Similarly, inter-rater reliability of PSG scoring remains an issue even among highly trained technicians [ 58 ], which raises the question of the intrinsic accuracy of any interpretation of physiological signal measurements.…”
Section: Trust
confidence: 99%