SUMMARY Interrater variability of sleep stage scorings has an essential impact not only on the reading of polysomnographic sleep studies (PSGs) for clinical trials but also on the evaluation of patients' sleep. With the introduction of a new standard for sleep stage scorings (AASM standard) there is a need for studies on interrater reliability (IRR). The SIESTA database resulting from an EU-funded project provides a large number of studies (n = 72; 56 healthy controls and 16 subjects with different sleep disorders, mean age ± SD: 57.7 ± 18.7, 34 females) for which scorings according to both standards (AASM and R&K) were done. Differences in IRR were analysed at two levels: (1) based on quantitative sleep parameters by means of intraclass correlations; and (2) based on an epoch-by-epoch comparison by means of Cohen's kappa and Fleiss' kappa. Overall agreement was 82.0% (Cohen's kappa = 0.76) for the AASM standard and 80.6% (Cohen's kappa = 0.68) for the R&K standard. Agreement increased from R&K to AASM for all sleep stages except N2. The results of this study underline that the modification of the scoring rules improves IRR as a result of the integration of occipital, central and frontal leads on the one hand, but reduces IRR on the other hand specifically for N2, due to the new rule that cortical arousals with or without concurrent increase in submental electromyogram are critical events for the end of N2.
Keywords: AASM scoring standard, interrater reliability, Rechtschaffen and Kales, SIESTA project, sleep stage scoring
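As an illustration of the epoch-by-epoch comparison mentioned above, the following is a minimal sketch of how percent agreement and Cohen's kappa can be computed from two scorers' stage labels. The function name `cohen_kappa` and the ten-epoch hypnograms are hypothetical and are not taken from the SIESTA data; the study's own software is not described in the abstract.

```python
from collections import Counter


def cohen_kappa(scorer_a, scorer_b):
    """Return (observed agreement, Cohen's kappa) for two label sequences.

    Each sequence holds one sleep stage label per epoch, in the same order.
    """
    if len(scorer_a) != len(scorer_b) or len(scorer_a) == 0:
        raise ValueError("both scorings must cover the same, non-empty epochs")

    n = len(scorer_a)
    # Observed agreement: fraction of epochs given the same stage by both.
    p_o = sum(a == b for a, b in zip(scorer_a, scorer_b)) / n

    # Chance agreement: product of the two scorers' marginal stage
    # frequencies, summed over stages.
    count_a = Counter(scorer_a)
    count_b = Counter(scorer_b)
    p_e = sum(count_a[stage] * count_b[stage] for stage in count_a) / (n * n)

    if p_e == 1.0:
        # Both scorers used one identical stage throughout; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")

    return p_o, (p_o - p_e) / (1.0 - p_e)


# Hypothetical ten-epoch hypnograms (illustrative only).
rater_1 = ["W", "N1", "N2", "N2", "N3", "N3", "N2", "R", "R", "W"]
rater_2 = ["W", "N2", "N2", "N2", "N3", "N2", "N2", "R", "R", "W"]

agreement, kappa = cohen_kappa(rater_1, rater_2)
print(f"agreement = {agreement:.1%}, kappa = {kappa:.2f}")
```

Kappa falls below the raw agreement because it subtracts the agreement expected by chance from the scorers' stage frequencies, which is why the abstract reports both figures.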
To date, the only standard for the classification of sleep-EEG recordings that has found worldwide acceptance is the set of rules published in 1968 by Rechtschaffen and Kales. Even though several attempts have been made to automate the classification process, so far no method has been published that has proven its validity in a study including a sufficiently large number of controls and patients of all adult age ranges. The present paper describes the development and optimization of an automatic classification system that is based on one central EEG channel, two EOG channels and one chin EMG channel. It adheres to the decision rules for visual scoring as closely as possible and includes a structured quality control procedure by a human expert. The final system (Somnolyzer 24 × 7™) consists of a raw data quality check, a feature extraction algorithm (density and intensity of sleep/wake-related patterns such as sleep spindles, delta waves, SEMs and REMs), a feature matrix plausibility check, a classifier designed as an expert system, a rule-based smoothing procedure for the start and the end of stages REM, and finally a statistical comparison to age- and sex-matched normal healthy controls (Siesta Spot Report™). The expert system considers different prior probabilities of stage changes depending on the preceding sleep stage, the occurrence of a movement arousal and the position of the epoch within the NREM/REM sleep cycles. Moreover, results obtained with and without using the chin EMG signal are combined. The Siesta polysomnographic database (590 recordings from both normal healthy subjects aged 20–95 years and patients suffering from organic or nonorganic sleep disorders) was split into two halves, which were randomly assigned to a training and a validation set, respectively.
The final validation revealed an overall epoch-by-epoch agreement of 80% (Cohen’s kappa: 0.72) between the Somnolyzer 24 × 7 and the human expert scoring, as compared with an inter-rater reliability of 77% (Cohen’s kappa: 0.68) between two human experts scoring the same dataset. Two Somnolyzer 24 × 7 analyses (including a structured quality control by two human experts) revealed an inter-rater reliability close to 1 (Cohen’s kappa: 0.991), which confirmed that the variability induced by the quality control procedure, whereby approximately 1% of the epochs (in 9.5% of the recordings) are changed, can definitely be neglected. Thus, the validation study proved the high reliability and validity of the Somnolyzer 24 × 7 and demonstrated its applicability in clinical routine and sleep studies.
The study shows significant and age-dependent differences between sleep parameters derived from conventional visual sleep scorings on the basis of R&K rules and those based on the new AASM rules. Thus, new normative data have to be established for the AASM standard.
SUMMARY Interrater variability of sleep stage scorings is a well-known phenomenon. The SIESTA project offered the opportunity to analyse interrater reliability (IRR) between experienced scorers from eight European sleep laboratories within a large sample of patients with different (sleep) disorders: depression, generalized anxiety disorder with and without non-organic insomnia, Parkinson's disease, periodic limb movements in sleep and sleep apnoea. The results were based on 196 recordings from 98 patients (73 males: 52.3 ± 12.1 years and 25 females: 49.5 ± 11.9 years) for which two independent expert scorings from two different laboratories were available. Cohen's κ was used to evaluate the IRR on the basis of epochs and intraclass correlation was used to analyse the agreement on quantitative sleep parameters. The overall level of agreement when five different stages were distinguished was κ = 0.6816 (76.8%), which in terms of κ reflects a 'substantial' agreement (Landis and Koch, 1977). For different groups of patients κ values varied from 0.6138 (Parkinson's disease) to 0.8176 (generalized anxiety disorder). With regard to (sleep) stages, the IRR was highest for rapid eye movement (REM), followed by Wake, slow-wave sleep (SWS), non-rapid eye movement 2 (NREM2) and NREM1. The results of regression analysis showed that age and sex only had a statistically significant effect on κ when the (sleep) stages were considered separately. For NREM2 and SWS a statistically significant decrease of IRR with age was observed and the IRR for SWS was lower for males than for females. These variations of IRR most probably reflect changes of the sleep electroencephalogram (EEG) with age and gender.
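The 'substantial' label cited above comes from the benchmark bands proposed by Landis and Koch (1977). The short sketch below maps a kappa value to those bands and applies it to the three values reported in the abstract; the function name `landis_koch_label` is hypothetical, and the bands are a descriptive convention rather than a statistical test.

```python
def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch (1977) agreement bands."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Kappa values quoted in the abstract above.
reported = {
    "overall (five stages)": 0.6816,
    "Parkinson's disease": 0.6138,
    "generalized anxiety disorder": 0.8176,
}

for group, kappa in reported.items():
    print(f"{group}: kappa = {kappa:.4f} -> {landis_koch_label(kappa)}")
```

Under these bands the overall value and the Parkinson's disease value both fall in the 'substantial' range, while the generalized anxiety disorder value falls just inside 'almost perfect'.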