Objective: To determine whether information in medical and pharmacy claims data can predict, at the time of prescribing the first antiepileptic drug (AED), which patients with epilepsy will become resistant to AEDs. Method: We analyzed longitudinal claims data from 1,376,756 patients with epilepsy from 2006 to 2015. Of these, 582,258 satisfied all inclusion criteria; 49,916 were ultimately AED resistant, operationally defined as a patient with claims filed for at least 4 distinct AEDs. We constructed 1,270 candidate predictors (“features”) reflecting demographics, comorbidities, medications, procedures, epilepsy status, and payer status to characterize the cohort. On the training dataset (528,640 patients) we performed ANOVA F-value tests to select predictive features and trained several prediction algorithms, including logistic regression, support vector machines (SVM), and random forests. A model with only age and gender was used as a benchmark model. Results: On a held-out test set (53,618 patients), the best model achieves an area under the receiver operating characteristic (ROC) curve (AUC) [95% CI] of 0.753 [0.747, 0.759], compared to 0.664 [0.658, 0.671] for the benchmark model. Moreover, predicted probabilities for drug resistance closely match the observed frequencies. Compared to waiting for 2 AED failures, our model predicts drug resistance on average 2.25 years earlier. Conclusion: Predictive models created from large claims data using machine learning methods can accurately predict which patients with epilepsy will prove drug resistant at the time of prescribing the first AED. The ability to predict refractoriness may help patients consider alternative therapies earlier in the course of their epilepsy.
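The AUC comparison reported above can be illustrated with a minimal sketch in plain Python. It computes AUC as the probability that a randomly chosen positive case is scored higher than a randomly chosen negative one. The percentile-bootstrap interval is an assumption, since the abstract does not say how its 95% CIs were obtained. The function names and the toy labels and scores are hypothetical.

```python
import random

def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, not the paper's)."""
    roc_auc(labels, scores)  # validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lack one of the classes
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lo = aucs[int((alpha / 2) * n_boot)]
    hi = aucs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi

# Toy data: 1 = drug resistant, 0 = not; scores are predicted probabilities.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.50, 0.60]
auc = roc_auc(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores, n_boot=200)
print(f"AUC = {auc:.3f} [{ci_low:.3f}, {ci_high:.3f}]")
```

Running the same two functions on a full model and on an age-and-gender benchmark would give the kind of side-by-side comparison the abstract reports. The pairwise loop is quadratic, so a rank-based formula would be preferable at the scale of the actual test set.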
This study was undertaken to determine the dose-response relation between epileptiform activity burden and outcomes in acutely ill patients. Methods: A single-center retrospective analysis was made of 1,967 neurologic, medical, and surgical patients who underwent >16 hours of continuous electroencephalography (EEG) between 2011 and 2017. We developed an artificial intelligence algorithm to annotate 11.02 terabytes of EEG and quantify epileptiform activity burden within 72 hours of recording. We evaluated burden (1) in the first 24 hours of recording, (2) in the 12-hour epoch with highest burden (peak burden), and (3) cumulatively through the first 72 hours of monitoring. Machine learning was applied to estimate the effect of epileptiform burden on outcome. Outcome measure was discharge modified Rankin Scale, dichotomized as good (0-4) versus poor (5-6). Results: Peak epileptiform burden was independently associated with poor outcomes (p < 0.0001). Other independent associations included age, Acute Physiology and Chronic Health Evaluation II score, seizure on presentation, and diagnosis of hypoxic-ischemic encephalopathy. Model calibration error was calculated across 3 strata based on the time interval between last EEG measurement (up to 72 hours of monitoring) and discharge: (1) <5 days between last measurement and discharge, 0.0941 (95% confidence interval [CI] = 0.0706-0.1191); (2) 5 to 10 days between last measurement and discharge, 0.0946 (95% CI = 0.0631-0.1290); (3) >10 days between last measurement and discharge, 0.0998 (95% CI = 0.0698-0.1335). After adjusting for covariates, increase in peak epileptiform activity burden from 0 to 100% increased the probability of poor outcome by 35%. Interpretation: Automated measurement of peak epileptiform activity burden affords a convenient, consistent, and quantifiable target for future multicenter randomized trials investigating whether suppressing epileptiform activity improves outcomes.
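The three burden summaries described above can be sketched as follows. This is a simplified illustration, not the study's pipeline. It assumes burden is already available as an hourly fraction between 0 and 1, that the peak 12-hour epoch is found with a window sliding in one-hour steps, and that cumulative burden is the sum of hourly fractions. The function names and the toy recording are hypothetical.

```python
def mean_burden(hourly, start, stop):
    """Mean epileptiform burden (0..1) over hours [start, stop)."""
    window = hourly[start:stop]
    return sum(window) / len(window)

def burden_summaries(hourly, first_hours=24, peak_window=12, max_hours=72):
    """Summarize hourly burden fractions over the first `max_hours` of EEG."""
    h = hourly[:max_hours]
    if len(h) < peak_window:
        raise ValueError("recording is shorter than the peak window")
    # (1) burden in the first 24 hours (or fewer, if the recording is shorter)
    first = mean_burden(h, 0, min(first_hours, len(h)))
    # (2) peak burden: the highest mean over any 12-hour sliding window
    peak = max(
        mean_burden(h, i, i + peak_window)
        for i in range(len(h) - peak_window + 1)
    )
    # (3) cumulative burden, expressed here as hours of epileptiform activity
    cumulative = sum(h)
    return {"first_24h": first, "peak_12h": peak, "cumulative_hours": cumulative}

# Toy 72-hour recording: quiet, then 6 hours fully epileptiform, then 8 hours at 50%.
hourly = [0.0] * 10 + [1.0] * 6 + [0.5] * 8 + [0.0] * 48
summary = burden_summaries(hourly)
print(summary)
```

In this toy recording the peak 12-hour burden is well above the first-24-hour burden, which shows why the three summaries can rank patients differently.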
Background and Objectives: The validity of brain monitoring using electroencephalography (EEG), particularly to guide care in patients with acute or critical illness, requires that experts can reliably identify seizures and other potentially harmful rhythmic and periodic brain activity, collectively referred to as “ictal-interictal-injury continuum” (IIIC). Prior inter-rater reliability (IRR) studies are limited by small samples and selection bias. This study was conducted to assess the reliability of experts in identifying IIIC. Methods: This prospective analysis included 30 experts with subspecialty clinical neurophysiology training from 18 institutions. Experts independently scored varying numbers of ten-second EEG segments as: “Seizure (SZ)”, “Lateralized Periodic Discharges (LPD)”, “Generalized Periodic Discharges (GPD)”, “Lateralized Rhythmic Delta Activity (LRDA)”, “Generalized Rhythmic Delta Activity (GRDA)”, or “Other”. EEGs were performed for clinical indications at Massachusetts General Hospital between 2006 and 2020. Primary outcome measures were pairwise IRR (average percent agreement (PA) between pairs of experts) and majority IRR (average PA with group consensus) for each class; and beyond chance agreement (κ). Secondary outcomes were calibration of expert scoring to group consensus, and latent trait analysis to investigate contributions of bias and noise to scoring variability. Results: Among 2,711 EEGs, 49% were from females, and median (IQR) age was 55 (41). In total experts scored 50,697 EEG segments; the median [range] number scored by each expert was 6,287.5 [1,002, 45,267]. Overall pairwise IRR was moderate (PA 52%, κ 42%), and majority IRR was substantial (PA 65%, κ 61%).
Noise-bias analysis demonstrated that a single underlying receiver operating characteristic (ROC) curve can account for most variation in experts’ false positive vs true positive characteristics (median [range] of variance explained (R²): 95 [93, 98]%), and for most variation in experts’ precision vs sensitivity characteristics (R²: 75 [59, 89]%). Thus, variation between experts is mostly attributable not to differences in expertise, but rather to variation in decision thresholds. Discussion: Our results provide precise estimates of expert reliability from a large and diverse sample, and a parsimonious theory to explain the origin of disagreements between experts. The results also establish a standard for how well an automated IIIC classifier must perform to match experts. Classification of Evidence: This study provides Class II evidence that independent expert review reliably identifies ictal-interictal-injury continuum patterns on EEG compared to expert consensus.
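The pairwise agreement measures named in this abstract can be sketched as below. Because experts scored different numbers of segments, each pair is compared only on the segments both scored. Cohen's kappa is used here as one common chance-corrected statistic; the abstract does not specify which κ variant the study used, so that choice is an assumption. The rater names and labels are toy data.

```python
from collections import Counter
from itertools import combinations

def percent_agreement(a, b):
    """Fraction of items on which two label sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohen_kappa(a, b):
    """Chance-corrected agreement: (po - pe) / (1 - pe)."""
    n = len(a)
    po = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Expected agreement if both raters labeled independently at their own base rates.
    pe = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if pe == 1.0:
        return float("nan")  # undefined when both raters use a single shared label
    return (po - pe) / (1.0 - pe)

def mean_pairwise_agreement(raters):
    """Average percent agreement over all rater pairs, on shared segments only."""
    values = []
    for r1, r2 in combinations(sorted(raters), 2):
        shared = sorted(set(raters[r1]) & set(raters[r2]))
        if shared:
            a = [raters[r1][s] for s in shared]
            b = [raters[r2][s] for s in shared]
            values.append(percent_agreement(a, b))
    return sum(values) / len(values)

# Toy scores: segment id -> label, one dict per (hypothetical) rater.
raters = {
    "A": {1: "SZ", 2: "LPD", 3: "GPD", 4: "Other"},
    "B": {1: "SZ", 2: "LPD", 3: "Other", 4: "Other"},
    "C": {1: "SZ", 2: "GPD", 3: "GPD"},
}
pairwise_pa = mean_pairwise_agreement(raters)
kappa_ab = cohen_kappa(list(raters["A"].values()), list(raters["B"].values()))
print(pairwise_pa, kappa_ab)
```

Majority IRR would follow the same pattern, with each rater compared against the consensus label per segment instead of against another rater.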
Background and Objectives: Seizures and other seizure-like patterns of brain activity can harm the brain and contribute to in-hospital death, particularly when prolonged. However, experts qualified to interpret electroencephalography (EEG) data are scarce. Prior attempts to automate this task have been limited by small or inadequately labeled samples and have not convincingly demonstrated generalizable expert-level performance. There exists a critical unmet need for an automated method to classify seizures and other seizure-like events with expert-level reliability. This study was conducted to develop and validate a computer algorithm that matches the reliability and accuracy of experts in identifying seizures and seizure-like events, known as “ictal-interictal-injury-continuum” (IIIC) patterns on EEG, including seizures (SZ), lateralized and generalized periodic discharges (LPD, GPD), and lateralized and generalized rhythmic delta activity (LRDA, GRDA), and in differentiating these patterns from non-IIIC patterns. Methods: We used 6,095 scalp EEGs from 2,711 patients with and without IIIC events to train a deep neural network, SPaRCNet, to perform IIIC event classification. Independent training and test datasets were generated from 50,697 EEG segments, independently annotated by 20 fellowship-trained neurophysiologists. We assessed whether SPaRCNet performs at or above the sensitivity, specificity, precision, and calibration of fellowship-trained neurophysiologists for identifying IIIC events. Statistical performance was assessed via the calibration index, and by the percentage of experts whose operating points were below the model’s receiver operating characteristic curves (ROC) and precision-recall curves (PRC) for the 6 pattern classes. Results: SPaRCNet matches or exceeds most experts in classifying IIIC events based on both calibration and discrimination metrics.
For SZ, LPD, GPD, LRDA, GRDA, and “Other” classes, SPaRCNet exceeds the following percentages of 20 experts – ROC: 45%, 20%, 50%, 75%, 55%, 40%; PRC: 50%, 35%, 50%, 90%, 70%, 45%; and calibration: 95%, 100%, 95%, 100%, 100%, 80%, respectively. Discussion: SPaRCNet is the first algorithm to match expert performance in detecting seizures and other seizure-like events in a representative sample of EEGs. With further development, SPaRCNet may thus be a valuable tool for expedited review of EEGs.
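The “percentage of experts below the model’s ROC curve” metric can be sketched for a single class as follows. Each expert is one operating point (false positive rate, true positive rate), and an expert counts as exceeded when that point lies strictly below the model curve at the same false positive rate. Linear interpolation between curve points is an assumption, and the curve and expert points are toy values, not results from the study.

```python
def tpr_on_curve(curve, fpr):
    """Model TPR at a given FPR, by linear interpolation along the ROC curve.

    `curve` is a list of (fpr, tpr) points sorted by fpr and spanning 0..1.
    Where the curve is vertical, the upper value is used.
    """
    candidates = []
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if x0 <= fpr <= x1:
            if x1 == x0:
                candidates.append(max(y0, y1))
            else:
                candidates.append(y0 + (y1 - y0) * (fpr - x0) / (x1 - x0))
    if not candidates:
        raise ValueError("fpr lies outside the range covered by the curve")
    return max(candidates)

def fraction_below_curve(curve, experts):
    """Share of expert operating points that fall strictly below the model curve."""
    below = sum(1 for fpr, tpr in experts if tpr < tpr_on_curve(curve, fpr))
    return below / len(experts)

# Toy model ROC curve and four hypothetical expert operating points.
model_curve = [(0.0, 0.0), (0.1, 0.6), (0.3, 0.8), (1.0, 1.0)]
experts = [(0.05, 0.20), (0.20, 0.75), (0.30, 0.70), (0.50, 0.90)]
share = fraction_below_curve(model_curve, experts)
print(f"model exceeds {share:.0%} of experts")
```

The same comparison applies to precision-recall curves by swapping the axes for recall and precision, although linear interpolation is less well justified there.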