2023
DOI: 10.1101/2023.12.08.23299718
Preprint

Reengineering a machine learning phenotype to adapt to the changing COVID-19 landscape: A study from the N3C and RECOVER consortia

Miles Crosskey, Tomas McIntee, Sandy Preiss, et al.

Abstract: Background: In 2021, we used the National COVID Cohort Collaborative (N3C) as part of the NIH RECOVER Initiative to develop a machine learning (ML) pipeline to identify patients with a high probability of having post-acute sequelae of SARS-CoV-2 infection (PASC), or Long COVID. However, the increased home testing, missing documentation, and reinfections that characterize the latter years of the pandemic necessitate reengineering our original model to account for these changes in the COVID-19 research landscape. M…

citations
Cited by 2 publications
(5 citation statements)
references
References 23 publications
(24 reference statements)
“…The PASC computable phenotype may also misclassify patients. 29 For this reason, the confidence intervals around computable phenotype-based incidence estimates are likely too narrow. Vaccination status is poorly documented in most EHRs, which precluded its use as a covariate.…”
Section: Discussion (mentioning)
confidence: 99%
“…The model considers many more diagnosis codes than those included in the symptom clusters (see the “SNOMED Roll Up” section in the supplement of Crosskey et al, 2023), and a positive prediction may be based on other diagnosis codes. 11 Also, the computable phenotype model does not include a novelty restriction. For example, if a patient had a dyspnea diagnosis in the three years prior to index, a post-acute dyspnea diagnosis would not count for the respiratory symptom cluster, but it would be considered by the computable phenotype model.…”
Section: Methods (mentioning)
confidence: 99%
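
The novelty restriction described in the statement above can be illustrated with a minimal sketch. It is not the cited authors' code: the function names, the `(condition, date)` record layout, and the choice to treat any diagnosis after the index date as post-acute are assumptions made for illustration. Only the three-year lookback comes from the quoted text.

```python
from datetime import date, timedelta

# "Three years prior to index" from the quoted statement, approximated in days.
LOOKBACK = timedelta(days=3 * 365)


def has_post_index_dx(diagnoses, condition, index_date):
    """True if `condition` is recorded after the index date.

    Assumption: any date after index counts as post-acute; the actual
    post-acute window is not given in the quoted text.
    """
    return any(name == condition and when > index_date for name, when in diagnoses)


def has_prior_dx(diagnoses, condition, index_date):
    """True if `condition` is recorded within the lookback window before index."""
    window_start = index_date - LOOKBACK
    return any(
        name == condition and window_start <= when < index_date
        for name, when in diagnoses
    )


def counts_for_symptom_cluster(diagnoses, condition, index_date):
    """Symptom-cluster rule: post-index diagnosis AND no prior record (novelty)."""
    return has_post_index_dx(diagnoses, condition, index_date) and not has_prior_dx(
        diagnoses, condition, index_date
    )


def visible_to_phenotype_model(diagnoses, condition, index_date):
    """Model-side view: no novelty restriction, any post-index diagnosis is considered."""
    return has_post_index_dx(diagnoses, condition, index_date)


# Hypothetical patient with dyspnea both before and after the index date.
index = date(2022, 3, 1)
recurrent = [("dyspnea", date(2020, 1, 10)), ("dyspnea", date(2022, 6, 1))]
new_onset = [("dyspnea", date(2022, 6, 1))]
```

For the hypothetical `recurrent` patient, the post-index dyspnea record is excluded from the symptom cluster but would still be available as an input to the model, which matches the dyspnea example in the quoted statement.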
“…N3C’s definition identified a) earliest documentation of a U09.9 / B94.8 diagnosis code or b) patients with predicted PASC by a machine-learning based algorithm trained on patients with a U09.9 diagnosis. 25 PCORnet and PEDSnet applied rules-based definitions based on a combination of clinical input and data-driven analysis of new-onset diagnoses more common in patients with COVID-19 than without. 26 28 Generally, PCORnet classified patients based on the presence of a U09.9 or B94.8 code, or at least one incident PASC diagnosis.…”
Section: Methods (mentioning)
confidence: 99%