2022
DOI: 10.1038/s41746-022-00570-4

A multicenter evaluation of computable phenotyping approaches for SARS-CoV-2 infection and COVID-19 hospitalizations

Abstract: Diagnosis codes are used to study SARS-CoV-2 infections and COVID-19 hospitalizations in administrative and electronic health record (EHR) data. Using EHR data (April 2020–March 2021) at the Yale-New Haven Health System and the three hospital systems of the Mayo Clinic, computable phenotype definitions based on ICD-10 diagnosis of COVID-19 (U07.1) were evaluated against positive SARS-CoV-2 PCR or antigen tests. We included 69,423 patients at Yale and 75,748 at Mayo Clinic with either a diagnosis code or a posit…
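
As a rough illustration of the evaluation the abstract describes (a U07.1 diagnosis code compared against a positive PCR or antigen test as the reference), the following minimal Python sketch computes positive predictive value and sensitivity. The `Patient` record, the `evaluate_phenotype` function, and the toy counts are all hypothetical and are not taken from the study. Because the cohort appears to contain only patients with a code or a positive test, there are no true negatives, so specificity is not computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    # Hypothetical record layout, not the study's data model.
    has_u071_code: bool      # ICD-10 U07.1 diagnosis recorded
    has_positive_test: bool  # positive SARS-CoV-2 PCR or antigen test (reference)


def evaluate_phenotype(patients):
    """Compare the code-based phenotype against the test-based reference."""
    tp = sum(p.has_u071_code and p.has_positive_test for p in patients)
    fp = sum(p.has_u071_code and not p.has_positive_test for p in patients)
    fn = sum(not p.has_u071_code and p.has_positive_test for p in patients)
    # Guard against empty denominators.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return {"tp": tp, "fp": fp, "fn": fn, "ppv": ppv, "sensitivity": sensitivity}


# Toy cohort (made-up counts): 6 with code and test, 2 with code only, 4 with test only.
toy = (
    [Patient(True, True)] * 6
    + [Patient(True, False)] * 2
    + [Patient(False, True)] * 4
)
result = evaluate_phenotype(toy)
print(result)
```

On the toy cohort this gives a PPV of 0.75 and a sensitivity of 0.6. These numbers only demonstrate the arithmetic and say nothing about the study's findings.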

Citations: cited by 10 publications (4 citation statements)
References: 35 publications (24 reference statements)
“… 13 For defining a phenotype, one study found that standard tools like positive lab tests and ICD10 codes had limited precision and recall. 14 Our work is complementary to that work in showing the value of NLP to improve sensitivity and find additional cases. Other computable phenotyping work is in the context of post-acute sequelae SARS-CoV-2 infection (PASC), 15 and therefore must address the heterogeneity and uncertainty of the condition.…”
Section: Discussion
mentioning
confidence: 69%
“…This finding contrasts with early assessments of COVID-19 ICD-10 codes which reported excellent positive and negative predictive values for ICD-10 codes compared to PCR data in all patients and critically ill patients, respectively. [8,24,31–34] In retrospect, the excellent performance for ICD-10 codes in these studies was likely due to the use of PCR positivity as the gold standard for COVID-19 hospitalization, as well as the newness of the epidemic, focal use of testing, low healthcare utilization for non-COVID-19 care (hence fewer incidental cases), and fewer false-positive results due to prior infections. We advise caution when interpreting studies which identify COVID-19 hospitalizations using ICD-10 codes during the current era.…”
Section: Discussion
mentioning
confidence: 90%
“…[3–5] Large cohort studies have also used different approaches for defining COVID-19 hospitalizations, including a positive PCR alone, [6–9] International Classification of Disease, Tenth Revision, Clinical Modification (ICD-10-CM) codes for COVID-19, [10–18] institutional definitions, or combinations of these. [19–25] Notwithstanding the panoply of definitions being used, few data are available that compare estimates of COVID-19 hospitalizations, severity of illness, mortality, and trends between definitions, nor their accuracy in identifying primary or contributing versus incidental infections.…”
mentioning
confidence: 99%
“…Interestingly, phenotypes that relied on diagnosis code data performed less robustly. Previous studies have demonstrated the underreporting of conditions when relying on diagnostic codes alone [22][23][24]. Accordingly, it is possible that diagnostic codes themselves are not sensitive enough for identification of hypertension.…”
Section: Principal Findings
mentioning
confidence: 99%