Objective COVID-19 poses societal challenges that require expeditious data and knowledge sharing. Though organizational clinical data are abundant, these are largely inaccessible to outside researchers. Statistical, machine learning, and causal analyses are most successful with large-scale data beyond what is available in any given organization. Here, we introduce the National COVID Cohort Collaborative (N3C), an open science community focused on analyzing patient-level data from many centers. Methods The Clinical and Translational Science Award (CTSA) Program and scientific community created N3C to overcome technical, regulatory, policy, and governance barriers to sharing and harmonizing individual-level clinical data. We developed solutions to extract, aggregate, and harmonize data across organizations and data models, and created a secure data enclave to enable efficient, transparent, and reproducible collaborative analytics. Organized in inclusive workstreams, in two months we created: legal agreements and governance for organizations and researchers; data extraction scripts to identify and ingest positive, negative, and possible COVID-19 cases; a data quality assurance and harmonization pipeline to create a single harmonized dataset; population of the secure data enclave with data, machine learning, and statistical analytics tools; dissemination mechanisms; and a synthetic data pilot to democratize data access. Discussion The N3C has demonstrated that a multi-site collaborative learning health network can overcome barriers to rapidly build a scalable infrastructure incorporating multi-organizational clinical data for COVID-19 analytics. We expect this effort to save lives by enabling rapid collaboration among clinicians, researchers, and data scientists to identify treatments and specialized care and thereby reduce the immediate and long-term impacts of COVID-19. 
LAY SUMMARY COVID-19 poses societal challenges that require expeditious data and knowledge sharing. Though medical records are abundant, they are largely inaccessible to outside researchers. Statistical, machine learning, and causal research are most successful with large datasets beyond what is available in any given organization. Here, we introduce the National COVID Cohort Collaborative (N3C), an open science community focused on analyzing patient-level data from many clinical centers to reveal patterns in COVID-19 patients. To create N3C, the community had to overcome technical, regulatory, policy, and governance barriers to sharing patient-level clinical data. In less than 2 months, we developed solutions to acquire and harmonize data across organizations and created a secure data environment to enable transparent and reproducible collaborative research. We expect the N3C to help save lives by enabling collaboration among clinicians, researchers, and data scientists to identify treatments and specialized care needs and thereby reduce the immediate and long-term impacts of COVID-19.
IMPORTANCE The National COVID Cohort Collaborative (N3C) is a centralized, harmonized, high-granularity electronic health record repository that is the largest, most representative COVID-19 cohort to date. This multicenter data set can support robust evidence-based development of predictive and diagnostic tools and inform clinical care and policy. OBJECTIVES To evaluate COVID-19 severity and risk factors over time and assess the use of machine learning to predict clinical severity. DESIGN, SETTING, AND PARTICIPANTS In a retrospective cohort study of 1 926 526 US adults with SARS-CoV-2 infection (polymerase chain reaction >99% or antigen <1%) and adult patients without SARS-CoV-2 infection who served as controls from 34 medical centers nationwide between January 1, 2020, and December 7, 2020, patients were stratified using a World Health Organization COVID-19 severity scale and demographic characteristics. Differences between groups over time were evaluated using multivariable logistic regression. Random forest and XGBoost models were used to predict severe clinical course (death, discharge to hospice, invasive ventilatory support, or extracorporeal membrane oxygenation). MAIN OUTCOMES AND MEASURES Patient demographic characteristics and COVID-19 severity using the World Health Organization COVID-19 severity scale and differences between groups over time using multivariable logistic regression. RESULTS The cohort included 174 568 adults who tested positive for SARS-CoV-2 (mean [SD] age, 44.4 [18.6] years; 53.2% female) and 1 133 848 adult controls who tested negative for SARS-CoV-2 (mean [SD] age, 49.5 [19.2] years; 57.1% female). Of the 174 568 adults with SARS-CoV-2, 32 472 (18.6%) were hospitalized, and 6565 (20.2%) of those had a severe clinical course (invasive ventilatory support, extracorporeal membrane oxygenation, death, or discharge to hospice).
Of the hospitalized patients, mortality was 11.6% overall and decreased from 16.4% in March to April 2020 to 8.6% in September to October 2020 (P = .002 for monthly trend). Using 64 inputs available on the first hospital day, this study predicted a severe clinical course using random forest and XGBoost models (area under the receiver operating curve = 0.87 for both) that were stable over time. The factor most strongly associated with clinical severity was pH; this result was consistent across machine learning methods. In a separate multivariable logistic regression model built for inference, ... Key Points Question In a US data resource large enough to adjust for multiple confounders, what risk factors are associated with COVID-19 severity and severity trajectory over time, and can machine learning models predict clinical severity? Findings In this cohort study of 174 568 adults with SARS-CoV-2, 32 472 (18.6%) were hospitalized and 6565 (20.2%) were severely ill, and first-day machine learning models accurately predicted clinical severity. Mortality was 11.6%
Summary Background Hydroxychloroquine, a drug commonly used in the treatment of rheumatoid arthritis, has received much negative publicity for adverse events associated with its authorisation for emergency use to treat patients with COVID-19 pneumonia. We studied the safety of hydroxychloroquine, alone and in combination with azithromycin, to determine the risk associated with its use in routine care in patients with rheumatoid arthritis. Methods In this multinational, retrospective study, new user cohort studies in patients with rheumatoid arthritis aged 18 years or older and initiating hydroxychloroquine were compared with those initiating sulfasalazine and followed up over 30 days, with 16 severe adverse events studied. Self-controlled case series were done to further establish safety in wider populations, and included all users of hydroxychloroquine regardless of rheumatoid arthritis status or indication. Separately, severe adverse events associated with hydroxychloroquine plus azithromycin (compared with hydroxychloroquine plus amoxicillin) were studied. Data comprised 14 sources of claims data or electronic medical records from Germany, Japan, the Netherlands, Spain, the UK, and the USA. Propensity score stratification and calibration using negative control outcomes were used to address confounding. Cox models were fitted to estimate calibrated hazard ratios (HRs) according to drug use. Estimates were pooled where the I² value was less than 0·4. Findings The study included 956 374 users of hydroxychloroquine, 310 350 users of sulfasalazine, 323 122 users of hydroxychloroquine plus azithromycin, and 351 956 users of hydroxychloroquine plus amoxicillin. No excess risk of severe adverse events was identified when 30-day hydroxychloroquine and sulfasalazine use were compared. Self-controlled case series confirmed these findings.
However, long-term use of hydroxychloroquine appeared to be associated with increased cardiovascular mortality (calibrated HR 1·65 [95% CI 1·12–2·44]). Addition of azithromycin appeared to be associated with an increased risk of 30-day cardiovascular mortality (calibrated HR 2·19 [95% CI 1·22–3·95]), chest pain or angina (1·15 [1·05–1·26]), and heart failure (1·22 [1·02–1·45]). Interpretation Hydroxychloroquine treatment appears to have no increased risk in the short term among patients with rheumatoid arthritis, but in the long term it appears to be associated with excess cardiovascular mortality. The addition of azithromycin increases the risk of heart failure and cardiovascular mortality even in the short term. We call for careful consideration of the benefit–risk trade-off when counselling those on hydroxychloroquine treatment. Funding National Institute for Health Research (NIHR) Oxford Biomedical Research Centre, NIHR Senior Research Fellowship programme, US National Institutes of Health, US Depar...
Background The majority of U.S. reports of COVID-19 clinical characteristics, disease course, and treatments are from single health systems or focused on one domain. Here we report the creation of the National COVID Cohort Collaborative (N3C), a centralized, harmonized, high-granularity electronic health record repository that is the largest, most representative U.S. cohort of COVID-19 cases and controls to date. This multi-center dataset supports robust evidence-based development of predictive and diagnostic tools and informs critical care and policy. Methods and Findings In a retrospective cohort study of 1,926,526 patients from 34 medical centers nationwide, we stratified patients using a World Health Organization COVID-19 severity scale and demographics; we then evaluated differences between groups over time using multivariable logistic regression. We established vital signs and laboratory values among COVID-19 patients with different severities, providing the foundation for predictive analytics. The cohort included 174,568 adults with severe acute respiratory syndrome associated with SARS-CoV-2 (PCR >99% or antigen <1%) as well as 1,133,848 adult patients that served as lab-negative controls. Among 32,472 hospitalized patients, mortality was 11.6% overall and decreased from 16.4% in March/April 2020 to 8.6% in September/October 2020 (p = 0.002 monthly trend). In a multivariable logistic regression model, age, male sex, liver disease, dementia, African-American and Asian race, and obesity were independently associated with higher clinical severity. To demonstrate the utility of the N3C cohort for analytics, we used machine learning (ML) to predict clinical severity and risk factors over time.
Using 64 inputs available on the first hospital day, we predicted a severe clinical course (death, discharge to hospice, invasive ventilation, or extracorporeal membrane oxygenation) using random forest and XGBoost models (AUROC 0.86 and 0.87 respectively) that were stable over time. The most powerful predictors in these models are patient age and widely available vital sign and laboratory values. The established expected trajectories for many vital signs and laboratory values among patients with different clinical severities validate observations from smaller studies and provide comprehensive insight into COVID-19 characterization in U.S. patients. Conclusions This is the first description of an ongoing longitudinal observational study of patients seen in diverse clinical settings and geographical regions and is the largest COVID-19 cohort in the United States. Such data are the foundation for ML models that can be the basis for generalizable clinical decision support tools. The N3C Data Enclave is unique in providing transparent, reproducible, easily shared, versioned, and fully auditable data and analytic provenance for national-scale patient-level EHR data. The N3C is built for intensive ML analyses by academic, industry, and citizen scientists internationally. Many observational correlations can inform trial designs and care guidelines for this new disease.
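The AUROC figures quoted above summarize how well a model ranks patients with a severe course above those without one. As a minimal sketch of what that metric measures (not the study's code), the rank-based Mann-Whitney formulation can be written in plain Python. The function name `auroc` and the toy labels and scores are hypothetical and are not N3C data.

```python
def auroc(labels, scores):
    """Rank-based AUROC: the probability that a randomly chosen positive
    case receives a higher score than a randomly chosen negative case,
    with tied scores counted as half."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    # Assign 1-based ranks, giving tied scores their average rank.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative case")

    # Mann-Whitney U statistic for the positive class, scaled to [0, 1].
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


# Hypothetical toy data: 1 = severe clinical course, 0 = not severe.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation, which is the scale on which the reported 0.86 and 0.87 should be read.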
Comorbid conditions appear to be common among individuals hospitalised with coronavirus disease 2019 (COVID-19) but estimates of prevalence vary and little is known about the prior medication use of patients. Here, we describe the characteristics of adults hospitalised with COVID-19 and compare them with influenza patients. We include 34,128 (US: 8362, South Korea: 7341, Spain: 18,425) COVID-19 patients, summarising between 4811 and 11,643 unique aggregate characteristics. COVID-19 patients have been majority male in the US and Spain, but predominantly female in South Korea. Age profiles vary across data sources. Compared to 84,585 individuals hospitalised with influenza in 2014-19, COVID-19 patients have more typically been male, younger, and with fewer comorbidities and lower medication use. While protecting groups vulnerable to influenza is likely a useful starting point in the response to COVID-19, strategies will likely need to be broadened to reflect the particular characteristics of individuals being hospitalised with COVID-19.
Background Hydroxychloroquine has recently received Emergency Use Authorization by the FDA and is currently prescribed in combination with azithromycin for COVID-19 pneumonia. We studied the safety of hydroxychloroquine, alone and in combination with azithromycin. Methods New user cohort studies were conducted including 16 severe adverse events (SAEs). Rheumatoid arthritis patients aged 18+ and initiating hydroxychloroquine were compared to those initiating sulfasalazine and followed up over 30 days. Self-controlled case series (SCCS) were conducted to further establish safety in wider populations. Separately, SAEs associated with hydroxychloroquine-azithromycin (compared to hydroxychloroquine-amoxicillin) were studied. Data comprised 14 sources of claims data or electronic medical records from Germany, Japan, Netherlands, Spain, UK, and USA. Propensity score stratification and calibration using negative control outcomes were used to address confounding. Cox models were fitted to estimate calibrated hazard ratios (CalHRs) according to drug use. Estimates were pooled where I² < 40%. Results Overall, 956,374 and 310,350 users of hydroxychloroquine and sulfasalazine, and 323,122 and 351,956 users of hydroxychloroquine-azithromycin and hydroxychloroquine-amoxicillin were included. No excess risk of SAEs was identified when 30-day hydroxychloroquine and sulfasalazine use were compared. SCCS confirmed these findings. However, when azithromycin was added to hydroxychloroquine, we observed an increased risk of 30-day cardiovascular mortality (CalHR 2.19 [95% CI 1.22-3.94]), chest pain/angina (CalHR 1.15 [95% CI 1.05-1.26]), and heart failure (CalHR 1.22 [95% CI 1.02-1.45]).
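The pooling rule in the methods above (pool estimates only where I² is below 40%) can be illustrated with a short sketch. It assumes fixed-effect inverse-variance pooling of log hazard ratios, which the abstract does not specify; the function name `pool_if_homogeneous` and the example inputs are hypothetical and are not study data.

```python
import math


def pool_if_homogeneous(log_hrs, std_errs, i2_threshold=0.40):
    """Return (I^2, pooled) where pooled is (HR, lower, upper) for a 95% CI,
    or None when heterogeneity reaches the threshold.

    Assumption: fixed-effect inverse-variance weighting on the log scale.
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    total_weight = sum(weights)
    pooled_log_hr = sum(w * y for w, y in zip(weights, log_hrs)) / total_weight

    # Cochran's Q and the I^2 statistic: I^2 = max(0, (Q - df) / Q).
    q = sum(w * (y - pooled_log_hr) ** 2 for w, y in zip(weights, log_hrs))
    df = len(log_hrs) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    if i2 >= i2_threshold:
        return i2, None  # too heterogeneous: report per-database estimates only

    pooled_se = math.sqrt(1.0 / total_weight)
    hr = math.exp(pooled_log_hr)
    lower = math.exp(pooled_log_hr - 1.96 * pooled_se)
    upper = math.exp(pooled_log_hr + 1.96 * pooled_se)
    return i2, (hr, lower, upper)


# Hypothetical per-database hazard ratios and standard errors of log(HR).
example_log_hrs = [math.log(1.10), math.log(1.20), math.log(1.15)]
example_std_errs = [0.10, 0.12, 0.08]
print(pool_if_homogeneous(example_log_hrs, example_std_errs))
```

Returning `None` rather than a pooled value keeps the "do not pool" decision explicit, so heterogeneous databases are reported separately instead of being averaged away.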
Background Angiotensin-converting enzyme inhibitors (ACEIs) and angiotensin receptor blockers (ARBs) have been postulated to affect susceptibility to COVID-19. Observational studies so far have lacked rigorous ascertainment adjustment and international generalisability. We aimed to determine whether use of ACEIs or ARBs is associated with an increased susceptibility to COVID-19 in patients with hypertension. Methods In this international, open science, cohort analysis, we used electronic health records from Spain (Information Systems for Research in Primary Care [SIDIAP]) and the USA (Columbia University Irving Medical Center data warehouse [CUIMC] and Department of Veterans Affairs Observational Medical Outcomes Partnership [VA-OMOP]) to identify patients aged 18 years or older with at least one prescription for ACEIs and ARBs (target cohort) or calcium channel blockers (CCBs) and thiazide or thiazide-like diuretics (THZs; comparator cohort) between Nov 1, 2019, and Jan 31, 2020. Users were defined separately as receiving either monotherapy with these four drug classes, or monotherapy or combination therapy (combination use) with other antihypertensive medications. We assessed four outcomes: COVID-19 diagnosis; hospital admission with COVID-19; hospital admission with pneumonia; and hospital admission with pneumonia, acute respiratory distress syndrome, acute kidney injury, or sepsis. We built large-scale propensity score methods derived through a data-driven approach and negative control experiments across ten pairwise comparisons, with results meta-analysed to generate 1280 study effects. For each study effect, we did negative control outcome experiments using a possible 123 controls identified through a data-rich algorithm. This process used a set of predefined baseline patient characteristics to provide the most accurate prediction of treatment and balance among patient cohorts across characteristics.
The study is registered with the EU Post-Authorisation Studies register, EUPAS35296. Findings Among 1 355 349 antihypertensive users (363 785 ACEI or ARB monotherapy users, 248 915 CCB or THZ monotherapy users, 711 799 ACEI or ARB combination users, and 473 076 CCB or THZ combination users) included in analyses, no association was observed between COVID-19 diagnosis and exposure to ACEI or ARB monotherapy versus CCB or THZ monotherapy (calibrated hazard ratio [HR] 0·98, 95% CI 0·84-1·14) or combination use exposure (1·01, 0·90-1·15). ACEIs alone similarly showed no relative risk difference when compared with CCB or THZ monotherapy (HR 0·91, 95% CI 0·68-1·21; with heterogeneity of >40%) or combination use (0·95, 0·83-1·07). Directly comparing ACEIs with ARBs demonstrated a moderately lower risk with ACEIs, which was significant with combination use (HR 0·88, 95% CI 0·79-0·99) and non-significant for monotherapy (0·85, 0·69-1·05). We observed no significant difference between drug classes for risk of hospital admission with COVID-19, hospital admission with pneumonia, or hospital admission with pneumonia, acute res...