Electronic health record (EHR)-derived real-world data (RWD) can be used to create external comparator cohorts for oncology clinical trials. This exploratory study assessed whether EHR-derived patient cohorts could emulate select clinical trial control arms across multiple tumor types. The impact of analytic decisions on emulation results was also evaluated. By digitizing Kaplan-Meier curves, we reconstructed published control arm results from 15 trials that supported drug approvals from January 1, 2016, to April 30, 2018. RWD cohorts were constructed using a nationwide EHR-derived de-identified database by aligning eligibility criteria and weighting to trial baseline characteristics. Trial data and RWD cohorts were compared using Kaplan-Meier and Cox proportional hazards regression models for progression-free survival (PFS) and overall survival (OS; individual cohorts) and multitumor random effects models of hazard ratios (HRs) for median endpoint correlations (across cohorts). Post hoc, the impact of specific analytic decisions on endpoints was assessed using a case study. Comparing trial data and weighted RWD cohorts, PFS results were more similar (HR range = 0.63-1.18, pooled HR = 0.84, correlation of median = 0.91) than OS results (HR range = 0.36-1.09, pooled HR = 0.76, correlation of median = 0.85). OS HRs were more variable and trended toward worse outcomes for RWD cohorts. In the post hoc case study, the OS HR ranged from 0.67 (95% confidence interval (CI): 0.56-0.79) to 0.92 (95% CI: 0.78-1.09) depending on specific analytic decisions. EHR-derived RWD can emulate oncology clinical trial control arm results, although with variability.
Visibility into clinical trial cohort characteristics may shape and refine analytic approaches.

Contextualizing drug efficacy data from single-arm and small randomized clinical trials (RCTs) using robust external data sources and analytical methodologies is critical, especially in the regulatory approval setting for treatment of diseases that are rare or have high unmet medical need.
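The abstract above reports pooled HRs from multitumor random effects models but does not say which estimator was used. As a rough illustration only, the sketch below pools per-cohort HRs with the DerSimonian-Laird estimator, one common random-effects approach and an assumption here rather than the study's documented method. The function name `pool_hazard_ratios` and the input values in the usage lines are hypothetical and are not taken from the study.

```python
import math

def pool_hazard_ratios(hrs, ci_lowers, ci_uppers):
    """Pool hazard ratios with a DerSimonian-Laird random-effects model.

    Assumption: each cohort reports an HR with a symmetric (on the log
    scale) 95% confidence interval, from which the standard error of
    log(HR) is recovered.
    Returns (pooled_hr, ci_low, ci_high, tau2).
    """
    z = 1.959964  # standard normal quantile for a two-sided 95% interval
    log_hr = [math.log(h) for h in hrs]
    se = [(math.log(u) - math.log(l)) / (2 * z)
          for l, u in zip(ci_lowers, ci_uppers)]

    # Fixed-effect (inverse-variance) weights and pooled estimate
    w = [1 / s ** 2 for s in se]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_hr)) / sum_w

    # Cochran's Q and the method-of-moments between-cohort variance tau^2
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_hr))
    k = len(log_hr)
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-cohort variance
    w_re = [1 / (s ** 2 + tau2) for s in se]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_hr)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    return (math.exp(pooled),
            math.exp(pooled - z * se_pooled),
            math.exp(pooled + z * se_pooled),
            tau2)

# Hypothetical inputs for illustration only (not data from the study)
pooled_hr, low, high, tau2 = pool_hazard_ratios(
    hrs=[0.70, 0.95, 1.10],
    ci_lowers=[0.55, 0.80, 0.85],
    ci_uppers=[0.89, 1.13, 1.42],
)
print(f"pooled HR = {pooled_hr:.2f} (95% CI: {low:.2f}-{high:.2f})")
```

When the cohort HRs disagree by more than their sampling error would explain, `tau2` becomes positive and the pooled interval widens, which is one way the variability noted in the abstract would show up in a pooled estimate.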
Abstract

Objectives: While understanding of complex within-person clustering of health behaviors into meaningful profiles of risk is growing, we still know little about whether and how U.S. adults transition from one profile to another as they age. This study assesses patterns of stability and change in profiles of tobacco and alcohol use and body mass index (BMI).

Method: A nationally representative cohort of U.S. adults 25 years and older was interviewed up to 5 times between 1986 and 2011. Latent transition analysis (LTA) models characterized the most common profiles and the patterning of transitions across profiles over follow-up, and assessed whether some profiles were associated with higher mortality risk.

Results: We identified 5 profiles: “health promoting” with normal BMI and moderate alcohol consumption; “overweight”; “current smokers”; “obese”; and “nondrinkers”. Profile membership was largely stable, with the most common transitions to death or weight gain. “Obese” was the most stable profile, while “smokers” were most likely to transition to another profile. Mortality was most frequent in the “obese” and “nondrinker” profiles.

Discussion: Stability was more common than transition, suggesting that adults sort into health behavior profiles relatively early. Women and men were differently distributed across profiles at baseline, but showed broad similarity in transitions.
Researchers in genetics and other life sciences commonly use permutation tests to evaluate differences between groups. Permutation tests have desirable properties, including exactness if data are exchangeable, and are applicable even when the distribution of the test statistic is analytically intractable. However, permutation tests can be computationally intensive. We propose both an asymptotic approximation and a resampling algorithm for quickly estimating small permutation p-values (e.g., <10^-6) for the difference and ratio of means in two-sample tests. Our methods are based on the distribution of test statistics within and across partitions of the permutations, which we define. In this article, we present our methods and demonstrate their use through simulations and an application to cancer genomic data. Through simulations, we find that our resampling algorithm is more computationally efficient than another leading alternative, particularly for extremely small p-values (e.g., <10^-30). Through application to cancer genomic data, we find that our methods can successfully identify up- and down-regulated genes. While we focus on the difference and ratio of means, we speculate that our approaches may work in other settings.
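For context on why small p-values are expensive to estimate, the sketch below shows a plain Monte Carlo permutation test for the difference of means. It is the standard baseline, not the asymptotic approximation or partition-based resampling algorithm proposed in the abstract, whose details are not given here. The function name `permutation_p_value` and its parameters are hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(x, y, n_resamples=10_000, seed=0):
    """Two-sided Monte Carlo permutation p-value for a difference of means.

    Baseline method only: group labels are shuffled at random and the
    p-value is the share of shuffles whose absolute mean difference is
    at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(x) - mean(y))
    pooled = list(x) + list(y)
    n_x = len(x)

    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # random reassignment of group labels
        stat = abs(mean(pooled[:n_x]) - mean(pooled[n_x:]))
        if stat >= observed:
            hits += 1

    # Add-one correction: the estimate can never be exactly zero
    return (hits + 1) / (n_resamples + 1)

# Hypothetical toy data for illustration
group_a = [1, 2, 3, 4, 5]
group_b = [101, 102, 103, 104, 105]
print(permutation_p_value(group_a, group_b, n_resamples=2_000))
```

The smallest value this estimator can return is 1 / (n_resamples + 1), so resolving a p-value below 10^-6 would take more than a million shuffles per test. That cost is what makes faster approximations attractive.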