Matching is a powerful statistical tool in design and analysis. Conventional two-group, or bipartite, matching has been widely used in practice. However, its utility is limited to simpler designs. In contrast, nonbipartite matching is not limited to the two-group case, handling multiparty matching situations. It can be used to find the set of matches that minimize the sum of distances based on a given distance matrix. It brings greater flexibility to the matching design, such as multigroup comparisons. Thanks to improvements in computing power and freely available algorithms to solve nonbipartite problems, the cost in terms of computation time and complexity is low. This article reviews the optimal nonbipartite matching algorithm and its statistical applications, including observational studies with complex designs and an exact distribution-free test comparing two multivariate distributions. We also introduce an R package that performs optimal nonbipartite matching. We present an easily accessible web application to make nonbipartite matching freely available to general researchers.
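As a rough illustration of the objective described above (choosing the pairing that minimizes the sum of distances from a given distance matrix), the following Python sketch solves a tiny nonbipartite matching problem exactly. The function name and the toy distance matrix are illustrative assumptions and are not taken from the R package mentioned in the abstract. The sketch uses a dynamic program over subsets, which is practical only for a small number of units; real solvers typically rely on polynomial-time algorithms instead.

```python
from functools import lru_cache


def optimal_nonbipartite_matching(dist):
    """Return (total_distance, pairs) minimizing the sum of within-pair distances.

    dist is a symmetric n x n matrix (list of lists) with n even.
    Exact dynamic program over subsets; intended for small n only.
    """
    n = len(dist)
    if n % 2 != 0:
        raise ValueError("need an even number of units")

    @lru_cache(maxsize=None)
    def best(mask):
        # Bit i of mask is set while unit i is still unmatched.
        if mask == 0:
            return 0.0, ()
        i = (mask & -mask).bit_length() - 1  # lowest-numbered unmatched unit
        rest = mask & ~(1 << i)
        best_cost, best_pairs = float("inf"), ()
        for j in range(i + 1, n):
            if rest & (1 << j):
                cost, pairs = best(rest & ~(1 << j))
                cost += dist[i][j]
                if cost < best_cost:
                    best_cost, best_pairs = cost, ((i, j),) + pairs
        return best_cost, best_pairs

    total, pairs = best((1 << n) - 1)
    return total, list(pairs)


# Toy distance matrix for four units (made-up values).
dist = [
    [0, 1, 4, 6],
    [1, 0, 5, 3],
    [4, 5, 0, 2],
    [6, 3, 2, 0],
]
total, pairs = optimal_nonbipartite_matching(dist)
print(total, pairs)
```

Any unit may be paired with any other here; there is no fixed split into two groups, which is what distinguishes this from the bipartite case.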
AIMS: One barrier contributing to the lack of pharmacokinetic (PK) data in paediatric populations is the need for serial sampling. Analysis of clinically obtained specimens and data may overcome this barrier. To add evidence for the feasibility of this approach, we sought to determine PK parameters for fentanyl in children after cardiac surgery using specimens and data generated in the course of clinical care, without collecting additional blood samples. METHODS: We measured fentanyl concentrations in plasma from leftover clinically obtained specimens in 130 paediatric cardiac surgery patients and successfully generated a PK dataset using drug dosing data extracted from electronic medical records. Using a population PK approach, we estimated PK parameters for this population, assessed model goodness-of-fit and internal model validation, and performed subset data analyses. Through simulation studies, we compared predicted fentanyl concentrations using model-driven weight-adjusted per kg vs. fixed per kg fentanyl dosing. RESULTS: Fentanyl clearance for a 6.4 kg child, the median weight in our cohort, is 5.7 l h⁻¹ (2.2-9.2 l h⁻¹), similar to values found in prior formal PK studies. Model assessment and subset analyses indicated the model adequately fit the data. Of the covariates studied, only weight significantly impacted fentanyl kinetics, but substantial inter-individual variability remained. In simulation studies, model-driven weight-adjusted per kg fentanyl dosing led to more consistent therapeutic fentanyl concentrations than fixed per kg dosing. CONCLUSIONS: We show here that population PK modelling using sparse remnant samples and electronic medical records data provides a powerful tool for assessment of drug kinetics and generation of individualized dosing regimens.
Postmarketing population pharmacokinetic (PK) and pharmacodynamic (PD) studies can be useful to capture patient characteristics affecting PK or PD in real‐world settings. These studies require longitudinally measured dose, outcomes, and covariates in large numbers of patients; however, prospective data collection is cost‐prohibitive. Electronic health records (EHRs) can be an excellent source for such data, but there are challenges, including accurate ascertainment of drug dose. We developed a standardized system to prepare datasets from EHRs for population PK/PD studies. Our system handles a variety of tasks involving data extraction from clinical text using a natural language processing algorithm, data processing, and data building. Applying this system, we performed a fentanyl population PK analysis, resulting in comparable parameter estimates to a prior study. This new system makes the EHR data extraction and preparation process more efficient and accurate and provides a powerful tool to facilitate postmarketing population PK/PD studies using information available in EHRs.
Purpose This paper introduces an improved tool for designing matched-pairs randomized trials. The tool allows the incorporation of clinical and other knowledge regarding the relative importance of variables used in matching and allows for multiple types of missing data. The method is illustrated in the context of a cluster-randomized trial. A web application and R package are introduced to implement the method and incorporate recent advances in the area. Methods Reweighted Mahalanobis Distance (RMD) matching incorporates user-specified weights and imputed values for missing data. Weight may be assigned to missingness indicators to match on missingness patterns. Three examples are presented, using real data from a cohort of 90 Veterans Health Administration sites that had at least 100 incident metformin users in 2007. Matching is utilized to balance seven factors aggregated at the site level. Covariate balance is assessed for 10,000 randomizations under each strategy: simple randomization, matched randomization using the Mahalanobis distance, and matched randomization using the RMD. Results The RMD matching achieved better balance than simple randomization or MD randomization. In the first example, simple and MD randomization resulted in a 10% chance of seeing an absolute mean difference of greater than 26% in the percent of nonwhite patients per site; the RMD dramatically reduced that to 6%. The RMD achieved significant improvement over simple randomization even with as much as 20% of the data missing. Conclusions RMD matching provides an easy-to-use tool that incorporates user knowledge and missing data.
Objective We developed medExtractR, a natural language processing system to extract medication information from clinical notes. Using a targeted approach, medExtractR focuses on individual drugs to facilitate creation of medication-specific research datasets from electronic health records. Materials and Methods Written using the R programming language, medExtractR combines lexicon dictionaries and regular expressions to identify relevant medication entities (eg, drug name, strength, frequency). MedExtractR was developed on notes from Vanderbilt University Medical Center, using medications prescribed with varying complexity. We evaluated medExtractR and compared it with 3 existing systems: MedEx, MedXN, and CLAMP (Clinical Language Annotation, Modeling, and Processing). We also demonstrated how medExtractR can be easily tuned for better performance on an outside dataset using the MIMIC-III (Medical Information Mart for Intensive Care III) database. Results On 50 test notes per development drug and 110 test notes for an additional drug, medExtractR achieved high overall performance (F-measures >0.95), exceeding performance of the 3 existing systems across all drugs. MedExtractR achieved the highest F-measure for each individual entity, except drug name and dose amount for allopurinol. With tuning and customization, medExtractR achieved F-measures >0.90 in the MIMIC-III dataset. Discussion The medExtractR system successfully extracted entities for medications of interest. High performance in entity-level extraction provides a strong foundation for developing robust research datasets for pharmacological research. When working with new datasets, medExtractR should be tuned on a small sample of notes before being broadly applied. Conclusions The medExtractR system achieved high performance extracting specific medications from clinical text, leading to higher-quality research datasets for drug-related studies than some existing general-purpose medication extraction tools.
Supplementary data are available at Bioinformatics online.
Objective: We developed medExtractR, a natural language processing system to extract medication dose and timing information from clinical notes. Our system facilitates creation of medication-specific research datasets from electronic health records. Materials and Methods: Written using the R programming language, medExtractR combines lexicon dictionaries and regular expression patterns to identify relevant medication information ('drug entities'). The system is designed to extract particular medications of interest, rather than all possible medications mentioned in a clinical note. MedExtractR was developed on notes from Vanderbilt University's Synthetic Derivative, using two medications (tacrolimus and lamotrigine) prescribed with varying complexity, and with a third drug (allopurinol) used for testing generalizability of results. We evaluated medExtractR and compared it to three existing systems: MedEx, MedXN, and CLAMP. Results: On 50 test notes for each development drug and 110 test notes for the additional drug, medExtractR achieved high overall performance (F-measures > 0.95). This exceeded the performance of the three existing systems across all drugs, with the exception of a few specific entity-level evaluations, including dose amount for lamotrigine and allopurinol. Discussion: MedExtractR successfully extracted medication entities for medications of interest. High performance in entity-level extraction tasks provides a strong foundation for developing robust research datasets for pharmacological research. However, its targeted approach provides a narrower scope compared with existing systems. Conclusion: MedExtractR (available as an R package) achieved high performance values in extracting specific medications from clinical text, leading to higher quality research datasets for drug-related studies than some existing general-purpose medication extraction tools.
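The general idea of combining a drug-specific lexicon with regular expressions can be sketched as follows. This is a simplified Python illustration of the technique, not medExtractR itself (which is an R package): the lexicon entries, the patterns, the fixed character window, and the function name are all hypothetical choices made for this example.

```python
import re

# Hypothetical mini-lexicon for a single drug of interest (not medExtractR's dictionaries).
DRUG_NAMES = ["tacrolimus", "prograf"]
FREQ_TERMS = ["twice daily", "twice a day", "bid", "daily", "every 12 hours"]

# Illustrative pattern for a strength such as "1 mg" or "0.5 mcg".
STRENGTH_RE = re.compile(r"\b(\d+(?:\.\d+)?)\s*(mg|mcg)\b", re.IGNORECASE)
NAME_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, DRUG_NAMES)) + r")\b", re.IGNORECASE
)
# Longer phrases come first so "twice daily" is preferred over "daily".
FREQ_RE = re.compile(
    r"\b("
    + "|".join(map(re.escape, sorted(FREQ_TERMS, key=len, reverse=True)))
    + r")\b",
    re.IGNORECASE,
)


def extract_entities(note, window=60):
    """Find mentions of the target drug and look for entities just after each one."""
    results = []
    for m in NAME_RE.finditer(note):
        # Search only a short window after the drug name (an assumed heuristic).
        context = note[m.end(): m.end() + window]
        strength = STRENGTH_RE.search(context)
        freq = FREQ_RE.search(context)
        results.append(
            {
                "drug": m.group(1).lower(),
                "strength": strength.group(0) if strength else None,
                "frequency": freq.group(1).lower() if freq else None,
            }
        )
    return results


note = "Continue tacrolimus 1 mg twice daily. Lamotrigine 100 mg daily."
print(extract_entities(note))
```

Because the lexicon lists only one drug, the lamotrigine mention is ignored, which mirrors the targeted, drug-specific scope described in the abstract. A fixed window is a crude stand-in for real context handling and can pick up entities belonging to a neighbouring drug.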
Aim Evaluate performance of analytical strategies commonly used to adjust for baseline differences in continuous outcome variables for comparative effectiveness studies. Patients & methods Data simulations resembling a comparison of HbA1c values after initiation of antidiabetic treatments adjusting for baseline HbA1c. We evaluated change scores, analyses of covariance including linear, nonlinear with/without robust variance estimations, before and after optimal matching. We also evaluated the impact of measurement error. Results With increasing HbA1c baseline differences between groups, bias in effect estimates and suboptimal CI coverage probabilities increased in all approaches. These issues were further compounded by measurement error. Matching on baseline HbA1c substantially mitigated these issues. Conclusion In comparative studies with continuous outcomes, matching on baseline values of the outcome variable improves analytical performance.