Routinely collected health data, obtained for administrative and clinical purposes without specific a priori research goals, are increasingly used for research. The rapid evolution and availability of these data have revealed issues not addressed by existing reporting guidelines, such as Strengthening the Reporting of Observational Studies in Epidemiology (STROBE). The REporting of studies Conducted using Observational Routinely collected health Data (RECORD) statement was created to fill these gaps. RECORD was created as an extension to the STROBE statement to address reporting items specific to observational studies using routinely collected health data. RECORD consists of a checklist of 13 items related to the title, abstract, introduction, methods, results, and discussion sections of articles, and other information required for inclusion in such research reports. This document contains the checklist and explanatory and elaboration information to enhance the use of the checklist. Examples of good reporting for each RECORD checklist item are also included herein. This document, as well as the accompanying website and message board (http://www.record-statement.org), will enhance the implementation and understanding of RECORD. Through implementation of RECORD, authors, journal editors, and peer reviewers can encourage transparency of research reporting.
In pharmacoepidemiology, routinely collected data from electronic health records (including primary care databases, registries, and administrative healthcare claims) are a resource for research evaluating the real-world effectiveness and safety of medicines. Currently available guidelines for the reporting of research using non-randomised, routinely collected data, specifically the REporting of studies Conducted using Observational Routinely collected health Data (RECORD) and the Strengthening the Reporting of OBservational studies in Epidemiology (STROBE) statements, do not capture the complexity of pharmacoepidemiological research. We have therefore extended the RECORD statement to include reporting guidelines specific to pharmacoepidemiological research (RECORD-PE). This article includes the RECORD-PE checklist (also available on www.record-statement.org) and explains each checklist item with examples of good reporting. We anticipate that increasing use of the RECORD-PE guidelines by researchers and endorsement and adherence by journal editors will improve the standards of reporting of pharmacoepidemiological research undertaken using routinely collected data. This improved transparency will benefit the research community and patient care and, ultimately, improve public health.
Linkage of population-based administrative data is a valuable tool for combining detailed individual-level information from different sources for research. While not a substitute for classical studies based on primary data collection, analyses of linked administrative data can answer questions that require large sample sizes or detailed data on hard-to-reach populations, and generate evidence with a high level of external validity and applicability for policy making. There are unique challenges in the appropriate research use of linked administrative data, for example with respect to bias from linkage errors where records cannot be linked or are linked together incorrectly. For confidentiality and other reasons, the separation of data linkage processes and analysis of linked data is generally regarded as best practice. However, the ‘black box’ of data linkage can make it difficult for researchers to judge the reliability of the resulting linked data for their required purposes. This article aims to provide an overview of challenges in linking administrative data for research. We aim to increase understanding of the implications of (i) the data linkage environment and privacy preservation; (ii) the linkage process itself (including data preparation, and deterministic and probabilistic linkage methods) and (iii) linkage quality and potential bias in linked data. We draw on examples from a number of countries to illustrate a range of approaches for data linkage in different contexts.
Record linkage of administrative and survey data is increasingly used to generate evidence to inform policy and services. Although a powerful and efficient way of generating new information from existing data sets, errors related to data processing before, during and after linkage can bias results. However, researchers and users of linked data rarely have access to information that can be used to assess these biases or take them into account in analyses. As linked administrative data are increasingly used to provide evidence to guide policy and services, linkage error, which disproportionately affects disadvantaged groups, can undermine evidence for public health. We convened a group of researchers and experts from government data providers to develop guidance about the information that needs to be made available about the data linkage process, by data providers, data linkers, analysts and the researchers who write reports. The guidance goes beyond recommendations for information to be included in research reports. Our aim is to raise awareness of information that may be required at each step of the linkage pathway to improve the transparency, reproducibility, and accuracy of linkage processes, and the validity of analyses and interpretation of results.
Objective: Linkage of longitudinal administrative data for mothers and babies supports research and service evaluation in several populations around the world. We established a linked mother-baby cohort using pseudonymised, population-level data for England. Design and Setting: Retrospective linkage study using electronic hospital records of mothers and babies admitted to NHS hospitals in England, captured in Hospital Episode Statistics between April 2001 and March 2013. Results: Of 672,955 baby records in 2012/13, 280,470 (42%) linked deterministically to a maternal record using hospital, GP practice, maternal age, birthweight, gestation, birth order and sex. A further 380,164 (56%) records linked using probabilistic methods incorporating additional variables that could differ between mother/baby records (admission dates, ethnicity, 3/4-character postcode district) or that include missing values (delivery variables). The false-match rate was estimated at 0.15% using synthetic data. Data quality improved over time: for 2001/02, 91% of baby records were linked (holding the estimated false-match rate at 0.15%). The linked cohort was representative of national distributions of gender, gestation, birth weight and maternal age, and captured approximately 97% of births in England. Conclusion: Probabilistic linkage of maternal and baby healthcare characteristics offers an efficient way to enrich maternity data, improve data quality, and create longitudinal cohorts for research and service evaluation. This approach could be extended to linkage of other datasets that have non-disclosive characteristics in common.
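The two-stage idea in this abstract (exact agreement on every linking variable first, then probabilistic scoring that tolerates disagreement and missing values) can be illustrated with a minimal sketch. This is not the study's algorithm: the field list is shortened, and the m- and u-probabilities, the threshold, and the function names are hypothetical values chosen only to show Fellegi-Sunter style agreement weights.

```python
import math

# Hypothetical (m, u) probabilities per field:
#   m = P(fields agree | records are a true match)
#   u = P(fields agree | records are not a match)
M_U = {
    "hospital": (0.99, 0.01),
    "birthweight": (0.95, 0.001),
    "gestation": (0.90, 0.05),
    "sex": (0.99, 0.5),
}


def deterministic_match(baby, mother):
    """True only if every field is present and agrees exactly."""
    return all(
        baby.get(f) is not None and baby.get(f) == mother.get(f) for f in M_U
    )


def match_weight(baby, mother):
    """Sum of log2 agreement/disagreement weights over the fields.

    A field that is missing in either record contributes zero, which is
    one simple (assumed) way to handle missing values.
    """
    total = 0.0
    for field, (m, u) in M_U.items():
        a, b = baby.get(field), mother.get(field)
        if a is None or b is None:
            continue
        if a == b:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def link(baby, mother, threshold=10.0):
    """Classify a candidate pair; the threshold is an arbitrary assumption."""
    if deterministic_match(baby, mother):
        return "deterministic"
    if match_weight(baby, mother) >= threshold:
        return "probabilistic"
    return "unlinked"
```

In practice the threshold would be tuned against a target false-match rate rather than fixed in advance, and candidate pairs would be restricted by blocking before scoring.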
Linked datasets are an important resource for epidemiological and clinical studies, but linkage error can lead to biased results. For data security reasons, linkage of personal identifiers is often performed by a third party, making it difficult for researchers to assess the quality of the linked dataset in the context of specific research questions. This is compounded by a lack of guidance on how to determine the potential impact of linkage error. We describe how linkage quality can be evaluated and provide widely applicable guidance for both data providers and researchers. Using an illustrative example of a linked dataset of maternal and baby hospital records, we demonstrate three approaches for evaluating linkage quality: applying the linkage algorithm to a subset of gold standard data to quantify linkage error; comparing characteristics of linked and unlinked data to identify potential sources of bias; and evaluating the sensitivity of results to changes in the linkage procedure. These approaches can inform our understanding of the potential impact of linkage error and provide an opportunity to select the most appropriate linkage procedure for a specific analysis. Evaluating linkage quality in this way will improve the quality and transparency of epidemiological and clinical research using linked data.
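Two of the three approaches named above, quantifying error against gold-standard links and comparing characteristics of linked and unlinked records, can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the function names, the record layout (a dict with a boolean `linked` key), and the choice to define the false-match rate as the share of assigned links that are wrong are all assumptions, not the authors' definitions.

```python
def linkage_error_rates(predicted_links, true_links):
    """Compare assigned links with gold-standard links.

    Both arguments are iterables of (record_id_a, record_id_b) pairs.
    Returns (false_match_rate, missed_match_rate), where the false-match
    rate is the share of assigned links that are wrong and the
    missed-match rate is the share of true links that were not found.
    """
    predicted, truth = set(predicted_links), set(true_links)
    false_matches = predicted - truth
    missed_matches = truth - predicted
    fmr = len(false_matches) / len(predicted) if predicted else 0.0
    mmr = len(missed_matches) / len(truth) if truth else 0.0
    return fmr, mmr


def proportion_by_link_status(records, predicate):
    """Share of records satisfying `predicate`, split by link status.

    A large gap between the two groups suggests that linkage success
    depends on that characteristic, which is a potential source of bias.
    """
    counts = {True: [0, 0], False: [0, 0]}  # status -> [n, hits]
    for rec in records:
        group = counts[bool(rec["linked"])]
        group[0] += 1
        group[1] += 1 if predicate(rec) else 0
    return {
        ("linked" if status else "unlinked"): (hits / n if n else None)
        for status, (n, hits) in counts.items()
    }
```

The third approach, sensitivity analysis, would amount to rerunning the analysis of interest on datasets produced under different linkage thresholds and comparing the results.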
Background: Evidence on the association between newborn length of hospital stay (LOS) and risk of readmission is conflicting. We compared methods for modelling this relationship, by gestational age, using population‐level hospital data on births in England between 2005–14. Methods: The association between LOS and unplanned readmission within 30 days of postnatal discharge was explored using four approaches: (i) modelling hospital‐level LOS and readmission rates; (ii) comparing trends over time in LOS and readmission; (iii) modelling individual LOS and adjusted risk of readmission; and (iv) instrumental variable analyses (hospital‐level mean LOS and number of births on the same day). Results: Of 4 667 827 babies, 5.2% were readmitted within 30 days. Aggregated data showed hospitals with longer mean LOS were not associated with lower readmission rates for vaginal (adjusted risk ratio (aRR) 0.87, 95% confidence interval (CI) 0.66, 1.13), or caesarean (aRR 0.89, 95% CI 0.72, 1.12) births. LOS fell by an average 2.0% per year for vaginal births and 3.4% for caesarean births, while readmission rates increased by 4.4 and 5.1% per year respectively. Approaches (iii) and (iv) indicated that longer LOS was associated with a reduced risk of readmission, but only for late preterm, vaginal births (34–36 completed weeks’ gestation). Conclusions: Longer newborn LOS may benefit late preterm babies, possibly due to increased medical or psychosocial support for those at greater risk of potentially preventable readmissions after birth. Research based on observational data to evaluate relationships between LOS and readmission should use methods to reduce the impact of unmeasured confounding.