Objectives: To estimate data loss and bias in studies of Clinical Practice Research Datalink (CPRD) data that restrict analyses to Read codes, omitting anything recorded as text.
Design: Matched case–control study.
Setting: Patients contributing data to the CPRD.
Participants: 4915 bladder and 3635 pancreatic cancer cases diagnosed between 1 January 2000 and 31 December 2009, matched on age, sex and general practitioner practice to up to 5 controls (bladder: n=21 718; pancreas: n=16 459). The analysis period was the year before cancer diagnosis.
Primary and secondary outcome measures: Frequency of haematuria, jaundice and abdominal pain, grouped by recording style: Read code or text-only (ie, hidden text). The association between recording style and case–control status (χ² test). For each feature, the odds ratio (OR; conditional logistic regression) and positive predictive value (PPV; Bayes’ theorem) for cancer, before and after addition of hidden text records.
Results: Of the 20 958 total records of the features, 7951 (38%) were recorded in hidden text. Hidden text recording was more strongly associated with controls than with cases for haematuria (140/336=42% vs 556/3147=18%) in bladder cancer (χ² test, p<0.001), and for jaundice (21/31=67% vs 463/1565=30%, p<0.0001) and abdominal pain (323/1126=29% vs 397/1789=22%, p<0.001) in pancreatic cancer. Adding hidden text records corrected PPVs of haematuria for bladder cancer from 4.0% (95% CI 3.5% to 4.6%) to 2.9% (2.6% to 3.2%), and of jaundice for pancreatic cancer from 12.8% (7.3% to 21.6%) to 6.3% (4.5% to 8.7%). Adding hidden text records did not alter the PPV of abdominal pain for bladder (codes: 0.14%, 0.13% to 0.16% vs codes plus hidden text: 0.14%, 0.13% to 0.15%) or pancreatic (0.23%, 0.21% to 0.25% vs 0.21%, 0.20% to 0.22%) cancer.
Conclusions: Omission of text records from CPRD studies introduces bias that inflates outcome measures for recognised alarm symptoms. This potentially reinforces clinicians’ views of the known importance of these symptoms, marginalising the significance of ‘low-risk but not no-risk’ symptoms.
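The two calculations named in the outcome measures can be sketched in a few lines. The block below is a minimal illustration, not the authors' analysis: it assumes the haematuria fractions in the Results are hidden-text records over all haematuria records in each group, it uses a plain Pearson χ² test that ignores the matched design, and the prior and likelihood ratio passed to the hypothetical `ppv_from_bayes` helper are placeholder values that do not come from the study.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper tail of the chi-square
    # distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def ppv_from_bayes(prior, likelihood_ratio):
    """Posterior probability from a prior probability and a positive likelihood ratio."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Haematuria records quoted in the abstract: hidden text vs Read-coded.
controls_hidden, controls_total = 140, 336
cases_hidden, cases_total = 556, 3147

chi2, p_value = chi_square_2x2(
    controls_hidden, controls_total - controls_hidden,
    cases_hidden, cases_total - cases_hidden,
)
print(f"chi-square = {chi2:.1f}, p = {p_value:.2e}")

# Hypothetical inputs for illustration only, NOT taken from the study.
example_ppv = ppv_from_bayes(prior=0.001, likelihood_ratio=10)
print(f"illustrative PPV = {example_ppv:.2%}")
```

The Bayes step works on the odds scale: the prior odds of cancer are multiplied by the likelihood ratio of the feature, and the result is converted back to a probability. If features recorded only in text are more common among controls, leaving them out overstates the likelihood ratio and therefore the PPV.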
Background: Pre-existing non-cancer conditions may complicate and delay colorectal cancer diagnosis.
Method: Incident cases (aged ⩾40 years, 2007–2009) with colorectal cancer were identified in the Clinical Practice Research Datalink, UK. Diagnostic interval was defined as time from first symptomatic presentation of colorectal cancer to diagnosis. Comorbid conditions were classified as ‘competing demands’ (unrelated to colorectal cancer) or ‘alternative explanations’ (sharing symptoms with colorectal cancer). The association between diagnostic interval (log-transformed) and age, gender, consultation rate and number of comorbid conditions was investigated using linear regressions, reported using geometric means.
Results: Out of the 4512 patients included, 72.9% had ⩾1 competing demand and 31.3% had ⩾1 alternative explanation. In the regression model, the numbers of both types of comorbid conditions were independently associated with longer diagnostic interval: a single competing demand delayed diagnosis by 10 days, and four or more by 32 days; and a single alternative explanation by 9 days. For individual conditions, the longest delay was observed for inflammatory bowel disease (26 days; 95% CI 14–39).
Conclusions: The burden and nature of comorbidity are associated with delayed diagnosis in colorectal cancer, particularly in patients aged ⩾80 years. Effective clinical strategies are needed for shortening diagnostic interval in patients with comorbidity.
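Reporting a regression on a log-transformed interval "using geometric means" generally means back-transforming from the log scale. The sketch below shows that arithmetic under stated assumptions: the interval values and the coefficient are hypothetical, the helper names `geometric_mean` and `added_days` are illustrative, and the abstract does not say how the authors converted coefficients into days.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values: exp of the mean of the logs."""
    logs = [math.log(v) for v in values]
    return math.exp(sum(logs) / len(logs))


def added_days(baseline_geometric_mean, beta):
    """Extra days implied by a coefficient `beta` fitted on the log scale."""
    # exp(beta) is the multiplicative change in the geometric mean.
    return baseline_geometric_mean * (math.exp(beta) - 1)


# Hypothetical diagnostic intervals in days, NOT study data.
intervals = [10, 40, 160]
gm = geometric_mean(intervals)
print(round(gm, 1))
print(round(added_days(gm, beta=0.2), 1))
```

Because the model is additive on the log scale, each coefficient acts as a ratio on the original scale, so the number of extra days depends on the baseline interval it is applied to.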
Objective: Analysis of routinely collected electronic health record (EHR) data from primary care is reliant on the creation of codelists to define clinical features of interest. To improve scientific rigour, transparency and replicability, we describe and demonstrate a standardised reproducible methodology for clinical codelist development.
Design: We describe a three-stage process for developing clinical codelists. First, the clear definition a priori of the clinical feature of interest using reliable clinical resources. Second, development of a list of potential codes using statistical software to comprehensively search all available codes. Third, a modified Delphi process to reach consensus between primary care practitioners on the most relevant codes, including the generation of an ‘uncertainty’ variable to allow sensitivity analysis.
Setting: These methods are illustrated by developing a codelist for shortness of breath in a primary care EHR sample, including modifiable syntax for commonly used statistical software.
Participants: The codelist was used to estimate the frequency of shortness of breath in a cohort of 28 216 patients aged over 18 years who received an incident diagnosis of lung cancer between 1 January 2000 and 30 November 2016 in the Clinical Practice Research Datalink (CPRD).
Results: Of 78 candidate codes, 29 were excluded as inappropriate. Complete agreement was reached for 44 (90%) of the remaining codes, with partial disagreement over 5 (10%). 13 091 episodes of shortness of breath were identified in the cohort of 28 216 patients. Sensitivity analysis demonstrates that codes with the greatest uncertainty tend to be rarely used in clinical practice.
Conclusions: Although initially time consuming, using a rigorous and reproducible method for codelist generation ‘future-proofs’ findings and an auditable, modifiable syntax for codelist generation enables sharing and replication of EHR studies. Published codelists should be badged by quality and report the methods of codelist generation including: definitions and justifications associated with each codelist; the syntax or search method; the number of candidate codes identified; and the categorisation of codes after Delphi review.
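Stages two and three of the process described above might be sketched as follows. This is an illustration in Python rather than the statistical software the authors used, and everything in it is an assumption: the dictionary entries and `X00n` identifiers are placeholders rather than real Read codes, the reviewer votes are invented, and the rule that any disagreement sets the uncertainty flag is one plausible reading of the modified Delphi step.

```python
from dataclasses import dataclass


@dataclass
class CodelistEntry:
    code: str
    description: str
    uncertain: bool  # True when reviewers only partially agreed


def search_codes(dictionary, include_terms, exclude_terms=()):
    """Stage 2: return every code whose description matches an include term
    and no exclude term (case-insensitive substring match)."""
    matches = {}
    for code, description in dictionary.items():
        text = description.lower()
        included = any(term in text for term in include_terms)
        excluded = any(term in text for term in exclude_terms)
        if included and not excluded:
            matches[code] = description
    return matches


def apply_review(candidates, votes):
    """Stage 3: keep codes with at least one 'include' vote and flag those
    without unanimous agreement; codes nobody voted to include are dropped."""
    entries = []
    for code, description in candidates.items():
        code_votes = votes[code]
        if not any(code_votes):
            continue
        entries.append(CodelistEntry(code, description, uncertain=not all(code_votes)))
    return entries


# Hypothetical dictionary and votes; the codes are placeholders, not Read codes.
dictionary = {
    "X001": "Shortness of breath",
    "X002": "Breathlessness on exertion",
    "X003": "Breath test for H. pylori",
    "X004": "Nocturnal breathlessness",
}
candidates = search_codes(dictionary, ["breath"], exclude_terms=["breath test"])
votes = {
    "X001": [True, True, True],
    "X002": [True, True, False],
    "X004": [False, False, False],
}
codelist = apply_review(candidates, votes)

# Sensitivity analysis: rerun the study with and without the uncertain codes.
certain_only = [entry.code for entry in codelist if not entry.uncertain]
print([entry.code for entry in codelist], certain_only)
```

Keeping the search terms, the candidate list and the votes as data makes each stage auditable, and the number of candidate codes and their categorisation after review can be reported directly from these objects.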
Pictograms have the potential to help patients understand information on drug therapy. This study shows that some existing pictograms are not easily interpreted and that testing is needed before their implementation. A reduction in their size to allow incorporation into conventional written formats may cause additional problems for patients.