Aims To develop a computer-processable algorithm, capable of running automated searches of routine data, that flags miscoded and misclassified cases of diabetes for subsequent clinical review.
Method Anonymized computer data from the Quality Improvement in Chronic Kidney Disease (QICKD) trial (n = 942 031) were analysed using a binary method to assess the accuracy of data on diabetes diagnosis. Diagnostic codes were processed and stratified into definite, probable and possible diagnosis of Type 1 or Type 2 diabetes. Diagnostic accuracy was improved by using prescription compatibility and temporally sequenced anthropometric and biochemical data. Bayesian false detection rate analysis was used to compare findings with those of an entirely independent and more complex manual sort of the first round QICKD study data (n = 760 588).
Results The prevalences of a definite diagnosis of Type 1 diabetes and Type 2 diabetes were 0.32% and 3.27%, respectively, when using the binary search method. Up to 35% of Type 1 diabetes and 0.1% of Type 2 diabetes cases were miscoded or misclassified on the basis of age/BMI and coding. False detection rate analysis demonstrated a close correlation between the new method and the published hand-crafted sort. Both methods had the highest false detection rate values when coding, therapeutic, anthropometric and biochemical filters were used (up to 90% for the new and 75% for the hand-crafted search method).
Conclusions A simple computerized algorithm achieves very similar results to more complex search strategies in identifying miscoded and misclassified cases of both Type 1 diabetes and Type 2 diabetes. It has the potential to be used as an automated audit instrument to improve the quality of diabetes diagnosis.
Keywords diabetes mellitus, diagnostic errors, computerized medical informatics, medical records systems.
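As a rough illustration of the approach summarized in the abstract, the sketch below combines yes/no checks on coding, therapy and biochemistry into a certainty stratum, and flags records for clinical review using age and BMI. It is not the study algorithm: the `Record` fields, the stratification rules and both thresholds are hypothetical placeholders chosen only to make the example runnable.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative placeholders only; NOT the thresholds used in the study.
T1_MAX_AGE_AT_DIAGNOSIS = 35.0
T1_MAX_BMI = 30.0


@dataclass
class Record:
    """Hypothetical, simplified view of one patient's routine data."""
    coded_type: Optional[str]         # "T1", "T2" or None (no diagnostic code)
    age_at_diagnosis: Optional[float]
    bmi: Optional[float]
    on_insulin: bool
    on_oral_agents: bool
    raised_glucose: bool              # any biochemical evidence of diabetes


def certainty(r: Record) -> str:
    """Combine binary checks into a stratum (assumed rules, for illustration)."""
    treated = r.on_insulin or r.on_oral_agents
    if r.coded_type is not None and (treated or r.raised_glucose):
        return "definite"   # code supported by therapy or biochemistry
    if r.coded_type is not None or treated:
        return "probable"   # code alone, or therapy without a code
    if r.raised_glucose:
        return "possible"   # biochemistry only
    return "none"


def needs_review(r: Record) -> bool:
    """Flag records whose code conflicts with therapy, age/BMI or biochemistry."""
    if r.coded_type == "T1":
        if not r.on_insulin:
            return True     # Type 1 code without insulin is incompatible
        older = (r.age_at_diagnosis is not None
                 and r.age_at_diagnosis > T1_MAX_AGE_AT_DIAGNOSIS)
        heavier = r.bmi is not None and r.bmi > T1_MAX_BMI
        return older and heavier
    if r.coded_type == "T2":
        # Type 2 code with no supporting therapy or biochemistry
        return not (r.on_insulin or r.on_oral_agents or r.raised_glucose)
    return False
```

Under these assumed rules, a record coded as Type 1 diabetes in a person diagnosed at 52 with a BMI of 34 would be stratified as definite on coding and therapy, yet still flagged for review as a possible misclassification.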
Introduction
Primary care records are largely computerized, with most primary care clinicians entering data at the point of care. Electronic patient records (EPRs) facilitate improved management of chronic diseases, including diabetes [1]. However, a combination of factors may reduce data quality, including: time pressures that limit what is recorded in the EPR; changes in diagnostic criteria and classification rules for diabetes [2]; sample labelling problems (for example, it may not be obvious whether a sample was taken fasted) [3]; and idiosyncrasy in the coding interface. Furthermore, the various brands of EPR offer different coding choices for the same search term [4].
We have demonstrated that a pragmatic search strategy can identify classification problems in diabetes. Many people labelled as having Type 1 diabetes are misclassified and actually have Type 2 diabetes, and some people labelled as having Type 2 diabetes may not have diabetes at all [5]. Coding of diabetes diagnostic data is both complex [6] and in need of refinement [5,7], and current practice probably overestimates the prevalence of diabetes [5]. Recognition of these gaps has led NHS Diabetes [8] and the Royal Coll...