Importance
Parathyroidectomy offers the only cure for primary hyperparathyroidism (PHPT), but today only 50% of PHPT patients are referred for surgery, in large part because the condition is widely under-recognized. Diagnosis can be especially challenging when biochemical abnormalities are mild. Machine learning (ML) is a collection of methods in which computers build predictive algorithms from labeled examples.
Objective
With the aim of facilitating diagnosis, we tested the ability of ML to distinguish PHPT from normal physiology using clinical and laboratory data.
Design
This retrospective cohort study used a labeled training set and 10-fold cross-validation to evaluate algorithm accuracy. Measures of accuracy included area under the receiver operating characteristic (ROC) curve, sensitivity (recall), and positive and negative predictive value. Several ML algorithms and ensembles of algorithms were tested using the Weka platform.
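The accuracy measures named above can be sketched in a few lines. The following is a minimal illustration in plain Python, not the Weka workflow used in the study; the function names (`confusion_metrics`, `roc_auc`, `k_fold_indices`) and the simple round-robin fold assignment are assumptions made for this example.

```python
from typing import Dict, List, Sequence


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, PPV, and NPV for binary labels (1 = PHPT, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of true PHPT cases detected
        "ppv": ratio(tp, tp + fp),          # share of PHPT predictions that are correct
        "npv": ratio(tn, tn + fn),          # share of control predictions that are correct
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve as the probability that a random positive
    case scores higher than a random negative case (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples: int, k: int = 10) -> List[List[int]]:
    """Assign sample indices to k folds round-robin (illustrative; real
    workflows usually shuffle and stratify by class first)."""
    return [list(range(fold, n_samples, k)) for fold in range(k)]
```

In cross-validation, each fold serves once as the held-out test set while the remaining nine are used for training, and the metrics are computed on the held-out predictions.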
Setting
Three high-volume endocrine surgery programs.
Participants
Among 11,830 patients managed surgically from March 2001 to August 2013, 6,777 underwent parathyroidectomy for PHPT, and 5,053 control patients without PHPT underwent thyroidectomy.
Main Outcomes and Measures
Test-set accuracies for ML models were determined using 10-fold cross-validation. Age, gender, and preoperative calcium, phosphate, parathyroid hormone (PTH), vitamin D, and creatinine levels were defined as potential predictors of PHPT. Mild PHPT was defined as PHPT with normal preoperative calcium or PTH levels.
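The definition of mild PHPT can be expressed as a simple rule. The sketch below is illustrative only: the abstract does not state the reference ranges used, so the calcium and PTH ranges, the `PreopLabs` type, and the `is_mild_phpt` function are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Tuple

# ASSUMPTION: illustrative reference ranges, not taken from the study.
# Actual ranges vary by laboratory and assay.
CALCIUM_NORMAL_MG_DL: Tuple[float, float] = (8.5, 10.2)
PTH_NORMAL_PG_ML: Tuple[float, float] = (15.0, 65.0)


@dataclass
class PreopLabs:
    """Hypothetical container for two of the preoperative predictors."""
    calcium_mg_dl: float
    pth_pg_ml: float


def in_range(value: float, bounds: Tuple[float, float]) -> bool:
    low, high = bounds
    return low <= value <= high


def is_mild_phpt(
    labs: PreopLabs,
    calcium_range: Tuple[float, float] = CALCIUM_NORMAL_MG_DL,
    pth_range: Tuple[float, float] = PTH_NORMAL_PG_ML,
) -> bool:
    """For a patient with confirmed PHPT, flag mild disease when either
    preoperative calcium or preoperative PTH is within the normal range."""
    return in_range(labs.calcium_mg_dl, calcium_range) or in_range(labs.pth_pg_ml, pth_range)
```

Under these assumed ranges, a PHPT patient with calcium 10.0 mg/dL and PTH 90 pg/mL would count as mild (normal calcium), whereas one with calcium 11.2 mg/dL and PTH 120 pg/mL would not.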
Results
After testing a variety of ML algorithms, Bayesian network models proved most accurate, correctly classifying 95.2% of all PHPT patients (area under ROC = 0.989). Omitting PTH from the model did not significantly reduce accuracy (area under ROC = 0.985). In mild disease, however, the Bayesian network model correctly classified 71.1% of patients with normal preoperative calcium and 92.1% of patients with normal preoperative PTH. Combining the Bayesian network model with AdaBoost improved accuracy to 97.2% of cases correctly classified (area under ROC = 0.994), and to 91.9% of PHPT patients with mild disease. This was a significant improvement over the Bayesian network model alone (p < 0.0001).
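To show how boosting concentrates on hard cases such as mild disease, the following sketch implements a single reweighting round of the AdaBoost.M1 algorithm in plain Python. It is not the study's Weka configuration: the function name `adaboost_m1_round` is hypothetical, and the base classifier (a Bayesian network in the study) is represented only by a list saying which training cases it classified correctly.

```python
import math
from typing import List, Sequence, Tuple


def adaboost_m1_round(
    weights: Sequence[float], correct: Sequence[bool]
) -> Tuple[List[float], float]:
    """One AdaBoost.M1 round.

    weights: current weight of each training case.
    correct: whether the base classifier got each case right.
    Returns the updated, normalized weights and the classifier's vote weight.
    """
    total = sum(weights)
    w = [x / total for x in weights]

    # Weighted error of the base classifier on the training cases.
    error = sum(wi for wi, ok in zip(w, correct) if not ok)
    if error <= 0.0 or error >= 0.5:
        raise ValueError("AdaBoost.M1 requires 0 < weighted error < 0.5")

    # Shrink the weight of correctly classified cases so that the next
    # base classifier focuses on the cases that were missed.
    beta = error / (1.0 - error)
    updated = [wi * beta if ok else wi for wi, ok in zip(w, correct)]
    norm = sum(updated)
    new_weights = [u / norm for u in updated]

    # More accurate base classifiers get a larger say in the final vote.
    vote_weight = math.log(1.0 / beta)
    return new_weights, vote_weight
```

For example, with four equally weighted cases and one misclassification, the missed case ends the round holding half of the total weight, so the next base classifier is pushed to get it right.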
Conclusions and Relevance
ML can accurately diagnose PHPT without human input, even in mild disease. Incorporation of this tool into electronic medical record systems may greatly aid in recognition of this under-diagnosed disorder.