Laboratory-Reported Normal Value Ranges Should Not Be Used to Diagnose Periprosthetic Joint Infection

Introduction: C-reactive protein (CRP) has long served as a prototypical biomarker for periprosthetic joint infection (PJI). Recently, synovial fluid (SF)-CRP has garnered interest as a diagnostic tool, with several studies demonstrating its diagnostic superiority over serum CRP for the diagnosis of PJI. Although previous studies have identified diagnostic thresholds for SF-CRP, they have been limited in scope and employed various CRP assays without formal validation for PJI diagnosis. This study aimed to conduct a formal single clinical laboratory validation to determine the optimal clinical decision limit of SF-CRP for the diagnosis of PJI.

Methods: A retrospective analysis of prospectively collected data was performed using receiver operating characteristic (ROC) and area under the curve (AUC) analyses. Synovial fluid samples from hip and knee arthroplasties, received from over 2,600 institutions, underwent clinical testing for PJI at a single clinical laboratory (CD Laboratories, Zimmer Biomet, Towson, MD) between 2017 and 2022. Samples were assayed for SF-CRP, alpha-defensin, white blood cell count, neutrophil percentage, and microbiological culture. After applying selection criteria, the samples were classified with the 2018 ICM PJI scoring system as "infected," "not infected," or "inconclusive." Data were divided into training and validation sets. The Youden Index was employed to optimize the clinical decision limit.

Results: A total of 96,061 samples formed the training (n = 67,242) and validation (n = 28,819) datasets. Analysis of the biomarker median values, culture positivity, anatomic distribution, and days from aspiration to testing revealed nearly identical specimen characteristics in both the training set and validation set. SF-CRP demonstrated an AUC of 0.929 (95% confidence interval (CI): 0.926-0.932) in the training set, with an optimal SF-CRP clinical decision limit for PJI diagnosis of 4.45 mg/L. Applying this cutoff to the validation dataset yielded a sensitivity of 86.1% (95% CI: 85.0-87.1%) and specificity of 87.1% (95% CI: 86.7-87.5%). No statistically significant difference in diagnostic performance was observed between the validation and training sets.

Conclusion: This study represents the largest single clinical laboratory evaluation of an SF-CRP assay for PJI diagnosis. The optimal CRP cutoff (4.45 mg/L) for PJI, which yielded a sensitivity of 86.1% and a specificity of 87.1%, is specific to the assay methodology and laboratory performing the assay. We propose that an SF-CRP test with a laboratory-validated optimal clinical decision limit for PJI may be preferable, in a clinical diagnostic setting, to serum CRP tests that do not have laboratory-validated clinical decision limits for PJI.
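The abstract above says the Youden Index was used to choose the clinical decision limit. As a rough illustration of that idea only (not the authors' pipeline), the sketch below scans candidate cutoffs and keeps the one that maximizes J = sensitivity + specificity - 1. The function name, the "value >= cutoff means positive" convention, the tie-breaking rule, and the toy numbers are all assumptions made for this example; none of them come from the study.

```python
def youden_optimal_cutoff(values, labels):
    """Return (cutoff, sensitivity, specificity, J) for the cutoff that
    maximizes Youden's J = sensitivity + specificity - 1.

    Assumption: a sample is called positive when value >= cutoff.
    Ties are resolved in favor of the lowest cutoff.
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one infected and one not-infected sample")

    best = None
    # Every distinct observed value is a candidate cutoff.
    for cutoff in sorted(set(values)):
        true_pos = sum(1 for v, y in zip(values, labels) if y and v >= cutoff)
        true_neg = sum(1 for v, y in zip(values, labels) if not y and v < cutoff)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1
        if best is None or j > best[3]:
            best = (cutoff, sensitivity, specificity, j)
    return best


# Made-up biomarker values (mg/L) and reference labels (1 = infected).
# These are illustrative only and are NOT data from the study.
toy_values = [0.5, 1.0, 2.0, 3.0, 5.0, 6.0, 8.0, 12.0]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]

cutoff, sens, spec, j = youden_optimal_cutoff(toy_values, toy_labels)
print(f"cutoff={cutoff}, sensitivity={sens:.2f}, specificity={spec:.2f}, J={j:.2f}")
```

In a train/validation design like the one described, the cutoff would be chosen on the training set only and then applied unchanged to the validation set to report sensitivity and specificity.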

Section: Discussion (mentioning)

confidence: 99%

Synovial Fluid C-reactive Protein Clinical Decision Limit and Diagnostic Accuracy for Periprosthetic Joint Infection

Miamidian, Toler, McLaren et al. 2024

“…The trend favoring academic arthroplasty surgeons is again likely due to their increased familiarity with formally recommended testing thresholds and the manner in which to combine tests to match a PJI scoring system. The gap between physician diagnoses and the diagnosis of a PJI scoring system is not at all surprising considering the complexity of scoring systems and their well-described barriers to adoption [5,6], including low expert consensus [3], competing versions of the scoring system [1][2][3], multiple rules [1][2][3][4], and ambiguity in the laboratory test thresholds [10]. Therefore, while PJI scoring systems provide an objective standard by which to diagnose PJI in research, they may be too complex for routine use in clinical medicine [5,6].…”

Section: Discussion (mentioning)

confidence: 99%

“…The main advantage of these tests is that they are inexpensive due to their performance as a multipurpose test at most institutions. Unfortunately, these tests also may exhibit variability across laboratories [10,17] and require the physician to choose the appropriate PJI-optimized threshold to interpret the result, which may result in user error [10].…”

Section: Discussion (mentioning)

confidence: 99%

Physician Use of Multiple Criteria to Diagnose Periprosthetic Joint Infection May Be Less Accurate Than the Use of an Individual Test

Deirmengian, McLaren, Levine 2022

Introduction: Multiple-criterion scoring systems for periprosthetic joint infection (PJI) can be algorithmically implemented in research, diagnostically outperforming individual tests. This improved performance may be lost in the practice setting, where clinicians rarely utilize strict algorithms. The ability of physicians to interpret multiple criteria for PJI and confront the complexity of combining them into a final diagnosis has never been studied. This study assessed the diagnostic characteristics of physicians using multiple criteria to diagnose PJI and compared the physicians' diagnostic accuracy to that of individual tests.

Methods: A total of 12 physicians, including academic arthroplasty surgeons (N=4), community arthroplasty surgeons (N=4), and infectious disease (ID) specialists (N=4), were asked to use their routine clinical diagnostic practice to assign a diagnosis to 277 clinical vignettes using multiple preoperative laboratory criteria for PJI. The undecided rate, interobserver agreement, and accuracy of physicians were characterized relative to the 2013 Musculoskeletal Infection Society (MSIS) gold standard and compared to the accuracy of each individual laboratory test for PJI.

Results: Physicians interpreting multiple criteria for PJI demonstrated high undecided diagnosis rates (mean=23.5%), poor interobserver agreement (kappa range=0.49-0.63), and mean accuracy of 90.8% (range: 85.8%-97.4%) compared to the 2013 MSIS gold standard. The group of academic arthroplasty surgeons had a lower rate of undecided diagnoses than community arthroplasty surgeons (16.2% vs. 29.1%; p<0.0001) or ID specialists (16.2% vs. 25.1%; p<0.0001). Academic arthroplasty surgeons also exhibited a higher interobserver agreement than community arthroplasty surgeons (kappa = 0.63 (95% CI: 0.59-0.68) vs. 0.49 (95% CI: 0.44-0.54)). Mean physician accuracy (90.8%) was inferior to the alpha-defensin laboratory test (96.0%; p=0.0034) and the alpha-defensin lateral-flow test (94.6%; p=0.036), comparable to synovial fluid white blood cells (SF-WBC) (93.3%; p=0.17) and synovial fluid polymorphonuclear cell % (SF-PMN%) (94.0%; p=0.11), and superior to the erythrocyte sedimentation rate (ESR) (86.2%; p<0.0001) and C-reactive protein (CRP) (84.6%; p<0.0001). Only two academic arthroplasty surgeons in this study were able to outperform every individual test for PJI by combining multiple criteria to make a diagnosis.

Conclusion: Although multiple-criterion scoring systems may outperform individual tests for diagnosing PJI in the research setting, it appears that the complexity of using multiple tests to diagnose PJI causes indecision and variability among physicians. Physician use of multiple preoperative criteria to diagnose PJI is less accurate than the strict algorithmic calculation of the diagnosis as achieved in research. In fact, most physicians in this study would have improved their diagnostic accuracy for PJI by simply utilizing a single good test to make the diagnosis, instead of trying to combine multiple tests into a decision. We propose that l...
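The abstract above reports interobserver agreement as kappa values but does not say which kappa statistic was computed. As a hedged illustration of what such a number measures, the sketch below computes Cohen's kappa for two raters, which is an assumption on our part (a study with four raters per group may well have used a multi-rater variant). The function name and the toy ratings are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")

    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement from each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical diagnoses for ten vignettes:
# I = infected, N = not infected, U = undecided.
toy_rater_a = ["I", "I", "I", "I", "N", "N", "N", "N", "U", "U"]
toy_rater_b = ["I", "I", "I", "N", "N", "N", "N", "U", "U", "U"]

print(round(cohens_kappa(toy_rater_a, toy_rater_b), 3))
```

Kappa discounts the agreement two raters would reach by chance alone, which is why raw percent agreement and kappa can differ noticeably when one label dominates.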

Achieving High Accuracy in Predicting the Probability of Periprosthetic Joint Infection From Synovial Fluid in Patients Undergoing Hip or Knee Arthroplasty: The Development and Validation of a Multivariable Machine Learning Algorithm

Paranjape, Thai-Paquette, Miamidian et al. 2023

Background and objective: The current periprosthetic joint infection (PJI) diagnostic guidelines require clinicians to interpret and integrate multiple criteria into a complex scoring system. Also, PJI classifications are often inconclusive, failing to provide a clinical diagnosis. Machine learning (ML) models could be leveraged to reduce reliance on these complex systems and thereby reduce diagnostic uncertainty. This study aimed to develop an ML algorithm using synovial fluid (SF) test results to establish a PJI probability score.

Methods: We used a large clinical laboratory's dataset of SF samples, aspirated from patients with hip or knee arthroplasty as part of a PJI evaluation. Patient age and SF biomarkers [white blood cell count, neutrophil percentage (%PMN), red blood cell count, absorbance at 280 nm wavelength, C-reactive protein (CRP), alpha-defensin (AD), neutrophil elastase, and microbial antigen (MID) tests] were used for model development. Data preprocessing, principal component analysis, and unsupervised clustering (K-means) revealed four clusters of samples that naturally aggregated based on biomarker results. Analysis of the characteristics of each of these four clusters revealed three clusters (n=13,133) with samples having biomarker results typical of a PJI-negative classification and one cluster (n=4,032) with samples having biomarker results typical of a PJI-positive classification. A decision tree model, trained and tested independently of external diagnostic rules, was then developed to match the classification determined by the unsupervised clustering. The performance of the model was assessed versus a modified 2018 International Consensus Meeting (ICM) criteria, in both the test cohort and an independent unlabeled validation set of 5,601 samples. The SHAP (SHapley Additive exPlanations) method was used to explore feature importance.

Results: The ML model showed an area under the curve of 0.993, with a sensitivity of 98.8%, specificity of 97.3%, positive predictive value (PPV) of 92.9%, and negative predictive value (NPV) of 99.8% in predicting the modified 2018 ICM diagnosis among test set samples. The model maintained its diagnostic accuracy in the validation cohort, yielding 99.1% sensitivity, 97.1% specificity, 91.9% PPV, and 99.9% NPV. The model's inconclusive rate (diagnostic probability between 20-80%) in the validation cohort was only 1.3%, lower than that observed with the modified 2018 ICM PJI classification (7.4%; p<0.001). The SHAP analysis found that AD was the most important feature in the model, exhibiting dominance among >95% of "infected" and "not infected" diagnoses. Other important features were the sum of the MID test panel, %PMN, and SF-CRP.
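The abstract above reports sensitivity, specificity, PPV, and NPV, and defines an inconclusive result as a diagnostic probability between 20% and 80%. The sketch below shows, under stated assumptions, how those quantities relate to confusion-matrix counts and how a probability score could be mapped to the three labels. The function names, the treatment of exactly 20% and 80% as inconclusive, and the toy counts are assumptions for illustration; they are not the published model or its data.

```python
def classify_probability(probability, low=0.20, high=0.80):
    """Map a PJI probability score to a three-way label.

    Assumption: the 20-80% band is inclusive at both ends.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < low:
        return "not infected"
    if probability > high:
        return "infected"
    return "inconclusive"


def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # infected samples correctly flagged
        "specificity": tn / (tn + fp),  # not-infected samples correctly cleared
        "ppv": tp / (tp + fp),          # flagged samples that are truly infected
        "npv": tn / (tn + fn),          # cleared samples that are truly not infected
    }


# Hypothetical scores and counts, for illustration only.
toy_scores = [0.05, 0.50, 0.95]
print([classify_probability(p) for p in toy_scores])

toy_metrics = diagnostic_metrics(tp=90, fp=10, tn=180, fn=20)
print({name: round(value, 3) for name, value in toy_metrics.items()})
```

Unlike sensitivity and specificity, PPV and NPV depend on how common infection is in the tested population, so values reported for one cohort do not transfer directly to a cohort with a different prevalence.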