published in the May issue of Academic Emergency Medicine. The aim of the authors was to evaluate the diagnostic value of point-of-care ultrasound (POCUS) in the clinical decision making of emergency physicians (EPs) for acute appendicitis (AA) in the emergency department (ED).1 A total of 264 patients were included in a prospective observational clinical study, and based on their results 169 (64%) had a diagnosis of AA. All patients were examined initially with POCUS by EPs and then with radiology-performed US (RADUS) by radiologists. The sensitivity, specificity, positive likelihood ratio (LR+), and negative likelihood ratio (LR−) of the US examinations were 92.3%, 95.8%, 21.9, and 0.08 for POCUS and 76.9%, 97.8%, 36.4, and 0.24 for RADUS, respectively. The inter-rater agreement between EPs and radiologists was expressed by Cohen's kappa, which showed moderate consistency between POCUS and RADUS results (κ = 0.67; 95% confidence interval [CI] = 0.57–0.75).

However, these results are not the most appropriate estimates for evaluating diagnostic value. Diagnostic value should be considered as both diagnostic accuracy (validity) and diagnostic precision (reliability or agreement). Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), LR+ (ranging from 1 to infinity; the higher the LR+, the more accurate the test), and LR− (ranging from 0 to 1; the lower the LR−, the more accurate the test), as well as the diagnostic odds ratio (the ratio of true to false results), are the most appropriate estimates for evaluating the validity of a test against a criterion standard. It is therefore better to report all of these validity estimates together; otherwise the final interpretation will be confusing, because by sensitivity and LR−, POCUS is more accurate than RADUS, whereas by specificity and LR+, RADUS is more accurate than POCUS!
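To illustrate, the full panel of validity estimates can be derived from the 2 × 2 table of the index test against the criterion standard. The cell counts below for POCUS (156 true positives, 13 false negatives, 4 false positives, 91 true negatives) are our own approximate reconstruction from the reported sensitivity, specificity, and group sizes, not figures published by the authors:

```python
def validity_estimates(tp: int, fp: int, fn: int, tn: int) -> dict:
    """All standard validity estimates from a 2x2 table
    (index test vs. criterion standard)."""
    sens = tp / (tp + fn)              # sensitivity
    spec = tn / (tn + fp)              # specificity
    return {
        "sensitivity": sens,
        "specificity": spec,
        "PPV": tp / (tp + fp),         # positive predictive value
        "NPV": tn / (tn + fn),         # negative predictive value
        "LR+": sens / (1 - spec),      # 1 to infinity; higher = more accurate
        "LR-": (1 - sens) / spec,      # 0 to 1; lower = more accurate
        "DOR": (tp * tn) / (fp * fn),  # diagnostic odds ratio
    }

# Hypothetical POCUS counts reconstructed from the reported
# sensitivity (92.3%) and specificity (95.8%); 169 AA, 95 non-AA.
pocus = validity_estimates(tp=156, fp=4, fn=13, tn=91)
print({k: round(v, 2) for k, v in pocus.items()})
```

Reporting the same panel side by side for POCUS and RADUS would make the apparently contradictory sensitivity/specificity comparison explicit for the reader.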
Moreover, the sensitivity and specificity reported in their results are more important for public health purposes, whereas PPV and NPV are more useful for clinical purposes. Furthermore, it is important to know that for clinical purposes the diagnostic added value should be reported using receiver operating characteristic (ROC) analysis, because all of the above validity estimates can be acceptable while the diagnostic added value remains clinically negligible.[2-8]

Reliability (precision or agreement), a distinct methodologic aspect of diagnostic value, should also be assessed with an appropriate estimate; for qualitative variables, weighted kappa should be used with caution. Two important weaknesses of Cohen's kappa for assessing agreement on a qualitative variable are as follows. First, it depends on the prevalence in each category, which means it is possible to obtain different κ values with the same percentages in both the concordant and discordant cells. Table 1 shows that in both situations (a) and (b) the prevalence of the concordant cells is 90% and of the discordant cells 10%; nevertheless, the kappa values differ (0.44, moderate, and 0.80, very good, respectively). K...
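The prevalence dependence can be sketched numerically. The cell counts below are hypothetical, chosen only to reproduce the kappa values described for Table 1: in situation (a) one category dominates (85/5/5/5), while in situation (b) the categories are balanced (45/5/5/45); both tables have 90% concordant and 10% discordant observations:

```python
def cohens_kappa(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa for a 2x2 agreement table:
    a and d are the concordant cells, b and c the discordant cells."""
    n = a + b + c + d
    po = (a + d) / n                                      # observed agreement
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2   # chance agreement
    return (po - pe) / (1 - pe)

# Situation (a): unbalanced prevalence -> moderate agreement
kappa_a = cohens_kappa(85, 5, 5, 5)
# Situation (b): balanced prevalence -> very good agreement
kappa_b = cohens_kappa(45, 5, 5, 45)
print(round(kappa_a, 2), round(kappa_b, 2))  # 0.44 0.8
```

With identical observed agreement (90%), the chance-expected agreement rises from 0.50 in (b) to 0.82 in (a), which is what pulls kappa down from 0.80 to 0.44.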