2023
DOI: 10.1186/s13040-023-00322-4
The Matthews correlation coefficient (MCC) should replace the ROC AUC as the standard metric for assessing binary classification

Abstract: Binary classification is a common task for which machine learning and computational statistics are used, and the area under the receiver operating characteristic curve (ROC AUC) has become the common standard metric to evaluate binary classifications in most scientific fields. The ROC curve has true positive rate (also called sensitivity or recall) on the y axis and false positive rate on the x axis, and the ROC AUC can range from 0 (worst result) to 1 (perfect result). The ROC AUC, however, has several flaws …
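The two metrics the abstract contrasts can be sketched in plain Python. This is an illustrative sketch only: the data below is invented, and ROC AUC is computed via the Mann-Whitney interpretation (the probability that a randomly chosen positive outranks a randomly chosen negative), which matches the curve-area definition.

```python
import math

def roc_auc(y_true, y_score):
    """ROC AUC via the Mann-Whitney U statistic: the probability that a
    random positive scores higher than a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mcc(y_true, y_pred):
    """Matthews correlation coefficient from the 2x2 confusion matrix."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Illustrative labels and classifier scores (not from the paper)
y_true  = [0, 0, 0, 0, 1, 1, 1, 1]
y_score = [0.1, 0.3, 0.6, 0.8, 0.4, 0.7, 0.9, 0.95]
y_pred  = [int(s >= 0.5) for s in y_score]  # hard labels at threshold 0.5

print(roc_auc(y_true, y_score))  # 0.8125 (threshold-free, range 0..1)
print(mcc(y_true, y_pred))       # ~0.258 (threshold-dependent, range -1..1)
```

Note how the same classifier can look good by ROC AUC yet mediocre by MCC, since MCC is evaluated at a fixed threshold and penalizes all four confusion-matrix cells.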

Cited by 113 publications (67 citation statements). References 66 publications.
“…Generally, all annotators showed high correlation [ 20 ] to “gold standard” annotations of CXR text reports ( Table 2 a,b). This finding was comparable to a previous study which showed a similar level of agreement between radiologists and non-radiological physicians and medical students when reading and comprehending radiology reports [ 26 ].…”
Section: Discussion (mentioning; confidence: 99%)
“…Matthew’s correlation coefficient (MCC) [ 20 ] was used to compare annotator performance to “gold standard” labeling and to compare annotators’ performance to each other. The MCC was based on values selected for a 2 × 2 confusion matrix ( Table 1 ) where true positive (TP) described the number of labels that matched “gold standard” labels for all positive and negative findings separately.…”
Section: Methods (mentioning; confidence: 99%)
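The methods passage above describes computing MCC from a 2 × 2 confusion matrix of label agreement. A minimal sketch of that calculation, with purely hypothetical annotator-vs-gold-standard counts (not taken from the cited study):

```python
import math

# Hypothetical counts for one finding: annotator labels vs. "gold standard"
# (illustrative only; TP/FP/TN/FN as defined in the quoted methods text)
tp, fp, tn, fn = 90, 5, 80, 10

# MCC from the four cells of the 2x2 confusion matrix
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
print(round(mcc, 3))  # ~0.839
```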
“…Matthews Correlation Coefficient is used as a single metric for direct comparison between the developed models. Because MCC is known to be superior to any other performance metric for ranking the binary classification models …”
Section: Methods (mentioning; confidence: 99%)
“…Because MCC is known to be superior to any other performance metric for ranking the binary classification models. [25]…”
Section: Machine Learning Algorithms (mentioning; confidence: 99%)