The use of machine learning in healthcare has grown rapidly in recent years, with the potential to improve diagnosis, treatment, and patient outcomes. However, issues of bias and fairness in these models must be addressed to ensure equitable treatment for all patients regardless of gender. In this study, we evaluate whether undersampling, balancing the dataset both across positive and negative samples and across male and female gender groups, yields more robust, accurate, and fair machine learning models for predicting diabetes and prediabetes. We apply four classifiers, Logistic Regression, Random Forest, K-Nearest Neighbors, and Multilayer Perceptron, to the Behavioral Risk Factor Surveillance System (BRFSS 2015) dataset, which is inherently imbalanced, training each on both the original and the balanced dataset. Our results indicate that balancing the dataset through undersampling improves the predictive performance of all four algorithms: Logistic Regression's Precision increases from 0.53 to 0.74, Random Forest's F1-score from 0.25 to 0.75, K-Nearest Neighbors' Recall from 0.19 to 0.73, and Multilayer Perceptron's F1-score from 0.38 to 0.78. Moreover, our findings reveal that undersampling can improve fairness by mitigating gender bias, as measured by the Disparate Impact Ratio (DIR). When trained on the balanced dataset, the DIR moves closer to 1 for most algorithms: from 1.16 to 1.12 for Logistic Regression, from 1.14 to 1.08 for Random Forest, and from 1.05 to 1.02 for K-Nearest Neighbors, indicating that the models become fairer towards both gender groups. Our study demonstrates that undersampling can be a promising step towards building more balanced, fair, and accurate machine learning models across gender subgroups.
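The balancing step described above, undersampling jointly over the outcome label and the gender attribute, could be sketched as follows. This is an illustrative implementation under the assumption that each (label, gender) cell is randomly downsampled to the size of the smallest cell; the paper's exact sampling procedure may differ, and the function name `undersample_balanced` is our own.

```python
import numpy as np

def undersample_balanced(X, y, gender, rng=None):
    """Randomly undersample so every (label, gender) cell ends up the
    same size as the smallest cell, balancing class and gender jointly.

    Assumed sketch, not the paper's verbatim procedure.
    """
    rng = np.random.default_rng(rng)
    # Enumerate every (label, gender) combination present in the data.
    cells = [(yl, g) for yl in np.unique(y) for g in np.unique(gender)]
    idx_per_cell = [np.flatnonzero((y == yl) & (gender == g))
                    for yl, g in cells]
    # Target size: the smallest cell's count.
    n = min(len(ix) for ix in idx_per_cell)
    # Sample n rows without replacement from each cell, then shuffle.
    keep = np.concatenate(
        [rng.choice(ix, size=n, replace=False) for ix in idx_per_cell]
    )
    rng.shuffle(keep)
    return X[keep], y[keep], gender[keep]
```

After this step, every classifier sees equal numbers of positive and negative examples within each gender group, which is the condition the reported performance and fairness gains rest on.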
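The fairness metric used above, the Disparate Impact Ratio, is conventionally defined as the rate of favorable predictions for the unprivileged group divided by that rate for the privileged group, with values near 1 indicating parity. A minimal sketch of that standard definition (the group encoding of 0/1 here is an assumption for illustration):

```python
import numpy as np

def disparate_impact_ratio(y_pred, group):
    """DIR = P(y_hat = 1 | group = 0) / P(y_hat = 1 | group = 1).

    Convention assumed here: group 0 is unprivileged, group 1 is
    privileged, and a prediction of 1 is the favorable outcome.
    """
    rate_unpriv = y_pred[group == 0].mean()  # favorable rate, group 0
    rate_priv = y_pred[group == 1].mean()    # favorable rate, group 1
    return rate_unpriv / rate_priv
```

Under this definition, the reported shifts (e.g. 1.16 to 1.12 for Logistic Regression) correspond to the two groups' favorable-prediction rates converging after balancing.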