Background: This study aims to show how imbalanced data and typical evaluation methods can lead to misleading assessments when developing machine learning-based models for preoperative thyroid nodule screening. Study design: A retrospective study. Methods: Ultrasonography features for 431 thyroid nodule cases were extracted from the medical records of 313 patients in Babol, Iran. Because thyroid nodules are commonly benign, the relevant data are usually imbalanced across classes, which can bias learning models toward the majority class. To address this, a hybrid resampling method called the Smote- was used to create balanced data. The support vector classification (SVC) algorithm was then trained in Python on the balanced and unbalanced datasets as Models 2 and 3, respectively. Their performance was compared with that of a conventionally fitted logistic regression model (Model 1). Results: The prevalence of malignant nodules was 14% (n = 61). In addition, 87% of the patients in this study were women; however, the prevalence of malignancy did not differ by gender. The accuracy, area under the curve, and geometric mean values were estimated at 92.1%, 93.2%, and 76.8% for Model 1; 91.3%, 93%, and 77.6% for Model 2; and 91%, 92.6%, and 84.2% for Model 3, respectively. The results also identified microcalcification, taller-than-wide shape, and lack of ISO and hyperechogenicity features as the most influential predictors of malignancy. Conclusion: Paying attention to data challenges, such as class imbalance, and using appropriate evaluation criteria can improve the performance of machine learning models for preoperative thyroid nodule screening.
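The abstract's point about misleading evaluation can be illustrated with a minimal sketch. It assumes the geometric mean is defined in the usual way for imbalanced classification, as the square root of sensitivity times specificity (the abstract does not give the formula). The class counts (61 malignant out of 431) come from the abstract; the "all benign" classifier and the function names are hypothetical, chosen only for illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def geometric_mean(y_true, y_pred):
    """Assumed definition: sqrt(sensitivity * specificity)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Class counts reported in the abstract: 61 malignant (1), 370 benign (0).
y_true = [1] * 61 + [0] * 370

# Hypothetical degenerate classifier that labels every nodule benign.
y_all_benign = [0] * 431

acc = accuracy(y_true, y_all_benign)
gmean = geometric_mean(y_true, y_all_benign)
print(f"accuracy = {acc:.3f}, geometric mean = {gmean:.3f}")
```

Under these assumptions, a classifier that never detects a malignant nodule still scores about 86% accuracy, while its geometric mean is zero. This is consistent with the abstract's observation that the three models have similar accuracy and AUC but differ in geometric mean.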
Background This study sought to develop machine learning-based classification models to predict the success of intrauterine insemination (IUI) therapy. We also sought to compare models fitted on balanced data, produced with two different resampling methods, against models fitted on the original data with imbalanced labels. Finally, we compared models fitted with all features against models fitted with optimized feature sets obtained from several feature selection techniques. Methods The data for this cross-sectional study were collected from 546 infertile couples undergoing IUI at the Fatemehzahra Infertility Research Center, Babol, North of Iran. Logistic regression (LR), support vector classification (SVC), random forest (RF), Extreme Gradient Boosting (XGBoost), and stacking generalization (Stack) were used as the machine learning classifiers to predict IUI success, implemented in Python v3.7. We employed the Smote-Tomek (Stomek) and Smote-ENN (SENN) resampling methods to address the imbalance problem in the original dataset. Furthermore, to improve model performance, mutual information classification (MIC-FS), genetic algorithm (GA-FS), and random forest (RF-FS) were used to select the ideal feature sets for model development. Results In this study, 28% of patients undergoing IUI treatment achieved a successful pregnancy. The average ages of women and men were 24.98 and 29.85 years, respectively. The calibration plots for IUI success prediction showed that, among the feature selection methods, RF-FS, and among the datasets used to fit the models, the dataset balanced with the Stomek method, yielded better-calibrated predictions than the other methods. Finally, the Brier scores for the LR, SVC, RF, XGBoost, and Stack models fitted on the Stomek dataset with the feature set chosen by RF-FS were 0.202, 0.183, 0.158, 0.129, and 0.134, respectively. The models identified duration of infertility, male and female age, sperm concentration, and sperm motility grading score as the most predictive factors of IUI success. Conclusion The XGBoost prediction model from this study can be used to predict the individual success of IUI for each couple before initiating therapy.
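As a reading aid for the Brier scores above, here is a minimal sketch of how the metric is computed: the mean squared difference between predicted probabilities and observed 0/1 outcomes, where lower is better. The 100-couple cohort and the constant-probability predictor are hypothetical illustrations built around the 28% success rate from the abstract; they are not the study's data or models.

```python
def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and binary outcomes."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical cohort of 100 couples with a 28% success rate,
# matching the prevalence reported in the abstract.
y_true = [1] * 28 + [0] * 72

# Uninformative reference predictor: the same probability (the base rate)
# for every couple.
baseline = brier_score(y_true, [0.28] * len(y_true))
print(f"base-rate Brier score = {baseline:.4f}")
```

If the scores were computed on data with the original 28% success rate (an assumption, since the abstract does not say which data were used for evaluation), a base-rate predictor would score 0.28 × 0.72 ≈ 0.202, which gives a rough reference point for interpreting the reported values.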
Objectives The androgen receptor (AR) plays a key role in the onset and progression of prostate cancer. Epigallocatechin-3-gallate (EGCG), a polyphenolic compound and the active ingredient in green tea, modulates gene expression through epigenetic alterations. Previous studies have shown that EGCG at low concentrations reduces the expression of AR and prostate-specific antigen (PSA) in the LNCaP prostate cancer cell line. This study investigated the effect of higher EGCG concentrations on AR and PSA expression in the LNCaP prostate cancer cell line. Methods The LNCaP prostate cancer cell line was used, and after the MTT test, EGCG concentrations of 40, 60 and 80 μg/mL were used for treatment. The expression of the AR and PSA genes was then evaluated by RT-PCR, and AR protein expression was assessed by Western blotting. Results Treatment of LNCaP cells with EGCG reduced cell proliferation. The IC50 value was 42.7 μg/mL under the experimental conditions. EGCG at concentrations of 40 and 80 μg/mL increased the expression of AR and PSA (p<0.05). Conclusions The effect of EGCG on AR expression differed by concentration: unlike in previous studies, higher concentrations of EGCG (40 and 80 μg/mL) increased AR and PSA expression. Given the toxic effects of high EGCG concentrations on cancer cells and the possibility of effects on normal cells, more caution should be exercised in its use.