The use of machine learning in healthcare has grown rapidly in recent years, with the potential to improve diagnosis, treatment, and patient outcomes. However, issues of bias and fairness in these models must be addressed to ensure equitable treatment for all patients regardless of gender. In this study, we evaluate whether undersampling, balancing the dataset both across positive and negative samples and across male and female gender groups, yields more robust, accurate, and fair machine learning models for predicting diabetes and prediabetes. We apply four classifiers, Logistic Regression, Random Forest, K-Nearest Neighbors, and Multilayer Perceptron, to the Behavioral Risk Factor Surveillance System (BRFSS 2015) dataset, which is inherently imbalanced, training each on both the original and the balanced dataset. Our results indicate that balancing the dataset through undersampling improves the predictive performance of all four algorithms: Logistic Regression's Precision increases from 0.53 to 0.74, Random Forest's F1-score from 0.25 to 0.75, K-Nearest Neighbors' Recall from 0.19 to 0.73, and Multilayer Perceptron's F1-score from 0.38 to 0.78. Moreover, our findings reveal that undersampling can improve fairness by mitigating gender bias, as measured by the Disparate Impact Ratio (DIR). When trained on the balanced dataset, the DIR moves closer to 1 for most algorithms: from 1.16 to 1.12 for Logistic Regression, from 1.14 to 1.08 for Random Forest, and from 1.05 to 1.02 for K-Nearest Neighbors, indicating that the models become fairer towards both gender groups. Our study demonstrates that undersampling can be a promising step towards building more balanced, fair, and accurate machine learning models across gender subgroups.
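The balancing step described above, undersampling jointly over the outcome label and the gender attribute, could be sketched as follows. This is an illustrative implementation under the assumption that each (label, gender) cell is randomly downsampled to the size of the smallest cell; the paper's exact sampling procedure may differ, and the function name `undersample_balanced` is our own.

```python
import numpy as np

def undersample_balanced(X, y, gender, rng=None):
    """Randomly undersample so every (label, gender) cell ends up the
    same size as the smallest cell, balancing class and gender jointly.

    Assumed sketch, not the paper's verbatim procedure.
    """
    rng = np.random.default_rng(rng)
    # Enumerate every (label, gender) combination present in the data.
    cells = [(yl, g) for yl in np.unique(y) for g in np.unique(gender)]
    idx_per_cell = [np.flatnonzero((y == yl) & (gender == g))
                    for yl, g in cells]
    # Target size: the smallest cell's count.
    n = min(len(ix) for ix in idx_per_cell)
    # Sample n rows without replacement from each cell, then shuffle.
    keep = np.concatenate(
        [rng.choice(ix, size=n, replace=False) for ix in idx_per_cell]
    )
    rng.shuffle(keep)
    return X[keep], y[keep], gender[keep]
```

After this step, every classifier sees equal numbers of positive and negative examples within each gender group, which is the condition the reported performance and fairness gains rest on.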
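The fairness metric used above, the Disparate Impact Ratio, is conventionally defined as the rate of favorable predictions for the unprivileged group divided by that rate for the privileged group, with values near 1 indicating parity. A minimal sketch of that standard definition (the group encoding of 0/1 here is an assumption for illustration):

```python
import numpy as np

def disparate_impact_ratio(y_pred, group):
    """DIR = P(y_hat = 1 | group = 0) / P(y_hat = 1 | group = 1).

    Convention assumed here: group 0 is unprivileged, group 1 is
    privileged, and a prediction of 1 is the favorable outcome.
    """
    rate_unpriv = y_pred[group == 0].mean()  # favorable rate, group 0
    rate_priv = y_pred[group == 1].mean()    # favorable rate, group 1
    return rate_unpriv / rate_priv
```

Under this definition, the reported shifts (e.g. 1.16 to 1.12 for Logistic Regression) correspond to the two groups' favorable-prediction rates converging after balancing.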