Type 2 diabetes is caused by the body's imbalanced glucose consumption and in turn causes problems with the immunological, neurological, and circulatory systems. Numerous studies have sought to predict the disease from a variety of clinical and pathological criteria, and as technology has advanced, several machine learning approaches have been applied to improve prediction accuracy. This study examines data preprocessing and how it affects machine learning algorithms. Two datasets were prepared for the experiments: LS, a locally developed and verified dataset, and PIMA, a dataset from Kaggle. In all, the study evaluates five machine learning algorithms and eight distinct scaling strategies. On the PIMA dataset, accuracy ranges from 46.99% to 69.88% when no preprocessing is applied and reaches up to 77.92% when scalers are used. Because the LS dataset is small and controlled, its accuracy without scalers falls no lower than 78.67%, and with two labels it rises to 100%.
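To illustrate why feature scaling can change a classifier's output, the following is a minimal sketch and not the study's actual pipeline. It assumes two of the many possible scalers (min-max and standard scaling) and a 1-nearest-neighbour classifier chosen purely for illustration; the abstract does not name the eight scalers or five algorithms. The four training rows, the two features (a glucose value and a small-range score), and the helper names `fit_min_max`, `fit_standard`, `transform`, and `nearest_label` are all hypothetical.

```python
import math
from statistics import mean, pstdev


def fit_min_max(rows):
    """Return per-column (offset, scale) pairs that map each column to [0, 1]."""
    cols = list(zip(*rows))
    return [(min(c), (max(c) - min(c)) or 1.0) for c in cols]


def fit_standard(rows):
    """Return per-column (offset, scale) pairs giving zero mean and unit variance."""
    cols = list(zip(*rows))
    return [(mean(c), pstdev(c) or 1.0) for c in cols]


def transform(rows, params):
    """Apply (value - offset) / scale column by column."""
    return [[(v - off) / scale for v, (off, scale) in zip(row, params)]
            for row in rows]


def nearest_label(train, labels, query):
    """Label of the training row closest to the query (Euclidean distance)."""
    dists = [math.dist(row, query) for row in train]
    return labels[dists.index(min(dists))]


# Hypothetical toy data: [glucose, small-range score]; NOT taken from PIMA or LS.
train = [[100.0, 0.2], [105.0, 0.3], [110.0, 1.8], [115.0, 1.6]]
labels = [0, 0, 1, 1]
query = [109.0, 0.25]

# Without scaling, the wide-range glucose column dominates the distance.
preds = {"none": nearest_label(train, labels, query)}

# Scaler parameters are fitted on the training rows only, then reused for the query.
for name, fit in (("min-max", fit_min_max), ("standard", fit_standard)):
    params = fit(train)
    scaled_train = transform(train, params)
    scaled_query = transform([query], params)[0]
    preds[name] = nearest_label(scaled_train, labels, scaled_query)

print(preds)  # {'none': 1, 'min-max': 0, 'standard': 0}
```

On this toy data the unscaled prediction is decided almost entirely by the glucose column, while both scalers put the two features on a comparable footing and the predicted label changes. This is one plausible mechanism by which preprocessing might shift accuracy for distance-based models; other algorithms may be affected differently or not at all.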