Cancer is one of the leading diseases in the world, with the number of new patients and deaths increasing every year. Hence, it is among the most feared diseases of our time. Lung cancer and breast cancer are believed to be the most common types of cancer, and both belong to the same group of cancers: carcinoma. For this type of cancer, early detection is of great importance for patient survival. Because the disease has unfortunately been around for many years, today we have datasets containing the information needed for diagnosing and predicting cancer. Here, predicting cancer means deciding whether a tumor is malignant or benign. The answer lies in the values of the parameters recorded when the disease was discovered. Machine learning plays a crucial role in predicting cancer, because algorithms such as Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF) are designed to find patterns in large sets of data and to make decisions based on them. In this paper, the author's goal is to examine how machine learning, applied in practice to public datasets, can help with early breast cancer diagnosis and, hopefully, help save more lives.