Heart attacks and strokes occur when blood vessels become narrowed or blocked. If the disease is not identified at an early stage, there is a high probability of losing one's life. We propose an algorithm that takes advantage of the inferences provided by unsupervised learning. The proposed framework leverages the unsupervised information hidden in the data to improve the accuracy of the classifier. Clustering yields a label for each sample, and this cluster label is included as an additional feature. The augmented feature set is then fed to the classifier to predict the disease. To reduce the uncertainty of the clustering step, both k-means and spectral clustering are used, and the cluster label is included as a feature only when the labels predicted by the two methods match. To evaluate the effectiveness of the proposed algorithm, two heart disease benchmark datasets, Framingham and Statlog, are used. For the classification task, logistic regression, Naïve Bayes, and SVM are used to keep the model simple yet effective. Based on the experimental results, we claim that augmenting the feature set with the cluster label obtained by the proposed method significantly improves performance on the heart disease prediction task.
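The agreement step described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions: the function names (`align_labels`, `consensus_labels`, `augment_features`) are hypothetical, the toy label lists stand in for the output of k-means and spectral clustering (for example scikit-learn's `KMeans` and `SpectralClustering`), and samples on which the two methods disagree are marked with a sentinel value of -1, a choice the abstract does not specify. Because cluster IDs are arbitrary, the sketch first relabels one clustering to agree with the other as often as possible before comparing them.

```python
from itertools import permutations


def align_labels(reference, other):
    """Relabel `other` so its cluster IDs match `reference` as often as possible.

    Cluster IDs are arbitrary, so cluster 0 of one method may correspond to
    cluster 1 of the other. All permutations are tried, which is only
    practical for a small number of clusters.
    """
    ids = sorted(set(reference) | set(other))
    best_map, best_hits = None, -1
    for perm in permutations(ids):
        mapping = dict(zip(ids, perm))
        hits = sum(mapping[o] == r for r, o in zip(reference, other))
        if hits > best_hits:
            best_map, best_hits = mapping, hits
    return [best_map[o] for o in other]


def consensus_labels(labels_a, labels_b, no_consensus=-1):
    """Keep a cluster label only where both clusterings agree.

    Assumption: disagreeing samples receive the sentinel `no_consensus`.
    """
    aligned_b = align_labels(labels_a, labels_b)
    return [a if a == b else no_consensus for a, b in zip(labels_a, aligned_b)]


def augment_features(rows, cluster_feature):
    """Append the consensus cluster label as one extra feature per row."""
    return [list(row) + [c] for row, c in zip(rows, cluster_feature)]


# Toy data (made up for illustration, not taken from Framingham or Statlog).
rows = [[63, 1], [45, 0], [58, 1], [39, 0], [51, 1]]
kmeans_labels = [0, 0, 1, 1, 0]      # stand-in for k-means output
spectral_labels = [1, 1, 0, 0, 0]    # stand-in for spectral clustering output

consensus = consensus_labels(kmeans_labels, spectral_labels)
augmented = augment_features(rows, consensus)
print(consensus)
print(augmented)
```

The rows in `augmented` would then be passed to a classifier such as logistic regression, Naïve Bayes, or an SVM. Whether disagreeing samples should carry a sentinel, be dropped, or be handled some other way is a design choice that this sketch leaves open.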