The objective of this research is to investigate the effect of randomizing the data on computer-based feature selection for diagnosing coronary artery disease. The Cleveland dataset was randomized because the performance value differs from one experiment to the next. To handle this variation, the performance values produced by randomizing the dataset are assumed to follow a Gaussian probability distribution, and the final performance is taken as the mean of all performance values. In this research, computer-based feature selection (CFS), medical-expert-based feature selection (MFS), and a combination of the two (MFS+CFS) are also applied to improve the performance of the classification algorithm. This research also found a characteristic of the Cleveland dataset that differs from what previous work reported. This difference can affect both the feature selection result and the final performance. In summary, randomizing the dataset and computing the final performance as the mean over all experiments gives a general representation of the performance of the classification algorithm.
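The repeat-and-average procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the synthetic one-feature data, the nearest-centroid classifier, the 70/30 holdout split, and the choice of 30 runs are all hypothetical placeholders, not the Cleveland dataset or the classifiers and settings used in this research.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple

# Hypothetical toy format: (feature value, class label).
Sample = Tuple[float, int]


def make_toy_data(n: int = 200, seed: int = 1) -> List[Sample]:
    """Build synthetic two-class data (a stand-in, not the Cleveland dataset)."""
    rng = random.Random(seed)
    # Class 0 is centred at 0.0 and class 1 at 2.0, both with unit variance.
    return [(rng.gauss(2.0 * (i % 2), 1.0), i % 2) for i in range(n)]


def holdout_accuracy(samples: Sequence[Sample], train_fraction: float = 0.7) -> float:
    """Fit a nearest-centroid rule on the first part and score it on the rest."""
    split = int(len(samples) * train_fraction)
    train, test = samples[:split], samples[split:]
    centroid0 = statistics.mean(x for x, y in train if y == 0)
    centroid1 = statistics.mean(x for x, y in train if y == 1)
    correct = 0
    for x, y in test:
        predicted = 1 if abs(x - centroid1) < abs(x - centroid0) else 0
        correct += predicted == y
    return correct / len(test)


def repeated_randomized_runs(
    samples: Sequence[Sample],
    evaluate: Callable[[Sequence[Sample]], float],
    n_runs: int = 30,
    seed: int = 0,
) -> List[float]:
    """Shuffle the data before each run and collect one performance value per run."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_runs):
        shuffled = list(samples)
        rng.shuffle(shuffled)  # a different ordering gives a different split
        scores.append(evaluate(shuffled))
    return scores


def summarize(scores: Sequence[float]) -> Tuple[float, float]:
    """Return the mean (the final performance) and the sample standard deviation."""
    return statistics.mean(scores), statistics.stdev(scores)


scores = repeated_randomized_runs(make_toy_data(), holdout_accuracy)
mean_score, std_score = summarize(scores)
print(f"runs={len(scores)} mean={mean_score:.3f} std={std_score:.3f}")
```

Reporting the standard deviation alongside the mean shows how much a single randomized experiment can deviate from the final value, which is the variation that motivates averaging in the first place.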