This paper presents a comparison of three strategies for managing the imbalance problem: undersampling, SMOTE and Weighted SVM. Undersampling is a strategy where the samples of the majority class are discarded; SMOTE (Syn thetic Minority Over-sampling Technique) is a method in which synthetic samples of the minority class are added to the dataset; Weighted SVM keeps the number of samples of each class but assigns weights in the training of the SVM, so it can improve its performance for the minority class. Results show that Weighted SVM and SMOTE achieved comparable results, although Weighted SVM requires less computational effort. Undersampling, on the other hand, achieved lower performance results presumably due to the loss of information produced when discarding data.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations鈥揷itations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.