The class-imbalance problem arises when the majority class contains far more data than the minority class. Traditional classifiers handle this problem poorly because they focus on the data in the majority class at the expense of the data in the minority class, and consequently tend to predict incoming data as belonging to the majority class. Under-sampling is an efficient way to handle this problem because it selects representatives of the data in the majority class; for this reason, under-sampling requires a shorter training period than over-sampling. Its main drawback is that selecting representatives is likely to discard important information in the majority class. To overcome this drawback, we propose a cluster-based under-sampling method. We use a clustering algorithm with a performance guarantee, the k-centers algorithm, to cluster the data in the majority class and select representative data in several proportions, and then combine them with all the data in the minority class to form a training set. In this paper, we compare our approach with a k-means-based approach on five datasets from UCI using two classifiers: 5-nearest neighbors and the C4.5 decision tree. Performance is measured by Precision, Recall, F-measure, and Accuracy. The experimental results show that our approach scores higher than the k-means approach on every measure except Precision, on which the two approaches score the same.
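The cluster-based under-sampling idea in this abstract can be illustrated with a minimal sketch. The abstract does not say which k-centers variant is used, so the code below assumes the greedy farthest-first traversal, a standard 2-approximation for the k-center problem, and assumes that the chosen centers themselves serve as the majority-class representatives. The function names, the `ratio` parameter (representatives per minority sample), and the 0/1 label encoding are hypothetical and do not come from the paper.

```python
import math
import random


def farthest_first_centers(points, k, seed=0):
    """Return indices of up to k centers chosen by farthest-first traversal.

    Assumption: this greedy 2-approximation stands in for the paper's
    "k-centers algorithm"; the abstract does not specify the variant.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    rng = random.Random(seed)
    first = rng.randrange(len(points))
    centers = [first]
    # dist[i] is the distance from point i to its nearest chosen center.
    dist = [math.dist(p, points[first]) for p in points]
    while len(centers) < k:
        nxt = max(range(len(points)), key=dist.__getitem__)
        if dist[nxt] == 0.0:
            break  # every remaining point coincides with a chosen center
        centers.append(nxt)
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < dist[i]:
                dist[i] = d
    return centers


def undersample(majority, minority, ratio=1.0, seed=0):
    """Build a training set from majority representatives plus all minority data.

    ratio is a hypothetical knob for the "several proportions" in the
    abstract: the number of representatives is about ratio * len(minority).
    """
    k = max(1, min(len(majority), round(ratio * len(minority))))
    idx = farthest_first_centers(majority, k, seed)
    X = [majority[i] for i in idx] + list(minority)
    y = [0] * len(idx) + [1] * len(minority)  # 0 = majority, 1 = minority
    return X, y


majority = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (20, 1)]
minority = [(5, 5), (6, 5), (5, 6)]
X, y = undersample(majority, minority, ratio=1.0)
```

With `ratio=1.0` the sketch keeps as many majority representatives as there are minority samples; other values of `ratio` give other class proportions. Because farthest-first traversal always picks the point farthest from the current centers, the representatives spread across the majority class rather than concentrating in its dense regions.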
Flower clusters in longan production can be separated into two types: those with young leaves underneath and those without. Each type of cluster requires a different chemical fertilizer treatment. Differentiating the types of longan flower cluster is challenging because the color tones of a young leaf and a fully mature leaf are extremely similar. This paper presents image processing techniques to distinguish the two types of longan flower cluster, immature and mature. Furthermore, if an immature flower cluster is detected, the leaves beneath the cluster must then be classified according to whether most of them are young or fully grown. Since the colors of longan leaves are all green and very similar, an appropriate color space must be chosen. The color space must also be selected carefully when deciding whether a fully grown flower cluster is present, because the cluster's color is very close to that of the young leaves. After conversion into an appropriate color space for each classification process, changes in environmental brightness or illumination are another issue that must be handled carefully. Several image enhancement methods, chosen adaptively according to the requirements of each classification process, are therefore applied to adjust the tone contrast and produce the high-contrast leaf tones needed to analyze the type of longan flower cluster. The correctly retrieved information then helps to determine the required chemical substance. In this paper, we apply simple, well-known algorithms that suit our purposes in each processing step while still providing perfect classification results. In our experiment, we use 150 images to separate the two types of flower cluster, which yields an accuracy of 100%.
Keywords: longan flower cluster; mature flower cluster; young leaf; image processing; image enhancement
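The abstract names neither the color space nor the enhancement methods it uses, so the following is only an illustrative sketch of the general pattern it describes: convert to a color space that separates brightness from hue, then stretch the contrast of one channel. It assumes HSV as the color space and histogram equalization of the V channel as the enhancement step; both choices, and the function name `equalize_value_channel`, are assumptions rather than the authors' method.

```python
import colorsys


def equalize_value_channel(pixels, levels=256):
    """Histogram-equalize brightness while keeping hue and saturation.

    pixels is a list of (r, g, b) integer tuples in the range 0..255.
    Assumption: HSV and histogram equalization are illustrative choices;
    the abstract does not state which color space or method is used.
    """
    if not pixels:
        return []
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    # Quantize the V channel into histogram bins.
    bins = [min(levels - 1, int(v * (levels - 1) + 0.5)) for _, _, v in hsv]
    hist = [0] * levels
    for b in bins:
        hist[b] += 1
    # Cumulative distribution of brightness values.
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:
        return list(pixels)  # a single brightness level: nothing to stretch
    out = []
    for (h, s, _), b in zip(hsv, bins):
        v = (cdf[b] - cdf_min) / (n - cdf_min)
        r, g, bl = colorsys.hsv_to_rgb(h, s, v)
        out.append((round(r * 255), round(g * 255), round(bl * 255)))
    return out


# Four greens that differ only slightly in brightness.
leaves = [(40, 100, 40), (42, 104, 42), (44, 108, 44), (46, 112, 46)]
enhanced = equalize_value_channel(leaves)
```

In this toy input the four greens span only 12 brightness levels before enhancement and the full range afterwards, which is the kind of tone separation the abstract relies on to tell young leaves from mature ones. A real pipeline would operate on whole images and would likely use an image library rather than per-pixel Python loops.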