This paper introduces a new perspective on multi-class ensemble classification that treats training an ensemble as a state estimation problem. Under this perspective, the final ensemble classifier model is a static state, which can be estimated using a Kalman filter that combines the noisy estimates made by individual classifier models. A new algorithm based on this perspective, the Kalman Filter-based Heuristic Ensemble (KFHE), is also presented to demonstrate its practical applicability. Experiments on 30 datasets compare KFHE with state-of-the-art multi-class ensemble classification algorithms and show the potential and effectiveness of both the perspective and the algorithm. Existing ensemble approaches trade off classification accuracy against robustness to class label noise, but KFHE is shown to be significantly better than, or at least as good as, the state-of-the-art algorithms on datasets both with and without class label noise.
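The idea of estimating a static state from noisy estimates can be illustrated with a scalar Kalman filter. The sketch below is not the KFHE algorithm itself; it only shows the generic measurement update for a state that does not change over time. The function names, the scalar state, and the fixed measurement variance are assumptions made for illustration.

```python
def kalman_static_update(estimate, variance, measurement, measurement_variance):
    """One Kalman measurement update for a static (non-evolving) scalar state."""
    # The gain weighs the new measurement against the current estimate.
    gain = variance / (variance + measurement_variance)
    new_estimate = estimate + gain * (measurement - estimate)
    # Uncertainty shrinks after every measurement is absorbed.
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance, gain


def fuse(measurements, measurement_variance, initial_estimate=0.0, initial_variance=1.0):
    """Combine a sequence of noisy measurements of the same static state."""
    estimate, variance = initial_estimate, initial_variance
    for z in measurements:
        estimate, variance, _ = kalman_static_update(
            estimate, variance, z, measurement_variance
        )
    return estimate, variance
```

Because the state is static, there is no process model to propagate between steps, so each iteration reduces to the measurement update alone. A measurement with low variance pulls the estimate strongly, while a noisy one is mostly ignored.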
Multi-label classification allows a datapoint to be labelled with more than one class at the same time. A common but trivial approach is to train an individual binary classifier per label, but performance can be improved by considering associations between the labels. As with any machine learning algorithm, hyperparameter tuning is important for training a good multi-label classifier model. Selecting the best hyperparameter settings for an algorithm is an optimisation problem. Very limited work has been done on automatic hyperparameter tuning and AutoML in the multi-label domain. This paper attempts to fill this gap by proposing a neural network algorithm, CascadeML, which trains multi-label neural networks based on cascade neural networks. The method requires minimal or no hyperparameter tuning and also considers pairwise label associations. The cascade algorithm grows the network architecture incrementally in a two-phase process as it learns the weights using an adaptive first-order gradient algorithm, thereby removing the need to preselect the number of hidden layers, the number of nodes, and the learning rate. The method was tested on 10 multi-label datasets and compared with other multi-label classification algorithms. Results show that CascadeML performs very well without hyperparameter tuning.
Clustering algorithms have regained momentum with the recent popularity of data mining and knowledge discovery approaches. To obtain good clusterings in a reasonable amount of time, various meta-heuristic approaches and their hybridizations, sometimes with the K-Means technique, have been employed. A Kalman filtering-based heuristic approach called the Heuristic Kalman Algorithm (HKA) was proposed a few years ago, which may be used for optimizing an objective function in data/feature space. In this paper, HKA is first employed in partitional data clustering. An improved approach named HKA-K is then proposed, which combines the global exploration of HKA with the fast convergence of the K-Means method. HKA-K was implemented and tested on several datasets from the UCI machine learning repository, and its results were compared with those of other hybrid meta-heuristic clustering approaches. It is shown that HKA-K is at least as good as, and often better than, the other compared algorithms.
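The K-Means half of such a hybrid can be sketched as a local refinement applied to centroids proposed by a global search. The code below is a minimal illustration, not the HKA-K method: it assumes the clustering objective is the within-cluster sum of squared errors (the abstract does not state the objective), and all function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of its members; keep empty clusters in place."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            new_centroids.append(old)
            continue
        new_centroids.append(
            tuple(sum(p[d] for p in members) / len(members) for d in range(len(old)))
        )
    return new_centroids


def sse(points, labels, centroids):
    """Assumed objective: within-cluster sum of squared errors."""
    return sum(squared_distance(p, centroids[label]) for p, label in zip(points, labels))


def kmeans_refine(points, centroids, max_iter=100):
    """Refine candidate centroids with Lloyd iterations until they stop moving."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)
```

In a hybrid scheme, a global optimizer would supply the starting `centroids` and use `sse` to score candidates, while `kmeans_refine` converges quickly to the nearest local optimum.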