Pneumonia is a respiratory infection caused by bacteria or viruses; it affects many individuals, especially in developing and underdeveloped nations, where high levels of pollution, unhygienic living conditions, and overcrowding are relatively common, together with inadequate medical infrastructure. Pneumonia can cause pleural effusion, a condition in which fluid accumulates in the pleural space around the lung, causing respiratory difficulty. Early diagnosis of pneumonia is crucial to ensure curative treatment and increase survival rates. Chest X-ray imaging is the most frequently used method for diagnosing pneumonia. However, the examination of chest X-rays is a challenging task and is prone to subjective variability. In this study, we developed a computer-aided diagnosis system for automatic pneumonia detection using chest X-ray images. We employed deep transfer learning to handle the scarcity of available data and designed an ensemble of three convolutional neural network models: GoogLeNet, ResNet-18, and DenseNet-121. A weighted average ensemble technique was adopted, wherein the weights assigned to the base learners were determined using a novel approach. The weight vector is formed by fusing the scores of four standard evaluation metrics: precision, recall, F1-score, and the area under the curve. In earlier studies in the literature, the weights were frequently set experimentally, a method that is prone to error. The proposed approach was evaluated on two publicly available pneumonia X-ray datasets, provided by Kermany et al. and the Radiological Society of North America (RSNA), respectively, using a five-fold cross-validation scheme. The proposed method achieved accuracy rates of 98.81% and 86.85% and sensitivity rates of 98.80% and 87.02% on the Kermany and RSNA datasets, respectively. The results were superior to those of state-of-the-art methods, and our method performed better than widely used ensemble techniques. Statistical analyses on the datasets using McNemar’s and ANOVA tests showed the robustness of the approach. The code for the proposed work is available at https://github.com/Rohit-Kundu/Ensemble-Pneumonia-Detection.
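The weighted average ensemble described above can be illustrated with a minimal sketch. The abstract does not state how the four metric scores are fused, so the plain sum-and-normalize rule used here is an assumption, not the authors' formula; the function names and the example numbers are likewise hypothetical.

```python
def fuse_metric_scores(metrics_per_model):
    """Turn per-model [precision, recall, f1, auc] scores into ensemble weights.

    ASSUMPTION: the scores are fused by summing them and normalizing so the
    weights add up to 1. The abstract does not specify the fusion rule.
    """
    raw = [sum(scores) for scores in metrics_per_model]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_average(probabilities, weights):
    """Combine per-model class probabilities for one sample."""
    n_classes = len(probabilities[0])
    return [
        sum(w * p[c] for w, p in zip(weights, probabilities))
        for c in range(n_classes)
    ]


def predict(probabilities, weights):
    """Return the index of the class with the highest fused probability."""
    fused = weighted_average(probabilities, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical validation scores for three base learners.
metrics = [
    [0.9, 0.9, 0.9, 0.9],
    [0.8, 0.8, 0.8, 0.8],
    [0.7, 0.7, 0.7, 0.7],
]
weights = fuse_metric_scores(metrics)

# Hypothetical softmax outputs (normal, pneumonia) of the three learners.
sample_probs = [[0.2, 0.8], [0.6, 0.4], [0.55, 0.45]]
label = predict(sample_probs, weights)
```

In this example two of the three learners lean towards class 0, but the learner with the best scores is confident in class 1 and carries the largest weight, so the fused prediction is class 1. An unweighted majority vote would have chosen class 0.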
Microarray datasets play a crucial role in cancer detection. However, the high dimensionality of these datasets makes classification challenging because many features are irrelevant or redundant. Feature selection is therefore indispensable in this field, as it removes the unneeded features from the system. Because selecting the optimal subset of features is an NP-hard problem, meta-heuristic search techniques help to cope with it. In this paper, we propose a 2-stage model for feature selection in microarray datasets. The rankings of the genes produced by different filter methods are quite diverse, and the effectiveness of each ranking is dataset dependent. First, we develop an ensemble of filter methods by considering the union and intersection of the top-n features of ReliefF, chi-square, and symmetrical uncertainty. This ensemble allows us to combine the information of the three rankings in a single subset. In the next stage, we apply a genetic algorithm (GA) to the union and to the intersection to obtain fine-tuned results; the union performs better than the intersection. Our model has been shown to be classifier independent through the use of three classifiers: multi-layer perceptron (MLP), support vector machine (SVM), and K-nearest neighbor (K-NN). We have tested our model on five cancer datasets: colon, lung, leukemia, SRBCT, and prostate. Experimental results illustrate the superiority of our model in comparison to state-of-the-art methods.
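The first stage, the union and intersection of the top-n features from three filter rankings, can be sketched as follows. The score lists are made-up numbers standing in for ReliefF, chi-square, and symmetrical uncertainty outputs, and the function names are hypothetical; computing the actual filter scores is outside the scope of this sketch.

```python
def top_n(scores, n):
    """Return the indices of the n highest-scoring features as a set."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:n])


def ensemble_top_n(score_lists, n):
    """Return (union, intersection) of the top-n features of each filter."""
    tops = [top_n(scores, n) for scores in score_lists]
    union = set().union(*tops)
    intersection = set.intersection(*tops)
    return union, intersection


# Hypothetical per-feature scores from three filter methods (5 features).
relieff_scores = [0.9, 0.1, 0.8, 0.3, 0.2]
chi_square_scores = [0.7, 0.6, 0.1, 0.9, 0.2]
sym_uncertainty_scores = [0.8, 0.2, 0.3, 0.9, 0.1]

union, intersection = ensemble_top_n(
    [relieff_scores, chi_square_scores, sym_uncertainty_scores], n=2
)
```

Here the union keeps every feature that any filter ranks in its top two, while the intersection keeps only the feature all three agree on. Either set would then serve as the reduced search space for the second-stage genetic algorithm.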
Feature selection (FS), an important pre-processing step in the fields of machine learning and data mining, has an immense impact on the outcome of the corresponding learning models. It aims to remove irrelevant and redundant features from a feature vector, thereby enhancing the performance of the overall prediction or classification model. Over the years, meta-heuristic optimization techniques have been applied to FS, as these are able to overcome the limitations of traditional optimization approaches. In this work, we introduce a binary variant of the recently proposed Sailfish Optimizer (SFO), named the Binary Sailfish (BSF) optimizer, to solve FS problems. A sigmoid transfer function is used to map the continuous search space of SFO to a binary one. To improve the exploitation ability of the BSF optimizer, we combine it with another recently proposed meta-heuristic algorithm, adaptive β-hill climbing (AβHC), yielding the hybrid AβBSF algorithm. The proposed BSF and AβBSF algorithms are applied to 18 standard UCI datasets and compared with 10 state-of-the-art meta-heuristic FS methods. The results demonstrate the superiority of both the BSF and AβBSF algorithms in solving FS problems. The source code of this work is available at https://github.com/Rangerix/MetaheuristicOptimization. INDEX TERMS Binary sailfish optimizer, feature selection, adaptive β-hill climbing, hybrid optimization, UCI dataset.
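A sigmoid transfer function of the kind mentioned above can be sketched as follows. The stochastic thresholding rule (select a feature when a uniform random number falls below the sigmoid value) is a common convention in binary meta-heuristics and is assumed here; the abstract does not state the exact rule, and the function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function S(x) = 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binarize(position, rng):
    """Map a continuous position vector to a 0/1 feature mask.

    ASSUMPTION: bit j is set to 1 when a uniform random draw is below
    sigmoid(position[j]); 1 means the feature is selected.
    """
    return [1 if rng.random() < sigmoid(x) else 0 for x in position]


rng = random.Random(0)
# Hypothetical continuous position of one search agent over 4 features.
mask = binarize([1000.0, -1000.0, 0.3, -0.7], rng)
```

Large positive coordinates are almost always mapped to 1 and large negative ones to 0, while coordinates near zero behave like a fair coin, which preserves some exploration in the binary space.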
Feature selection is a process that reduces the dimension of a dataset by removing redundant features, so that machine learning or data mining algorithms use the optimal subset of features. This helps to minimize the time required to train a learning algorithm and lessens the storage requirement by ignoring the less-informative features. Feature selection can be considered a combinatorial optimization problem. In this paper, the authors present a new feature selection algorithm called Mayfly-Harmony Search (MA-HS), based on two meta-heuristics: the Mayfly Algorithm and Harmony Search. To the best of the authors' knowledge, the Mayfly Algorithm has not previously been used for feature selection problems. An S-shaped transfer function is incorporated to convert it into a binary version of the Mayfly Algorithm. When candidate solutions obtained by the Mayfly Algorithm from various regions of the search space are placed in the harmony memory and processed by Harmony Search, a better solution can be obtained. This is the primary reason for proposing a hybrid of the Mayfly Algorithm and Harmony Search. Combining Harmony Search with the Mayfly Algorithm thus leads to increased exploitation of the search space and an overall improvement in the performance of the MA-HS algorithm. The proposed algorithm has been applied to 18 UCI datasets and compared with 12 other state-of-the-art meta-heuristic FS methods. Experiments have also been performed on three high-dimensional microarray datasets. The results obtained support the superior performance of the algorithm compared to the other methods. The source code of the proposed algorithm is available at https://github.com/trin07/MA-HS.
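The Harmony Search stage, in which candidate solutions stored in the harmony memory are recombined, can be sketched for binary feature masks as below. This is a generic textbook-style improvisation step, not the authors' MA-HS implementation; the parameter values, the bit-flip form of pitch adjustment, and the toy fitness function are all assumptions.

```python
import random


def improvise(memory, hmcr, par, rng):
    """Build one new binary harmony from the harmony memory.

    hmcr: probability of copying a bit from a random memory member.
    par:  probability of flipping a copied bit (binary pitch adjustment).
    """
    new = []
    for j in range(len(memory[0])):
        if rng.random() < hmcr:
            bit = rng.choice(memory)[j]
            if rng.random() < par:
                bit = 1 - bit
        else:
            bit = rng.randint(0, 1)  # random consideration
        new.append(bit)
    return new


def harmony_search_step(memory, fitness, hmcr=0.9, par=0.3, rng=None):
    """Improvise a harmony and replace the worst member if it is better."""
    rng = rng or random.Random()
    candidate = improvise(memory, hmcr, par, rng)
    worst = min(range(len(memory)), key=lambda i: fitness(memory[i]))
    if fitness(candidate) > fitness(memory[worst]):
        memory[worst] = candidate
    return memory


# Hypothetical memory seeded with masks from an upstream optimizer.
rng = random.Random(1)
memory = [[0, 0, 0, 0], [1, 0, 0, 0], [0, 1, 1, 0]]
worst_before = min(sum(m) for m in memory)
for _ in range(50):
    harmony_search_step(memory, fitness=sum, rng=rng)  # toy fitness: count of 1s
```

Because a candidate only replaces the worst member when it scores higher, the worst fitness in memory never decreases. In a real feature selection setting the fitness would typically balance classification accuracy against the number of selected features.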