Growth in storage capacity and advances in data collection have led to information overload, and databases are growing in both dimensions: not only in rows but also in columns. Data reduction (DR) plays a vital role as a data preprocessing technique in knowledge discovery from large collections of data. Feature selection (FS) is one of the best-known data reduction techniques; it reduces the number of attributes in the original data without affecting the main information content. Depending on the training data used for different knowledge discovery applications, FS techniques fall into supervised and unsupervised categories. This paper presents an extensive survey of supervised FS techniques, describing the different search approaches, methods, and application areas, together with an outline of a comparative study.
The COVID-19 pandemic is causing a global health crisis, and public spaces need to be safeguarded from its adverse effects. Wearing a face mask has become an adequate protective measure that many governments have adopted. Manually monitoring in real time whether large numbers of people are wearing face masks is becoming a difficult task. This paper applies three heterogeneous deep transfer learning models, viz., ResNet50, Inception-v3, and VGG-16, to build an ensemble classification model for detecting whether a person is wearing a mask. The ensemble classification model is based on the weighted average technique. The proposed framework has two phases. The first, offline phase prepares a classification model through training and testing steps to detect and locate face masks. In the second, online phase, the model is deployed to detect faces in real time from live video captured by a web camera. The prepared model is compared with several state-of-the-art models. The proposed model achieves the highest classification accuracy of 99.97%, with a precision of 0.997, a recall of 0.997, an F1-score of 0.997, and a kappa coefficient of 0.994. The experimental results show the superiority of the model over the compared state-of-the-art methods.
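The weighted average technique mentioned in this abstract can be illustrated with a minimal sketch. It assumes each base model outputs a probability per class; the probabilities, the weights, the label names, and the function names `weighted_average_ensemble` and `predict_label` are hypothetical and are not taken from the paper, which does not state how its weights are chosen.

```python
from typing import List, Sequence


def weighted_average_ensemble(
    probabilities: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Combine per-model class probabilities with a weighted average.

    probabilities[m][c] is the probability model m assigns to class c.
    Weights are normalized, so they do not need to sum to 1.
    """
    if not probabilities or len(probabilities) != len(weights):
        raise ValueError("need one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(probabilities[0])
    combined = [0.0] * n_classes
    for probs, weight in zip(probabilities, weights):
        if len(probs) != n_classes:
            raise ValueError("all models must score the same classes")
        for c, p in enumerate(probs):
            combined[c] += (weight / total) * p
    return combined


def predict_label(combined: Sequence[float], labels: Sequence[str]) -> str:
    """Return the label whose combined probability is largest."""
    best = max(range(len(combined)), key=lambda c: combined[c])
    return labels[best]


# Hypothetical outputs of three base models for one face image,
# ordered as [P(mask), P(no_mask)]. The weights are illustrative only.
model_outputs = [[0.9, 0.1], [0.4, 0.6], [0.7, 0.3]]
model_weights = [0.5, 0.2, 0.3]
combined = weighted_average_ensemble(model_outputs, model_weights)
print(predict_label(combined, ["mask", "no_mask"]))
```

In practice the weights would typically be derived from each model's validation performance; that choice is an assumption here, not something the abstract specifies.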
Selecting useful information from a large data collection is an important and challenging problem. Feature selection refers to the problem of selecting relevant features from a given dataset such that the selected subset retains the predictive power of the original feature set. Rough set theory (RST) and its extensions are the most successful mathematical tools for feature selection from a given dataset. This paper starts with an outline of the fundamental concepts behind rough set and fuzzy-rough set based feature grouping techniques that relate to supervised feature selection. The supervised Quickreduct (QR) and fuzzy-rough feature grouping Quickreduct (FQR) algorithms are highlighted. An enhanced version of the FQR method is then proposed, based on the rough set dependency criterion combined with a feature significance measure, which selects a minimal subset of features. The termination condition of the base method is also modified. Experimental studies of the algorithms are carried out on five public-domain benchmark datasets available in the UCI Machine Learning Repository. The JRip and J48 classifiers are used to measure classification accuracy. The performance of the proposed method is found to be satisfactory in comparison with the other methods.
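As background for the rough set dependency criterion, the following is a minimal sketch of the classical crisp Quickreduct greedy search. It is not the enhanced FQR method proposed in the paper, whose details the abstract does not give. The dependency degree is taken as the fraction of objects in the positive region; the toy decision table and the function names `dependency` and `quickreduct` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence

Row = Dict[str, Hashable]


def dependency(rows: Sequence[Row], attrs: Sequence[str], decision: str) -> float:
    """Rough set dependency degree of `decision` on `attrs`.

    Objects are grouped into equivalence classes by their values on
    `attrs`; a class belongs to the positive region when all of its
    objects share the same decision value.
    """
    if not rows:
        return 0.0
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[a] for a in attrs)].append(row[decision])
    positive = sum(len(v) for v in classes.values() if len(set(v)) == 1)
    return positive / len(rows)


def quickreduct(rows: Sequence[Row], conditional: Sequence[str], decision: str) -> List[str]:
    """Greedy forward search for a subset with full-set dependency."""
    target = dependency(rows, conditional, decision)
    reduct: List[str] = []
    current = dependency(rows, reduct, decision)
    while current < target:
        candidates = [a for a in conditional if a not in reduct]
        # Add the attribute that gives the largest dependency degree;
        # ties are resolved in favor of the earliest candidate.
        best = max(candidates, key=lambda a: dependency(rows, reduct + [a], decision))
        reduct.append(best)
        current = dependency(rows, reduct, decision)
    return reduct


# Hypothetical decision table: attribute "b" alone determines "d".
table = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "no"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]
print(quickreduct(table, ["a", "b", "c"], "d"))
```

The loop always terminates because, in the worst case, the reduct grows to the full conditional attribute set, whose dependency equals the target by definition. The greedy search is not guaranteed to find a globally minimal subset.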
In the last few years, ensemble learning has received growing interest, primarily for the task of classification. It is based on the postulate that combining the outputs of multiple experts is better than relying on the output of any individual expert. Ensemble feature selection may improve the performance of learning algorithms and can yield more stable and robust results. However, during feature aggregation and selection, the selected feature subset may contain high levels of inter-feature redundancy. To address this issue, a novel algorithm based on feature rank aggregation and a graph-theoretic technique for ensemble feature selection (R-GEFS), with the fusion of Pearson and Spearman correlation metrics, is proposed. The method aggregates the preference profiles of five feature rankers, which serve as the base feature selectors. Similar features are then grouped into clusters using a graph-theoretic approach. From each cluster, the most representative feature, the one most strongly correlated with the target decision classes, is drawn. The efficiency and effectiveness of the R-GEFS algorithm are evaluated through an empirical study. Extensive experiments on 15 diverse benchmark datasets are carried out to compare R-GEFS with seven state-of-the-art feature selection models with respect to four popular classifiers, namely decision tree, k-nearest neighbor, random forest, and support vector machine. The proposed method proves effective: it selects smaller feature subsets with lower computational complexity and helps increase classification accuracy.
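The three steps named in this abstract (rank aggregation, graph-based grouping, and choosing one representative per cluster) can be sketched roughly as follows. This is not the R-GEFS algorithm itself: the abstract does not specify the aggregation rule, the graph construction, or how Pearson and Spearman are fused. The sketch assumes mean-rank aggregation, a graph whose edges link features with absolute Pearson correlation at or above a threshold, and connected components as clusters. All function names, the threshold, and the data are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def aggregate_ranks(rankings: Sequence[Sequence[str]]) -> List[str]:
    """Mean-rank aggregation; each ranking lists all features, best first."""
    features = list(rankings[0])
    mean_rank = {f: sum(r.index(f) for r in rankings) / len(rankings) for f in features}
    return sorted(features, key=lambda f: mean_rank[f])


def cluster_features(columns: Dict[str, Sequence[float]], threshold: float) -> List[List[str]]:
    """Connected components of the graph linking features with |r| >= threshold."""
    names = list(columns)
    adjacency = {f: set() for f in names}
    for i, f in enumerate(names):
        for g in names[i + 1:]:
            if abs(pearson(columns[f], columns[g])) >= threshold:
                adjacency[f].add(g)
                adjacency[g].add(f)
    clusters, seen = [], set()
    for start in names:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node] - seen:
                seen.add(neighbor)
                stack.append(neighbor)
        clusters.append(component)
    return clusters


def select_features(columns, target, rankings, top_k, threshold=0.9) -> List[str]:
    """Keep the top_k aggregated features, cluster them, pick one per cluster."""
    kept = aggregate_ranks(rankings)[:top_k]
    subset = {f: columns[f] for f in kept}
    return [
        max(cluster, key=lambda f: abs(pearson(subset[f], target)))
        for cluster in cluster_features(subset, threshold)
    ]


# Hypothetical data: f1 and f2 are nearly redundant, f3 is not.
columns = {"f1": [1, 2, 3, 4, 5], "f2": [1, 2, 3, 4, 6], "f3": [5, 1, 4, 2, 3]}
target = [2, 4, 6, 8, 12]
rankings = [["f1", "f2", "f3"], ["f2", "f1", "f3"], ["f2", "f3", "f1"]]
print(select_features(columns, target, rankings, top_k=3))
```

Taking connected components is the simplest graph-theoretic grouping; a community-detection or minimum-spanning-tree method could be substituted without changing the overall shape of the pipeline.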