Akhtar Khalil scite author profile

Feature selection is critical in reducing the size of data and improving classifier accuracy by selecting an optimal subset of the available features. Traditionally, each feature is given a score against a particular category (for example, using Mutual Information), and the task of feature selection comes down to choosing the top k ranked features with the best average score across all categories. However, this approach has two major drawbacks. Firstly, the maximum or average score of a feature with a class might not necessarily determine its discriminating strength among samples of other classes. Secondly, most feature selection methods only use the scores to select the discriminating features from the corpus, without taking into account the redundancy of the information provided by the selected features. In this paper, we propose a new feature ranking measure called the Discriminative Mutual Information (DMI) score. This score helps to select features that distinguish samples of one category from those of all other categories. Moreover, a Non-Redundant Feature Selection (NRFS) heuristic is also proposed that explicitly takes feature redundancy into account when selecting the feature set. The performance of our approach is investigated and compared with other feature selection techniques on datasets derived from high-dimensional text corpora using multiple classification algorithms. The results show that the proposed method leads to a better classification micro-F1 score than other state-of-the-art methods. In particular, the proposed method shows great improvement when the number of selected features is small, as well as higher overall robustness to label noise.
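
The traditional pipeline that this abstract describes (score each feature with Mutual Information, rank by average score, keep the top k) can be sketched as follows, together with a simple redundancy filter. This is only an illustration under stated assumptions: it uses plain Mutual Information rather than the DMI score, and the greedy filter is a generic heuristic, not the NRFS heuristic, whose details the abstract does not give. The function names, the `max_redundancy` threshold, and the toy data are all hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

def rank_features(columns, labels):
    """Rank feature indices by average one-vs-rest MI over classes, best first."""
    classes = sorted(set(labels))
    scores = []
    for j, col in enumerate(columns):
        per_class = [
            mutual_information(col, [int(y == c) for y in labels])
            for c in classes
        ]
        scores.append((sum(per_class) / len(per_class), j))
    return [j for _, j in sorted(scores, key=lambda t: (-t[0], t[1]))]

def select_non_redundant(columns, labels, k, max_redundancy=0.5):
    """Greedy pick in rank order, skipping features that share too much
    information with an already selected feature (assumed heuristic)."""
    selected = []
    for j in rank_features(columns, labels):
        if all(mutual_information(columns[j], columns[s]) <= max_redundancy
               for s in selected):
            selected.append(j)
        if len(selected) == k:
            break
    return selected

# Toy data: one list per feature, one entry per document.
labels = ["a", "a", "b", "b"]
columns = [
    [1, 1, 0, 0],  # feature 0: perfectly predictive
    [1, 1, 0, 0],  # feature 1: exact copy of feature 0 (redundant)
    [1, 0, 1, 0],  # feature 2: carries no information about the label
]
top_k = rank_features(columns, labels)[:2]
chosen = select_non_redundant(columns, labels, k=2)
print(top_k, chosen)
```

On this toy data, plain top-2 ranking keeps features 0 and 1 even though feature 1 adds nothing new, while the redundancy-aware selection keeps features 0 and 2. The second pick is not useful here, but the example shows where a score-only ranking spends its budget on duplicated information.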

Restoration and content analysis of ancient manuscripts via color space based segmentation

Hanif

Tonazzini

Hussain

et al. 2023

PLoS ONE

Ancient manuscripts are a rich source of history and civilization. Unfortunately, these documents are often affected by different age- and storage-related degradations, which impinge on their readability and information content. In this paper, we propose a document restoration method that removes the unwanted interfering degradation patterns from color ancient manuscripts. We exploit different color spaces to highlight the spectral differences in the various layers of information usually present in these documents. At each image pixel, the spectral representations of all color spaces are stacked to form a feature vector. PCA is applied to the whole data cube to eliminate correlation among the color planes and enhance separation among the patterns. The reduced data cube, along with the pixel spatial information, is used to perform a pixel-based segmentation, where each cluster represents a class of pixels that share similar color properties in the decorrelated color spaces. The interfering, unwanted classes can thus be removed by inpainting their pixels with the background texture. Assuming Gaussian distributions for the various classes, a Gaussian Mixture Model (GMM) is estimated from the data through the Expectation Maximization (EM) algorithm and then used to find an appropriate label for each pixel. In order to preserve the original appearance of the document and reproduce the background texture, the detected degraded pixels are replaced based on Gaussian conditional simulation, according to the surrounding context. Experiments are shown on manuscripts affected by different kinds of degradation, including manuscripts from the publicly available DIBCO 2018 and 2019 datasets. We observe that the use of a few dominant PCA components accelerates the clustering process and provides a more accurate segmentation.
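
The GMM/EM labelling step mentioned in this abstract can be illustrated with a deliberately reduced sketch. It assumes each pixel has already been projected to a single value (for instance its first PCA component), whereas the paper works with multi-dimensional feature vectors plus spatial information; PCA, inpainting, and Gaussian conditional simulation are omitted. All function names, the initialisation scheme, and the sample values are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_gmm_1d(values, k=2, iterations=50):
    """Fit a k-component 1-D Gaussian mixture with EM.
    Returns (weights, means, variances)."""
    n = len(values)
    ordered = sorted(values)
    # Deterministic start: means spread evenly over the sorted data.
    means = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    mean_all = sum(values) / n
    var_all = sum((v - mean_all) ** 2 for v in values) / n or 1.0
    variances = [var_all] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            dens = [w * gaussian_pdf(v, m, s2)
                    for w, m, s2 in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for c in range(k):
            rc = sum(r[c] for r in resp)
            if rc < 1e-12:
                continue  # leave an empty component unchanged
            weights[c] = rc / n
            means[c] = sum(r[c] * v for r, v in zip(resp, values)) / rc
            var = sum(r[c] * (v - means[c]) ** 2 for r, v in zip(resp, values)) / rc
            variances[c] = max(var, 1e-6)  # floor avoids a degenerate component
    return weights, means, variances

def assign_labels(values, weights, means, variances):
    """Label each value with the component of highest weighted density."""
    labels = []
    for v in values:
        dens = [w * gaussian_pdf(v, m, s2)
                for w, m, s2 in zip(weights, means, variances)]
        labels.append(max(range(len(dens)), key=dens.__getitem__))
    return labels

# Hypothetical per-pixel values: four dark "ink" pixels, four bright "background" pixels.
pixels = [0.10, 0.12, 0.11, 0.09, 0.80, 0.82, 0.79, 0.81]
weights, means, variances = fit_gmm_1d(pixels, k=2)
labels = assign_labels(pixels, weights, means, variances)
print(labels)
```

In a restoration setting, the pixels whose label corresponds to an unwanted class would then be the ones passed on to the inpainting step.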

Adaptation of ANN for FPGA implementation and its application for speaker identification

Elmisery

Khalil

Salama

et al.

Saliency based skin detection in complex scenes

Ahmad

Khan

et al. 2013

Background clutter badly affects the performance of skin detection. In highly cluttered images, skin detection becomes more difficult and the algorithm cannot differentiate between skin and non-skin pixels. In this paper, we introduce a saliency algorithm that removes irrelevant information, especially skin-like regions in the background of human images, in order to tackle the background clutter problem and improve the performance of skin detection algorithms on images with complex backgrounds. Extensive experimentation on highly cluttered and complex images shows that the saliency algorithm further enhances the performance of skin detection algorithms, not only in terms of false positive rate but also in terms of true positive rate, true negative rate, false negative rate, accuracy, and precision.
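
The evaluation measures listed in this abstract all follow from the four pixel-level confusion counts. A minimal sketch of how they could be computed from a predicted skin mask and a ground-truth mask is given below; the flat 0/1 mask representation, the function name, and the example masks are assumptions made for illustration and are not taken from the paper.

```python
def skin_detection_metrics(predicted, truth):
    """Pixel-level metrics for binary masks (1 = skin, 0 = non-skin)."""
    tp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 0)
    tn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 0)
    fn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 1)

    def ratio(num, den):
        # Return 0.0 when a rate is undefined (empty denominator).
        return num / den if den else 0.0

    return {
        "true_positive_rate": ratio(tp, tp + fn),
        "false_positive_rate": ratio(fp, fp + tn),
        "true_negative_rate": ratio(tn, tn + fp),
        "false_negative_rate": ratio(fn, fn + tp),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),
    }

# Hypothetical flattened masks for a six-pixel image.
predicted_mask = [1, 1, 0, 0, 1, 0]
truth_mask = [1, 0, 0, 0, 1, 1]
metrics = skin_detection_metrics(predicted_mask, truth_mask)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Removing skin-like background regions before detection would mainly be expected to lower the false positive count, which is why the false positive rate and precision are the measures most directly tied to background clutter.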

Akhtar Khalil

A FPGA-based HMM for a discrete Arabic speech recognition system

A Fast Non-Redundant Feature Selection Technique for Text Data
