Pradipta Maji scite author profile

Abstract: In most pattern recognition algorithms, amino acids cannot be used directly as inputs because they are nonnumerical variables; they therefore need to be encoded before input. The bio-basis function addresses this by mapping a nonnumerical sequence space to a numerical feature space, and it is designed using an amino acid mutation matrix. An important issue for the bio-basis function is how to select the minimum set of bio-bases with maximum information. In this paper, we describe an algorithm, termed the rough-fuzzy c-medoids (RFCMdd) algorithm, to select the most informative bio-bases. It judiciously integrates the principles of rough sets, fuzzy sets, the c-medoids algorithm, and the amino acid mutation matrix. While the membership function of fuzzy sets enables efficient handling of overlapping partitions, the concept of lower and upper bounds of rough sets deals with uncertainty, vagueness, and incompleteness in class definition. The concept of a crisp lower bound and a fuzzy boundary of a class, introduced in RFCMdd, enables efficient selection of the minimum set of the most informative bio-bases. Some new indices are introduced for quantitatively evaluating the quality of the selected bio-bases. The effectiveness of the proposed algorithm, along with a comparison with other algorithms, is demonstrated on different types of protein data sets.
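The encoding and the crisp-lower-bound / fuzzy-boundary idea described in this abstract can be illustrated with a small sketch. It is not the RFCMdd algorithm itself. It assumes one commonly cited form of the bio-basis function, exp(gamma * (h(x, b) - h(b, b)) / h(b, b)), a toy two-valued score in place of a real mutation matrix, and a hypothetical threshold `delta` for deciding whether a sequence belongs crisply to one cluster. All function names are illustrative.

```python
import math

def residue_score(a, b):
    # Toy stand-in for a mutation-matrix entry (hypothetical values);
    # a real amino acid mutation matrix would be looked up here instead.
    return 2 if a == b else -1

def homology(x, y):
    # Sum of pairwise residue scores; assumes equal-length sequences.
    return sum(residue_score(a, b) for a, b in zip(x, y))

def bio_basis(x, basis, gamma=1.0):
    # Assumed form of the bio-basis function: equals 1.0 when x == basis
    # and decays as x becomes less similar to the basis string.
    h_bb = homology(basis, basis)
    return math.exp(gamma * (homology(x, basis) - h_bb) / h_bb)

def encode(x, bases, gamma=1.0):
    # Map a nonnumerical sequence to a numerical feature vector.
    return [bio_basis(x, b, gamma) for b in bases]

def rough_fuzzy_partition(seqs, medoids, delta=0.2):
    # Simplified assignment step (an assumption, not the paper's procedure):
    # normalised similarities act as fuzzy memberships; a sequence whose two
    # highest memberships differ by more than delta goes to the crisp lower
    # bound of its best cluster, otherwise it joins the fuzzy boundary of
    # both clusters together with its membership value.
    lower = {m: [] for m in medoids}
    boundary = {m: [] for m in medoids}
    for s in seqs:
        sims = encode(s, medoids)
        total = sum(sims)
        mu = [v / total for v in sims]
        order = sorted(range(len(medoids)), key=lambda i: mu[i], reverse=True)
        best, second = order[0], order[1]
        if mu[best] - mu[second] > delta:
            lower[medoids[best]].append(s)
        else:
            boundary[medoids[best]].append((s, mu[best]))
            boundary[medoids[second]].append((s, mu[second]))
    return lower, boundary

sequences = ["AAAA", "AAAC", "CCCC", "AACC"]
lower, boundary = rough_fuzzy_partition(sequences, ["AAAA", "CCCC"])
```

In this toy run, "AACC" is equally similar to both medoids, so it lands in the boundary of both clusters, while the other three sequences fall in a lower bound. A full c-medoids loop would then re-select each medoid from its cluster and repeat until the medoids stop changing.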

Fuzzy–Rough Supervised Attribute Clustering Algorithm and Classification of Microarray Data

Maji

2011

IEEE Trans. Syst., Man, Cybern. B

One of the major tasks with gene expression data is to find groups of coregulated genes whose collective expression is strongly associated with sample categories. In this regard, a new clustering algorithm, termed fuzzy-rough supervised attribute clustering (FRSAC), is proposed to find such groups of genes. The algorithm is based on the theory of fuzzy-rough sets, which directly incorporates the information of sample categories into the gene clustering process. A new quantitative measure, also based on fuzzy-rough sets, uses the information of sample categories to measure the similarity among genes. Genes are grouped according to this measure, whereby redundancy among them is removed, and the clusters are refined incrementally based on sample categories. The effectiveness of the proposed FRSAC algorithm, along with a comparison with existing supervised and unsupervised gene selection and clustering algorithms, is demonstrated on six cancer and two arthritis data sets based on the class separability index and the predictive accuracy of the naive Bayes classifier, the K-nearest neighbor rule, and the support vector machine.

A Rough Hypercuboid Approach for Feature Selection in Approximation Spaces

Maji

2014

IEEE Trans. Knowl. Data Eng.

Fuzzy–Rough Sets for Information Measures and Selection of Relevant Genes From Microarray Data

Maji

Pal

2010

IEEE Trans. Syst., Man, Cybern. B

Several information measures, such as entropy, mutual information, and f-information, have been shown to be successful for selecting a set of relevant and nonredundant genes from a high-dimensional microarray data set. However, for continuous gene expression values, it is very difficult to find the true density functions and to perform the integrations required to compute different information measures. In this regard, the concept of the fuzzy equivalence partition matrix is presented to approximate the true marginal and joint distributions of continuous gene expression values. The fuzzy equivalence partition matrix is based on the theory of fuzzy-rough sets, where each row of the matrix represents a fuzzy equivalence partition that can be derived automatically from the given expression values. The performance of the proposed approach is compared with that of existing approaches using the class separability index and the predictive accuracy of the support vector machine. An important finding is that the proposed approach is effective for selecting relevant and nonredundant continuous-valued genes from microarray data.
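The idea of approximating marginal and joint distributions from a fuzzy partition matrix, rather than from estimated densities, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's construction: the triangular membership functions, the number of fuzzy sets, and the product-based joint estimate are all choices made here for simplicity, and every function name is hypothetical.

```python
import math

def fuzzy_partition(values, n_sets=3):
    # Build an n_sets x n membership matrix using evenly spaced triangular
    # fuzzy sets (an assumed membership shape). Each row is one fuzzy set;
    # each column (one sample) sums to 1. Requires n_sets >= 2.
    lo, hi = min(values), max(values)
    step = (hi - lo) / (n_sets - 1) or 1.0  # avoid zero width for constant input
    centers = [lo + i * step for i in range(n_sets)]
    return [[max(0.0, 1.0 - abs(v - c) / step) for v in values] for c in centers]

def marginal(matrix):
    # Approximate probability of each fuzzy set: its mean membership.
    n = len(matrix[0])
    return [sum(row) / n for row in matrix]

def joint(a, b):
    # Approximate joint probability via the product of memberships
    # (an assumption chosen so the entries sum to 1).
    n = len(a[0])
    return [[sum(x * y for x, y in zip(ra, rb)) / n for rb in b] for ra in a]

def entropy(p):
    return -sum(q * math.log2(q) for q in p if q > 0)

def mutual_information(a, b):
    pa, pb, pab = marginal(a), marginal(b), joint(a, b)
    return sum(
        pab[i][j] * math.log2(pab[i][j] / (pa[i] * pb[j]))
        for i in range(len(pa))
        for j in range(len(pb))
        if pab[i][j] > 0
    )

gene = fuzzy_partition([0.0, 0.5, 1.0, 0.25])
flat = fuzzy_partition([1.0, 1.0, 1.0, 1.0])  # a gene with no variation
mi_self = mutual_information(gene, gene)
mi_flat = mutual_information(gene, flat)
```

A gene with constant expression carries no information about any other variable, so `mi_flat` comes out as zero, while `mi_self` is positive. A relevance-and-redundancy gene ranking could be built on top of such estimates.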

Generalized Multiple Attractor Cellular Automata (GMACA) Model for Associative Memory

Ganguly

Maji

Sikdar

et al. 2002

Int. J. Patt. Recogn. Artif. Intell.

This paper reports an efficient technique for evolving Cellular Automata (CA) as an associative memory model. The evolved CA, termed GMACA (Generalized Multiple Attractor Cellular Automata), acts as a powerful pattern recognizer. Detailed analysis of GMACA rules establishes that the rule subspace of the pattern-recognizing CA lies at the edge of chaos, a regime believed to be capable of executing complex computation.
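How a cellular automaton can behave as an associative memory is easiest to see by running a pattern until it settles into an attractor. The sketch below is only an illustration, not the evolved GMACA: it assumes a one-dimensional, two-state, three-neighborhood CA with a null boundary and one Wolfram rule number per cell, and the function names are hypothetical.

```python
def step(state, rules):
    # Apply one synchronous update. rules[i] is the Wolfram rule number
    # (0-255) for cell i; cells beyond either end are treated as 0.
    n = len(state)
    nxt = []
    for i in range(n):
        left = state[i - 1] if i > 0 else 0
        right = state[i + 1] if i < n - 1 else 0
        idx = (left << 2) | (state[i] << 1) | right
        nxt.append((rules[i] >> idx) & 1)
    return tuple(nxt)

def find_attractor(state, rules):
    # Iterate until a state repeats; the state space is finite, so this
    # always terminates. Returns the cycle of states that was reached.
    seen = {}
    trajectory = []
    state = tuple(state)
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state, rules)
    return trajectory[seen[state]:]

pattern = (1, 0, 1, 1)
fixed = find_attractor(pattern, [204] * 4)  # rule 204 copies each cell
erased = find_attractor(pattern, [0] * 4)   # rule 0 maps every cell to 0
```

Under rule 204 every pattern is its own attractor, and under rule 0 every pattern falls into the all-zero state. In an associative memory, the rule vector would instead be chosen so that noisy versions of a stored pattern fall into the attractor basin of that pattern.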

Content-based image retrieval using visually significant point features

Banerjee

Kundu

Maji

2009

Fuzzy Sets and Systems

Rough Set Based Generalized Fuzzy C-Means Algorithm and Quantitative Indices

Rough set based maximum relevance-maximum significance criterion and Gene selection from microarray data

Rough-Fuzzy C-Medoids Algorithm and Selection of Bio-Basis for Amino Acid Sequence Analysis
