Multiobjective Genetic Clustering for Pixel Classification in Remote Sensing Imagery

An important approach to unsupervised landcover classification in remote sensing images is the clustering of pixels in the spectral domain into several fuzzy partitions. In this paper, a multiobjective optimization algorithm is used to tackle the problem of fuzzy partitioning, in which a number of fuzzy cluster validity indexes are optimized simultaneously. The resulting set of near-Pareto-optimal solutions contains a number of nondominated solutions, which the user can compare in order to pick the most promising one according to the problem requirements. Real-coded encoding of the cluster centers is used for this purpose. Results demonstrating the effectiveness of the proposed technique are provided for numeric remote sensing data described in terms of feature vectors. Different landcover regions in remote sensing imagery have also been classified using the proposed technique to establish its efficiency.

Index Terms: Cluster validity measures, fuzzy clustering, genetic algorithm (GA), multiobjective optimization (MOO), Pareto-optimal, pixel classification, remote sensing imagery.
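The idea of a nondominated set can be illustrated with a minimal Python sketch. It is not the paper's genetic algorithm: it only filters a list of candidate clusterings that have already been scored. The two scores per candidate, the assumption that both are minimized, and the names `dominates`, `nondominated`, and `candidate_scores` are all hypothetical and chosen for illustration.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives assumed to be minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def nondominated(scores: List[Tuple[float, ...]]) -> List[int]:
    """Indices of the candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical scores: one tuple of two validity-index values per
# candidate clustering, both treated as "smaller is better".
candidate_scores = [
    (0.30, 0.90),
    (0.45, 0.40),
    (0.50, 0.50),  # dominated by (0.45, 0.40)
    (0.80, 0.20),
    (0.85, 0.95),  # dominated by (0.30, 0.90)
]

pareto_indices = nondominated(candidate_scores)
print(pareto_indices)
```

The surviving candidates trade one index against the other, which is why the final choice is left to the user.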

A Survey of Multiobjective Evolutionary Algorithms for Data Mining: Part I

Mukhopadhyay

Maulik

Bandyopadhyay

et al. 2014

IEEE Trans. Evol. Computat.

342

127

An improved algorithm for clustering gene expression data

2007

The proposed two-stage clustering algorithm is shown to be significantly superior to the average linkage method, the Self-Organizing Map (SOM), and a recently developed weighted Chinese restaurant-based clustering method (CRC), all widely used for clustering gene expression data, on a variety of artificial and publicly available real-life data sets. The biological relevance of the clustering solutions is also analyzed.

Survey of Multiobjective Evolutionary Algorithms for Data Mining: Part II

Mukhopadhyay

Maulik

Bandyopadhyay

et al. 2014

IEEE Trans. Evol. Computat.

161

A Survey of Multiobjective Evolutionary Clustering

2015

Data clustering is a popular unsupervised data mining tool that is used for partitioning a given dataset into homogeneous groups based on some similarity/dissimilarity metric. Traditional clustering algorithms often make prior assumptions about the cluster structure and adopt a corresponding suitable objective function that is optimized either through classical techniques or metaheuristic approaches. These algorithms are known to perform poorly when the cluster assumptions do not hold in the data. Multiobjective clustering, in which multiple objective functions are simultaneously optimized, has emerged as an attractive and robust alternative in such situations. In particular, application of multiobjective evolutionary algorithms for clustering has become popular in the past decade because of their population-based nature. Here, we provide a comprehensive and critical survey of the multitude of multiobjective evolutionary clustering techniques existing in the literature. The techniques are classified according to the encoding strategies adopted, objective functions, evolutionary operators, strategy for maintaining nondominated solutions, and the method of selection of the final solution. The pros and cons of the different approaches are mentioned. Finally, we discuss some real-life applications of multiobjective clustering in the domains of image segmentation, bioinformatics, web mining, and so forth.

A Novel Biclustering Approach to Association Rule Mining for Predicting HIV-1–Human Protein Interactions

2012

Identification of potential viral-host protein interactions is a vital step towards the development of new drugs targeting those interactions. Computational tools are now being used for predicting viral-host interactions. Recently, a database containing records of experimentally validated interactions between a set of HIV-1 proteins and a set of human proteins has been published. The problem of predicting new interactions based on this database is usually posed as a classification problem. However, posing it as a classification problem suffers from the lack of biologically validated negative interactions. It is therefore beneficial to use the existing database for predicting new viral-host interactions without the need for negative samples. Motivated by this, in this article, the HIV-1–human protein interaction database has been analyzed using association rule mining. The main objective is to identify a set of association rules both among the HIV-1 proteins and among the human proteins, and to use these rules for predicting new interactions. In this regard, a novel association rule mining technique based on biclustering has been proposed for discovering frequent closed itemsets, followed by the association rules, from the adjacency matrix of the HIV-1–human interaction network. Novel HIV-1–human interactions have been predicted based on the discovered association rules and tested for biological significance. For validation of the predicted new interactions, gene ontology-based and pathway-based studies have been performed. These studies show that the human proteins which are predicted to interact with a particular viral protein share many common biological activities. Moreover, a literature survey has been used to identify some predicted interactions that are already validated experimentally but not present in the database. Comparison with other prediction methods is also discussed.
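The two standard measures behind association rules, support and confidence, can be sketched in a few lines of Python. This is not the biclustering-based technique proposed in the article: it only evaluates one given rule on a toy interaction table. The protein labels (`H1`, `V1`, and so on), the data, and the function names are hypothetical.

```python
from typing import Dict, Set

# Hypothetical toy data: each human protein (one "transaction") maps to
# the set of viral proteins it interacts with, i.e. one row of a binary
# adjacency matrix.
interactions: Dict[str, Set[str]] = {
    "H1": {"V1", "V2"},
    "H2": {"V1", "V2", "V3"},
    "H3": {"V1"},
    "H4": {"V2", "V3"},
}


def support(itemset: Set[str], transactions: Dict[str, Set[str]]) -> float:
    """Fraction of transactions that contain every item in the itemset."""
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)


def confidence(
    antecedent: Set[str],
    consequent: Set[str],
    transactions: Dict[str, Set[str]],
) -> float:
    """Estimate of P(consequent | antecedent) for antecedent -> consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        raise ValueError("antecedent never occurs in the transactions")
    return support(antecedent | consequent, transactions) / base


# Rule {V1} -> {V2}: human proteins that bind V1 also tend to bind V2.
rule_support = support({"V1", "V2"}, interactions)
rule_confidence = confidence({"V1"}, {"V2"}, interactions)
print(rule_support, rule_confidence)
```

Under this reading, a human protein known to interact with V1 but not yet with V2 would be a candidate for a predicted interaction, with the rule's confidence indicating how strongly the data support it.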

A Survey and Comparative Study of Statistical Tests for Identifying Differential Expression from Microarray Data

Bandyopadhyay

Mallik

Mukhopadhyay

2014

IEEE/ACM Trans. Comput. Biol. and Bioinf.

105

DNA microarray is a powerful technology that can simultaneously determine the levels of thousands of transcripts (generated, for example, from genes/miRNAs) across different experimental conditions or tissue samples. The aim of differential expression analysis is to identify the transcripts whose expression changes significantly across different types of samples or experimental conditions. A number of statistical testing methods are available for this purpose. In this paper, we provide a comprehensive survey of different parametric and non-parametric testing methodologies for identifying differential expression from microarray data sets. The performance of the different testing methods has been compared on some real-life miRNA and mRNA expression data sets. To validate the resulting differentially expressed miRNAs, the outcomes of each test are checked against the information available for miRNA in the standard miRNA database PhenomiR 2.0. Subsequently, we have prepared simulated data sets of different sample sizes (from 10 to 100 per group/population), and the power of each test has been calculated individually. The comparative simulation study may help in forming robust and comprehensive judgements about the performance of each test on the basis of the assumed data distribution. Finally, a list of advantages and limitations of the different statistical tests has been provided, along with indications of some areas where further studies are required.
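To make the parametric versus non-parametric distinction concrete, here is a minimal Python sketch of one test statistic from each family: Welch's t statistic and the Mann-Whitney U statistic. The abstract does not list which tests the survey covers, so this choice is an assumption, and the expression values are made up. Only the statistics are computed; p-values are left out.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch's t statistic (parametric, unequal variances allowed)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def mann_whitney_u(a: Sequence[float], b: Sequence[float]) -> float:
    """Mann-Whitney U for group a (non-parametric, rank based):
    the number of (x, y) pairs with x > y, counting ties as one half."""
    return sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in a
        for y in b
    )


# Hypothetical expression values of one transcript in two sample groups.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]

t_statistic = welch_t(group_a, group_b)
u_statistic = mann_whitney_u(group_a, group_b)
print(t_statistic, u_statistic)
```

The t statistic relies on means and variances, so it is sensitive to the assumed data distribution, while U depends only on the ordering of the values.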

Machine learning techniques for sequence-based prediction of viral–host interactions between SARS-CoV-2 and human proteins

2020

Background: COVID-19 (Coronavirus Disease-19), a disease caused by the SARS-CoV-2 virus, was declared a pandemic by the World Health Organization on March 11, 2020. Over 15 million people have already been affected worldwide by COVID-19, resulting in more than 0.6 million deaths. Protein–protein interactions (PPIs) play a key role in the cellular process of SARS-CoV-2 virus infection in the human body. Recently a study has reported some SARS-CoV-2 proteins that interact with several human proteins, while many potential interactions remain to be identified. Method: In this article, various machine learning models are built to predict the PPIs between the virus and human proteins that are further validated using biological experiments. The classification models are prepared based on different sequence-based features of human proteins, such as amino acid composition, pseudo amino acid composition, and conjoint triad. Result: We have built an ensemble voting classifier using SVM Radial, SVM Polynomial, and Random Forest techniques that gives greater accuracy, precision, specificity, recall, and F1 score than all other models used in the work. A total of 1326 potential human target proteins of SARS-CoV-2 have been predicted by the proposed ensemble model and validated using gene ontology and KEGG pathway enrichment analysis. Several repurposable drugs targeting the predicted interactions are also reported. Conclusion: This study may encourage the identification of potential targets for more effective anti-COVID drug discovery.
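Two ingredients named in the abstract, the amino acid composition feature and the voting step of an ensemble, can be sketched in plain Python. The sketch makes several assumptions: the vote is a hard majority vote (the abstract does not say whether voting is hard or soft), the three base predictions are hard-coded stand-ins rather than outputs of trained SVM or Random Forest models, and the short sequence is not a real protein.

```python
from collections import Counter
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence: str) -> Dict[str, float]:
    """Fraction of each standard amino acid among the standard
    residues of the sequence (a 20-dimensional feature vector)."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def majority_vote(predictions: List[int]) -> int:
    """Hard vote: the label predicted by most base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical inputs: a toy sequence and the 0/1 labels that three
# base classifiers might assign to one human protein.
features = amino_acid_composition("ACDA")
base_predictions = [1, 0, 1]
ensemble_label = majority_vote(base_predictions)
print(features["A"], ensemble_label)
```

With an odd number of binary base classifiers the majority is always well defined, which is one practical reason to combine three models.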

Anirban Mukhopadhyay

Multiobjective Genetic Clustering for Pixel Classification in Remote Sensing Imagery
