Constantly improving gene expression technology offers the ability to measure the expression levels of thousands of genes in parallel. Gene expression data are expected to significantly aid the development of efficient cancer diagnosis and classification platforms. A key issue that needs to be addressed is the selection of a small number of genes that contribute to a disease from the thousands of genes measured on microarrays, which are inherently noisy. This work deals with finding a small subset of informative genes from gene expression microarray data that maximises classification accuracy. This paper introduces a new hybrid of a Genetic Algorithm and a Support Vector Machine for gene selection and classification. We show that the classification accuracy of the proposed algorithm is superior to that of a number of current state-of-the-art methods on two widely used benchmark datasets. The informative genes from the best subset are validated by comparing them with findings reported by biology and computer science researchers in order to explore their biological plausibility.
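The wrapper idea described above (a Genetic Algorithm searching over gene subsets, scored by a classifier) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the data are synthetic, the function names (`make_data`, `loo_centroid_accuracy`, `ga_select`) are hypothetical, the size penalty is arbitrary, and a leave-one-out nearest-centroid classifier stands in for the Support Vector Machine so that the example needs only the standard library.

```python
import random


def make_data(n_samples=30, n_genes=20, informative=(0, 1), seed=1):
    """Synthetic two-class 'expression' data; only `informative` genes carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in informative:
            row[g] += 3.0 * label
        X.append(row)
        y.append(label)
    return X, y


def loo_centroid_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier (stand-in for the SVM)."""
    if not genes:
        return 0.0
    classes = sorted(set(y))
    sums = {c: [0.0] * len(genes) for c in classes}
    counts = {c: 0 for c in classes}
    for row, label in zip(X, y):
        counts[label] += 1
        for k, g in enumerate(genes):
            sums[label][k] += row[g]
    correct = 0
    for row, label in zip(X, y):
        best_class, best_dist = None, float("inf")
        for c in classes:
            own = c == label
            n = counts[c] - (1 if own else 0)  # hold the test sample out
            if n == 0:
                continue
            dist = 0.0
            for k, g in enumerate(genes):
                total = sums[c][k] - (row[g] if own else 0.0)
                dist += (row[g] - total / n) ** 2
            if dist < best_dist:
                best_class, best_dist = c, dist
        correct += best_class == label
    return correct / len(y)


def fitness(mask, X, y, size_penalty=0.1):
    """Accuracy minus a small penalty that favours smaller gene subsets."""
    genes = [i for i, bit in enumerate(mask) if bit]
    return loo_centroid_accuracy(X, y, genes) - size_penalty * len(genes) / len(mask)


def ga_select(X, y, pop_size=20, generations=15, mutation_rate=0.05, seed=0):
    """Bit-string GA: tournament selection, one-point crossover, bit-flip mutation, elitism."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    population = [[rng.random() < 0.2 for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(((fitness(m, X, y), m) for m in population),
                        key=lambda pair: pair[0], reverse=True)
        next_pop = [scored[0][1][:], scored[1][1][:]]  # keep the two best unchanged
        while len(next_pop) < pop_size:
            a = max(rng.sample(scored, 3), key=lambda pair: pair[0])[1]
            b = max(rng.sample(scored, 3), key=lambda pair: pair[0])[1]
            cut = rng.randrange(1, n_genes)
            child = a[:cut] + b[cut:]
            child = [(not bit) if rng.random() < mutation_rate else bit for bit in child]
            next_pop.append(child)
        population = next_pop
    best_score, best_mask = max(((fitness(m, X, y), m) for m in population),
                                key=lambda pair: pair[0])
    return [i for i, bit in enumerate(best_mask) if bit], best_score


X, y = make_data()
selected, score = ga_select(X, y)
print("selected genes:", selected, "fitness:", round(score, 3))
```

In a setting closer to the abstract, `loo_centroid_accuracy` would be replaced by a cross-validated SVM score; the GA loop itself would not need to change.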
A random forest method has been selected to perform both gene selection and classification of the microarray data. In this
embedded method, selecting the smallest possible sets of genes with the lowest error rates is the key factor in achieving the highest
classification accuracy. Hence, an improved gene selection method using random forest is proposed to obtain the smallest
subset of genes as well as the biggest subset of genes prior to classification. The option of selecting the biggest subset is provided to assist
researchers who intend to use the informative genes for further research. The enhanced random forest gene selection performed
better in selecting both the smallest and the biggest subsets of informative genes with the lowest out-of-bag error rates.
Furthermore, classification performed on the selected subsets of genes using random forest led to
lower prediction error rates than the existing method and other similar available methods.
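One way to picture the selection of smallest and biggest gene subsets by out-of-bag (OOB) error is iterative backward elimination with a random forest. The sketch below is an illustration under stated assumptions, not the enhanced method of the abstract: it uses scikit-learn's `RandomForestClassifier` (with its `oob_score_` and `feature_importances_` attributes), synthetic data, and hypothetical helper names (`backward_elimination`, `smallest_and_biggest`) and parameters (`drop_fraction`, `tolerance`).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def backward_elimination(X, y, drop_fraction=0.2, n_trees=100, min_genes=2, seed=0):
    """Repeatedly drop the least important genes, recording the OOB error of each subset."""
    genes = np.arange(X.shape[1])
    history = []
    while True:
        forest = RandomForestClassifier(n_estimators=n_trees, oob_score=True,
                                        random_state=seed)
        forest.fit(X[:, genes], y)
        history.append((genes.copy(), 1.0 - forest.oob_score_))
        if len(genes) <= min_genes:
            break
        n_keep = max(min_genes, int(len(genes) * (1.0 - drop_fraction)))
        ranked = np.argsort(forest.feature_importances_)[::-1]  # most important first
        genes = genes[ranked[:n_keep]]
    return history


def smallest_and_biggest(history, tolerance=0.0):
    """Among subsets within `tolerance` of the lowest OOB error, return both extremes."""
    best_error = min(err for _, err in history)
    good = [g for g, err in history if err <= best_error + tolerance]
    smallest = min(good, key=len)
    biggest = max(good, key=len)
    return sorted(smallest.tolist()), sorted(biggest.tolist()), best_error


# Synthetic example (assumed): 60 samples, 30 genes, genes 0-2 carry the class signal.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 30))
X[y == 1, :3] += 2.0

history = backward_elimination(X, y)
smallest, biggest, oob_error = smallest_and_biggest(history)
print("lowest OOB error:", round(oob_error, 3))
print("smallest subset:", smallest)
print("biggest subset:", biggest)
```

Because each subset is nested inside the previous one, the smallest subset returned is always contained in the biggest one; raising `tolerance` trades a slightly higher OOB error for fewer genes.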
This paper introduces an improved Differential Evolution algorithm (IDE) that aims to improve performance in estimating the relevant parameters of metabolic pathway data in order to simulate the glycolysis pathway of yeast. Metabolic pathway data are expected to be of significant help in the development of efficient tools for kinetic modeling and parameter estimation platforms. Many computational algorithms face obstacles due to noisy data and the difficulty of estimating a myriad of parameters, and they require long computation times to estimate the relevant parameters. The proposed algorithm (IDE) is a hybrid of a Differential Evolution algorithm (DE) and a Kalman Filter (KF). IDE is shown to be superior to the Genetic Algorithm (GA) and DE. The experimental results show that IDE yields estimated optimal kinetic parameter values, shorter computation time and increased accuracy of the simulated results compared with the other estimation algorithms.
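To make the parameter-estimation setting concrete, the following is a minimal sketch of plain Differential Evolution (the DE/rand/1/bin scheme) fitting two kinetic parameters. It is not the IDE hybrid: no Kalman Filter step is included. The Michaelis-Menten rate law, the noise-free synthetic measurements, the bounds and all function names are assumptions chosen for illustration; the glycolysis model of the paper is not reproduced here.

```python
import random


def michaelis_menten(s, vmax, km):
    """Reaction rate at substrate concentration s (toy kinetic model, assumed)."""
    return vmax * s / (km + s)


def sse(params, data):
    """Sum of squared errors between measured and simulated rates."""
    vmax, km = params
    return sum((v - michaelis_menten(s, vmax, km)) ** 2 for s, v in data)


def differential_evolution(cost, bounds, pop_size=20, f_weight=0.8, cr=0.9,
                           generations=200, seed=0):
    """DE/rand/1/bin with clipping of trial vectors to the given bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    costs = [cost(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated coordinate
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == j_rand:
                    value = pop[a][d] + f_weight * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_cost = cost(trial)
            if trial_cost <= costs[i]:  # greedy replacement
                pop[i], costs[i] = trial, trial_cost
    best = min(range(pop_size), key=costs.__getitem__)
    return pop[best], costs[best]


# Noise-free synthetic measurements generated with Vmax = 2.0 and Km = 0.5 (assumed).
substrate = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0]
data = [(s, michaelis_menten(s, 2.0, 0.5)) for s in substrate]
bounds = [(0.1, 10.0), (0.01, 10.0)]  # search ranges for (Vmax, Km)

estimate, error = differential_evolution(lambda p: sse(p, data), bounds)
print("estimated Vmax, Km:", [round(x, 3) for x in estimate], "SSE:", error)
```

With noisy measurements the recovered values would only approximate the generating parameters; handling that noise is the role the abstract assigns to the Kalman Filter component.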