Microarray technology has enabled us to simultaneously measure the expression of thousands of genes. Using this high-throughput technology, we can examine subtle genetic changes between biological samples and build predictive models for clinical applications. Although microarrays have dramatically increased the rate of data collection, small sample size remains a major issue when selecting features. Previous work shows that combining multiple microarray datasets improves feature selection with simple methods such as fold change. We propose a wrapper-based gene selection technique that combines bootstrap-estimated classification errors for individual genes across multiple datasets and reduces the contribution of datasets with high variance. We use the bootstrap because it is an unbiased estimator of classification error that is also effective for small-sample data. We show, using prostate and renal cancer microarray data, that this meta-analytic approach, which couples bootstrap error estimation with data combination across multiple datasets, improves the biological relevance of gene selection.
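The idea of combining per-gene bootstrap errors across datasets while down-weighting high-variance datasets can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify the classifier, the bootstrap variant, or the weighting rule, so the one-gene nearest-centroid classifier, the out-of-bag error estimate, the inverse-variance weights, and the function names `bootstrap_gene_error` and `combined_gene_scores` are all assumptions made for illustration.

```python
import numpy as np


def bootstrap_gene_error(x, y, n_boot=50, rng=None):
    """Out-of-bag bootstrap error of a one-gene nearest-centroid classifier.

    x: 1-D expression values for one gene; y: 0/1 class labels (NumPy arrays).
    Returns (mean error, variance of error) over the usable replicates.
    The classifier and the out-of-bag scheme are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(y)
    errors = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)        # resample with replacement
        oob = np.setdiff1d(np.arange(n), idx)   # samples left out of this replicate
        x_tr, y_tr = x[idx], y[idx]
        if oob.size == 0 or np.unique(y_tr).size < 2:
            continue                            # skip degenerate replicates
        c0 = x_tr[y_tr == 0].mean()             # class centroids on the resample
        c1 = x_tr[y_tr == 1].mean()
        pred = (np.abs(x[oob] - c1) < np.abs(x[oob] - c0)).astype(int)
        errors.append(np.mean(pred != y[oob]))
    errors = np.asarray(errors)
    return errors.mean(), errors.var(ddof=1)


def combined_gene_scores(datasets, n_boot=50, seed=0, eps=1e-6):
    """Per-gene weighted mean of bootstrap errors across datasets.

    datasets: list of (X, y), X shaped (samples, genes), same gene order.
    Inverse-variance weighting is an assumed way to reduce the
    contribution of datasets whose error estimate has high variance.
    """
    rng = np.random.default_rng(seed)
    n_genes = datasets[0][0].shape[1]
    means = np.empty((len(datasets), n_genes))
    variances = np.empty_like(means)
    for d, (X, y) in enumerate(datasets):
        for g in range(n_genes):
            means[d, g], variances[d, g] = bootstrap_gene_error(
                X[:, g], y, n_boot, rng
            )
    weights = 1.0 / (variances + eps)           # high variance -> small weight
    return (weights * means).sum(axis=0) / weights.sum(axis=0)
```

Under these assumptions, genes would be ranked by ascending combined score, for example with `np.argsort(combined_gene_scores(datasets))`, so that genes with low and stable error across datasets come first.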