Selection bias in gene extraction on the basis of microarray gene-expression data

Ambroise, Christophe; McLachlan, Geoffrey J.

doi:10.1073/pnas.102102699

Cited by 1,227 publications

(949 citation statements)

References 25 publications

Citation statements by type: Supporting, Mentioning (923), Contrasting, Unclassified

“…Thus, the features were ranked 10 times, each time independent of the hold-out samples for that given trial. This ensured that the estimated generalization performance is unbiased by spurious features that might explain the training class labels but might not generalize to the population (Ambroise and McLachlan, 2002).…”

Section: Machine Learning Methods and Analysis (mentioning)

confidence: 99%

Machine learning classification of mesial temporal sclerosis in epilepsy patients

Rudie, Colby, Salamon

2015

Epilepsy Research

Background and purpose: Novel approaches applying machine-learning methods to neuroimaging data seek to develop individualized measures that will aid in the diagnosis and treatment of brain-based disorders such as temporal lobe epilepsy (TLE). Using a large cohort of epilepsy patients with and without mesial temporal sclerosis (MTS), we sought to automatically classify MTS using measures of cortical morphology, and to further relate classification probabilities to measures of disease burden. Materials and methods: Our sample consisted of high-resolution T1 structural scans of 169 adults with epilepsy collected across five different 1.5 T and four different 3 T scanners at UCLA. We applied a multiple support vector machine recursive feature elimination algorithm to morphological measures generated from FreeSurfer's automated segmentation and parcellation in order to classify epilepsy patients with MTS (n = 85) from those without MTS (n = 84). Results: In addition to hippocampal volume, we found that alterations in cortical thickness, surface area, volume and curvature in inferior frontal and anterior and inferior temporal regions contributed to a classification accuracy of up to 81% (p = 1.3 × 10⁻¹⁷) in identifying MTS. We also found that MTS classification probabilities were associated with a longer duration of disease for epilepsy patients both with and without MTS. Conclusions: In addition to implicating extra-hippocampal involvement of MTS, these findings shed further light on the pathogenesis of TLE and may ultimately assist in the development of automated tools that incorporate multiple neuroimaging measures to assist clinicians in detecting more subtle cases of TLE and MTS.

“…But, if it is calculated within the feature selection process, there is a selection bias in it when it is used as an estimate of the prediction error (Ambroise and McLachlan, 2002). External cross-validation should be undertaken subsequent to the feature selection process to correct for this selection bias.…”

Section: Principal Component Of Cortical Thickness As a Feature For P… (mentioning)

confidence: 99%
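
The selection bias described in the statement above can be illustrated with a minimal sketch. It is not the procedure of any paper listed here: the pure-noise data, the difference-of-means feature ranking, the nearest-centroid classifier, and all function names (`top_k_features`, `nearest_centroid_predict`, `cv_accuracy`) are assumptions chosen only to keep the example self-contained. The sketch compares cross-validation in which features are selected once on all samples (biased) with external cross-validation in which features are re-selected inside each training fold.

```python
import random


def top_k_features(X, y, k):
    """Rank features by absolute difference of class means; return the k best indices."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        g0 = [row[j] for row, label in zip(X, y) if label == 0]
        g1 = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append(abs(sum(g1) / len(g1) - sum(g0) / len(g0)))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def nearest_centroid_predict(X_train, y_train, x, features):
    """Assign x to the class whose training mean, on the chosen features, is closer."""
    dists = {}
    for label in (0, 1):
        rows = [r for r, l in zip(X_train, y_train) if l == label]
        centroid = [sum(r[j] for r in rows) / len(rows) for j in features]
        dists[label] = sum((x[j] - c) ** 2 for j, c in zip(features, centroid))
    return 0 if dists[0] <= dists[1] else 1


def cv_accuracy(X, y, k, n_folds, select_inside_fold):
    """Cross-validated accuracy with feature selection inside or outside the folds."""
    n = len(X)
    folds = [list(range(i, n, n_folds)) for i in range(n_folds)]
    # Biased variant: features are chosen once, using every sample's label.
    global_features = None if select_inside_fold else top_k_features(X, y, k)
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        # External variant: features are re-selected from the training fold only.
        features = top_k_features(X_tr, y_tr, k) if select_inside_fold else global_features
        for i in test_idx:
            correct += int(nearest_centroid_predict(X_tr, y_tr, X[i], features) == y[i])
    return correct / n


# Synthetic data with no real signal: labels are unrelated to the features.
rng = random.Random(0)
n_samples, n_features = 40, 2000
X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
y = [i % 2 for i in range(n_samples)]

biased_acc = cv_accuracy(X, y, k=10, n_folds=5, select_inside_fold=False)
external_acc = cv_accuracy(X, y, k=10, n_folds=5, select_inside_fold=True)
print(f"selection outside CV (biased): {biased_acc:.2f}")
print(f"selection inside CV (external): {external_acc:.2f}")
```

Because the labels carry no information about the features, an honest estimate should sit near chance. Selecting features on all samples before cross-validation typically yields an accuracy well above that, which is the optimistic bias the statement refers to.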

Pattern classification using principal components of cortical thickness and its discriminative pattern in schizophrenia

Yoon, Lee, et al. 2007

NeuroImage

We proposed pattern classification based on principal components of cortical thickness between schizophrenic patients and healthy controls, which was trained using a leave-one-out cross-validation. The cortical thickness was measured by calculating the Euclidean distance between linked vertices on the inner and outer cortical surfaces. Principal component analysis was applied to each lobe for practical computational issues and stability of principal components. Discriminative patterns derived at every vertex in the original feature space with respect to the support vector machine were analyzed against definitive findings of brain abnormalities in schizophrenia to establish practical confidence. Classification was simulated with 50 randomly selected validation sets to assess generalization, and the average accuracy was reported. This study showed that some principal components might be more useful than others for classification, but not necessarily matching the ordering of the variance amounts they explained. In particular, 40-70 principal components, reordered by a simple two-sample t-test that ranked the effectiveness of features, were used for the best mean accuracy of simulated classification (frontal: (left(%)|right(%)) = 91.07|88.80, parietal: 91.40|91.53, temporal: 93.60|91.47, occipital: 88.80|91.60). Discriminative power appeared more spatially diffused bilaterally in several regions, especially the precentral, postcentral, superior frontal and temporal, cingulate and parahippocampal gyri. Since our results of discriminative patterns derived from the classifier were consistent with a previous morphological analysis of schizophrenia, it can be said that cortical thickness is a reliable feature for pattern classification, and the potential benefits of such diagnostic tools are enhanced by our finding.

“…However, these tools often present errors related to gene selection bias (Ambroise and McLachlan, 2002) that can lead to severe underestimations of prediction errors. We used (Medina et al, 2007), a tool unique in its kind, which implements a cross-validation strategy that renders unbiased error estimations.…”

Section: Identification Of a Molecular Prognosis Signature For Thyroi… (mentioning)

confidence: 99%

Molecular profiling related to poor prognosis in thyroid carcinoma. Combining gene expression data and biological information

et al. 2007

Undifferentiated and poorly differentiated thyroid tumors are responsible for more than half of thyroid cancer patient deaths in spite of their low incidence. Conventional treatments do not obtain substantial benefits, and the lack of alternative approaches limits patient survival. Additionally, the absence of prognostic markers for well-differentiated tumors complicates patient-specific treatments and favors the progression of recurrent forms. In order to recognize the molecular basis involved in tumor dedifferentiation and identify potential markers for thyroid cancer prognosis prediction, we analysed the expression profile of 44 thyroid primary tumors with different degrees of dedifferentiation and aggressiveness using cDNA microarrays. Transcriptome comparison of dedifferentiated and well-differentiated thyroid tumors identified 1031 genes with >2-fold difference in absolute values and false discovery rate of <0.15. According to known molecular interaction and reaction networks, the products of these genes were mainly clustered in the MAP kinase signaling pathway, the TGF-β signaling pathway, focal adhesion and cell motility, activation of actin polymerization and cell cycle. An exhaustive search in several databases allowed us to identify various members of the matrix metalloproteinase, melanoma antigen A and collagen gene families within the upregulated gene set. We also identified a prognosis classifier comprising just 30 transcripts with an overall accuracy of 95%. These findings may clarify the molecular mechanisms involved in thyroid tumor dedifferentiation and provide a potential prognosis predictor as well as targets for new therapies.
