2007
DOI: 10.1186/1471-2105-8-415
Prediction potential of candidate biomarker sets identified and validated on gene expression data from multiple datasets

Abstract: Background: Independently derived expression profiles of the same biological condition often have few genes in common. In this study, we created populations of expression profiles from publicly available microarray datasets of cancer (breast, lymphoma and renal) samples linked to clinical information with an iterative machine learning algorithm. ROC curves were used to assess the prediction error of each profile for classification. We compared the prediction error of profiles correlated with molecular phenotype …
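As a rough illustration of the ROC-based assessment the abstract describes, the Python sketch below scores a hypothetical candidate gene set by cross-validated ROC AUC. The simulated expression matrix, labels, gene signature, and the logistic-regression classifier are assumptions for illustration, not the study's actual iterative algorithm.

```python
# Hypothetical sketch: assess a candidate biomarker set with ROC analysis.
# The expression matrix, labels, and gene signature are simulated stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_samples, n_genes = 120, 5000
X = rng.normal(size=(n_samples, n_genes))                 # expression profiles
y = rng.integers(0, 2, size=n_samples)                    # clinical class labels
signature = rng.choice(n_genes, size=30, replace=False)   # candidate biomarker set

# Cross-validated class probabilities using only the signature genes
clf = LogisticRegression(max_iter=1000)
probs = cross_val_predict(clf, X[:, signature], y, cv=5, method="predict_proba")[:, 1]
print("ROC AUC of the candidate set:", round(roc_auc_score(y, probs), 3))
```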

Cited by 24 publications (24 citation statements). References 73 publications (73 reference statements).
“…For example, a recent study on microarray data related to breast cancer, renal tumors and lymphoma and including clinical information compared the prediction errors using different training sets. The results suggested that expression profiles established in this way showed little overlap [21]. We achieved significant prediction success using different gene sets established on different microarray platforms, and therefore we provide additional support to this finding.…”
supporting
confidence: 72%
“…The use of a limited number of sensitive and specific biomarkers for diagnosing a given disease (such as glucose in diabetes) has been established for many years. Advances in biomarker discovery methods have provided new opportunities to introduce efficient biomarkers, or sets of biomarkers (panels), associated with the studied diseases [21]. Proteomics, genomics, metabolomics and other high-throughput methods are widely used to identify new biomarkers.…”
Section: Discussion
mentioning
confidence: 99%
“…Class prediction problems for microarray data are similar to class prediction problems in other areas, so most of the classic class prediction methods have been applied to microarray data analysis: LDA projects the samples into a one-dimensional space that maximizes the distance between classes while minimizing the distance within each class [46]; k-NN predicts the test sample from the class labels of the k samples nearest to it [37,45,46,58,60]; a decision tree is a classifier in the form of a tree structure in which each node either indicates the label of the test sample or specifies a test that selects which sub-tree to follow [67]; SVM seeks an optimal hyper-plane that separates the two classes and maximizes the distance between the hyper-plane and the data points closest to it [24,47,50-52,56,57]; and artificial neural networks take genes as input nodes and the class label as the output node, learn the parameters connecting nodes of different layers, and predict the unknown samples [68-70]. Modifications of the classic methods have also been applied to this problem; for example, Linder et al. proposed the subsequent artificial neural network (ANN) method, which uses two levels of ANNs.…”
Section: Class Prediction
mentioning
confidence: 99%
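The classifiers named in the statement above (LDA, k-NN, decision trees, SVM, and artificial neural networks) all have standard implementations; a minimal, hedged comparison on simulated expression data might look like the sketch below. The data, hyper-parameters, and scikit-learn models are illustrative assumptions, not the methods of any cited study.

```python
# Compare classic class-prediction methods on a simulated samples-by-genes matrix.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 200))          # 100 samples x 200 genes (simulated)
y = rng.integers(0, 2, size=100)         # binary class labels

models = {
    "LDA": LinearDiscriminantAnalysis(),
    "k-NN (k=5)": KNeighborsClassifier(n_neighbors=5),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "SVM (linear)": SVC(kernel="linear"),
    "ANN (one hidden layer)": MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name:24s} mean CV accuracy = {acc:.2f}")
```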
“…where μ_ki is the mean value of gene i in class k and σ_ki is the standard deviation of gene i in class k [1,45,46]; the ratio of the between-class sum of squares to the within-class sum of squares [46], the correlation between gene expression G_i and class label Y [42,47], and entropy-based methods [48] are also applied in the feature selection process. Because most of the filter methods consider only one gene at a time, they fail to identify the combined effect of several genes.…”
Section: Dimension Reduction
mentioning
confidence: 99%
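The statement above describes filter-style feature selection built from per-class means and standard deviations. The exact score is not shown in the excerpt, so the signal-to-noise-style ratio below is an assumption chosen only to illustrate the idea of scoring each gene independently.

```python
# Per-gene filter score from class means and standard deviations (illustrative).
import numpy as np

def filter_scores(X, y):
    """Score gene i by |mu_1i - mu_2i| / (sigma_1i + sigma_2i) for two classes."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    sd0, sd1 = X0.std(axis=0), X1.std(axis=0)
    return np.abs(mu0 - mu1) / (sd0 + sd1 + 1e-12)   # epsilon avoids division by zero

rng = np.random.default_rng(2)
X = rng.normal(size=(80, 1000))            # 80 samples x 1000 genes (simulated)
y = rng.integers(0, 2, size=80)
scores = filter_scores(X, y)
top_genes = np.argsort(scores)[::-1][:20]  # keep the 20 highest-scoring genes
print("Top-ranked genes:", top_genes[:5])
```

Because each gene is scored in isolation, combinations of genes that are informative only jointly go undetected, which is exactly the limitation the quoted statement points out.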