Bioinformatics Analysis Identifying Key Biomarkers in Bladder Cancer

Zhang, Chuan; Berndt‐Paetz, Mandy; Neuhaus, Jochen

doi:10.3390/data5020038

Cited by 4 publications

(3 citation statements)

References 42 publications


“…Data was organized by [8] and released under creative commons license. Authors of the data-set provided N = 406 anonymized clinical samples containing gene expression values of 14 hub genes related to bladder cancer.…”

Section: Methods (mentioning)

confidence: 99%

“…Nonetheless, these groups do not match perfectly the hub and seed sets, suggesting that genes among these two groups do not contribute equally to the informative content of the data-set. In the original paper [8], Dr. Zhang described an opposite behavior of CRYAB, TPM1, and CASQ2 genes compared to other hub genes. The negative correlation between these three genes and the other hub genes appears on the dendrogram, with their inclusion in the seed group.…”

Section: Original Data Assessment (mentioning)

confidence: 99%

“…The negative correlation between these three genes and the other hub genes appears on the dendrogram, with their inclusion in the seed group. Moreover, gene expression data was pre-selected by medical doctors that authored the original data-set [8], and feature selection techniques based on variance or correlation may not consider all the intuitions underlying their research. To preserve all the knowledge present in the data-set and reduce the feature space removing redundant information, dimensionality reduction was usually preferred to feature selection because it creates new synthetic features by combining the original ones.…”

Section: Original Data Assessment (mentioning)

confidence: 99%


Double-stage discretization approaches for biomarker-based bladder cancer survival modeling

Nascimben; Manolo; Rimondini. 2021. Communications in Applied and Industrial Mathematics


Bioinformatic techniques targeting gene expression data require specific analysis pipelines for studying properties, adaptation, and disease outcomes in a sample population. The present investigation compared the results of four numerical experiments modeling survival rates from bladder cancer genetic profiles. The research showed that a sequence of two discretization phases produced remarkable results compared to a classic approach employing a single discretization of gene expression data. The two-phase analysis consisted of either a primary discretizer followed by a refinement step, or a pre-binning of input values before the main discretization scheme. Among all tests, the best model comprises a data transformation to compensate for skewness, a data discretization phase using the class-attribute interdependence maximization algorithm, and a final classification by voting feature intervals, a classifier that also provides discrete interval optimization.
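The idea of pre-binning values before a main discretization scheme can be illustrated with a minimal Python sketch. It is not the pipeline from the cited paper: the class-attribute interdependence maximization algorithm and the voting feature intervals classifier are not implemented here, and plain equal-frequency binning stands in for the main discretizer. All function names, bin counts, and sample values are hypothetical and chosen only for illustration.

```python
import math
from bisect import bisect_right
from statistics import quantiles


def log_transform(values):
    """Compress right-skewed, non-negative values with log(1 + x)."""
    return [math.log1p(v) for v in values]


def prebin_equal_width(values, n_bins=20):
    """Stage 1 (pre-binning): snap each value to the midpoint of a fine equal-width bin."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)  # constant input: nothing to bin
    width = (hi - lo) / n_bins
    binned = []
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # the maximum falls in the last bin
        binned.append(lo + (idx + 0.5) * width)
    return binned


def discretize_equal_frequency(values, n_bins=3):
    """Stage 2 (main discretizer, assumed): integer labels from quantile cut points.

    This is a stand-in for a supervised discretizer, not the algorithm named above.
    """
    cuts = quantiles(values, n=n_bins)  # returns n_bins - 1 cut points
    return [bisect_right(cuts, v) for v in values]


def two_stage_discretize(values, pre_bins=20, final_bins=3):
    """Skewness compensation, then pre-binning, then the main discretization."""
    prebinned = prebin_equal_width(log_transform(values), pre_bins)
    return discretize_equal_frequency(prebinned, final_bins)


# Hypothetical expression values for one gene across nine samples.
expression = [0.2, 0.5, 0.9, 1.4, 2.0, 3.5, 8.0, 20.0, 55.0]
labels = two_stage_discretize(expression)
print(labels)
```

Every stage is monotone, so the ordering of samples is preserved; only the resolution of the values changes from one stage to the next.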


Identification of key markers for the stages of nonalcoholic fatty liver disease: An integrated bioinformatics analysis and experimental validation

Reyes-Avendaño, Villaseñor-Altamirano, Reyes-Jimenez et al. 2024. Digestive and Liver Disease


Polygenic risk modeling of tumor stage and survival in bladder cancer

et al. 2022


Introduction: Bladder cancer assessment with non-invasive gene expression signatures facilitates the detection of patients at risk and surveillance of their status, bypassing the discomforts given by cystoscopy. To achieve accurate cancer estimation, analysis pipelines for gene expression data (GED) may integrate a sequence of several machine learning and bio-statistical techniques to model complex characteristics of pathological patterns.

Methods: Numerical experiments tested the combination of GED preprocessing by discretization with tree ensemble embeddings and nonlinear dimensionality reductions to categorize oncological patients comprehensively. Modeling aimed to identify tumor stage and distinguish survival outcomes in two situations: complete and partial data embedding. This latter experimental condition simulates the addition of new patients to an existing model for rapid monitoring of disease progression. Machine learning procedures were employed to identify the most relevant genes involved in patient prognosis and test the performance of preprocessed GED compared to untransformed data in predicting patient conditions.

Results: Data embedding paired with dimensionality reduction produced prognostic maps with well-defined clusters of patients, suitable for medical decision support. A second experiment simulated the addition of new patients to an existing model (partial data embedding): Uniform Manifold Approximation and Projection (UMAP) methodology with uniform data discretization led to better outcomes than other analyzed pipelines. Further exploration of parameter space for UMAP and t-distributed stochastic neighbor embedding (t-SNE) underlined the importance of tuning a higher number of parameters for UMAP rather than t-SNE. Moreover, two different machine learning experiments identified a group of genes valuable for partitioning patients (gene relevance analysis) and showed the higher precision obtained by preprocessed data in predicting tumor outcomes for cancer stage and survival rate (six classes prediction).

Conclusions: The present investigation proposed new analysis pipelines for disease outcome modeling from bladder cancer-related biomarkers. Complete and partial data embedding experiments suggested that pipelines employing UMAP had a more accurate predictive ability, supporting the recent literature trends on this methodology. However, it was also found that several UMAP parameters influence experimental results, therefore deriving a recommendation for researchers to pay attention to this aspect of the UMAP technique. Machine learning procedures further demonstrated the effectiveness of the proposed preprocessing in predicting patients’ conditions and determined a sub-group of biomarkers significant for forecasting bladder cancer prognosis.
