2022
DOI: 10.26502/jbsb.5107040

Joint Secondary Transcriptomic Analysis of Non-Hodgkin’s B-Cell Lymphomas Predicts Reliance on Pathways Associated with the Extracellular Matrix and Robust Diagnostic Biomarkers

Abstract: Approximately 450,000 cases of Non-Hodgkin’s lymphoma are diagnosed worldwide each year, resulting in ~240,000 deaths. A better understanding of the mechanisms of pathology shared across larger numbers of B-cell Non-Hodgkin’s Lymphoma (BCNHL) patients is sorely needed. We consequently performed a large joint secondary transcriptomic analysis of the available BCNHL RNA-sequencing projects from GEO, consisting of 322 relevant samples across ten distinct public studies, to find common underlying mechanisms and …

Citations: cited by 5 publications (7 citation statements)
References: 92 publications (95 reference statements)

“…A random forest classification method (using R randomForest version 4.6-14) was then used to classify the severity phenotype of each sample [51]. The hyperparameters for the random forest model were 10,000 decision trees per forest, the Gini index as the impurity criterion, and the square root of the number of features (genes in this case) to use for each split in the decision tree, as described previously [52], [53].…”
Section: Methods
Citation type: mentioning
confidence: 99%
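
The two quantities named in this statement, the Gini impurity and the square-root rule for the number of candidate features per split, can be illustrated with a short sketch. This is plain Python for illustration only, not the cited authors' R code; the function names `gini_impurity` and `features_per_split` are hypothetical, and rounding the square root down is an assumption that matches the default behaviour of R's randomForest for classification.

```python
import math
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a collection of class labels: 1 - sum(p_k ** 2)."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def features_per_split(n_features):
    """Candidate features per split: floor(sqrt(n_features)), never below 1."""
    return max(1, math.isqrt(n_features))


# Hypothetical numbers: 20,000 genes, and a tree node that holds
# three "severe" samples and one "mild" sample.
print(features_per_split(20000))                  # 141
print(gini_impurity(["severe"] * 3 + ["mild"]))   # 0.375
```

A pure node (all samples in one class) has impurity 0, so a split is preferred when it lowers the weighted impurity of the child nodes.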
“…Similarly, the Enrichr pathway enrichment software only required a gene symbol, log2 fold-change values, and FDR-adjusted p-values as input from the DEG list. The statistically significant differentially expressed genes (DEGs; FDR-corrected p-value < 0.05) were then subjected to signaling pathway analysis using the Signaling Pathway Impact Analysis (SPIA) algorithm with 3,000 bootstrap replicates to generate a null distribution for each of over 2,000 public signaling pathways (Tarca et al, 2009), as reported previously (Scott, Jensen & Pickett, 2021; Gray et al, 2022; Moreno et al, 2022; Rapier-Sharman, Clancy & Pickett, 2022; Ferrarini et al, 2021; Scott et al, 2022; Gifford & Pickett, 2022). The lists of pathways were derived from publicly available versions of KEGG (Aoki-Kinoshita & Kanehisa, 2007), Reactome (Jassal et al, 2020), Pathway Interaction Database (Schaefer et al, 2009), BioCarta, and Panther (Mi et al, 2017).…”
Section: Methods
Citation type: mentioning
confidence: 99%
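
The DEG filtering step in this statement (FDR-corrected p-value < 0.05) can be sketched as follows. The statement does not say which FDR procedure was used, so the Benjamini-Hochberg adjustment here is an assumption; the function names and the gene labels and p-values in the example are hypothetical placeholders, not data from the cited study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_degs(genes, pvalues, alpha=0.05):
    """Keep genes whose adjusted p-value is below alpha."""
    adjusted = benjamini_hochberg(pvalues)
    return [gene for gene, q in zip(genes, adjusted) if q < alpha]


# Placeholder genes and raw p-values, for illustration only.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
raw_p = [0.001, 0.2, 0.03, 0.9]
print(significant_degs(genes, raw_p))
```

In this toy input, GENE_C has a raw p-value of 0.03 but an adjusted value of about 0.06, so it is dropped; only genes that survive the correction would be passed on to pathway analysis.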
“…This updated algorithm retrieves additional target information, clinical trial data, automatically fetches the Reactome pathway diagrams for the signaling pathways with the highest number of targets, and accepts Reactome pathway enrichments generated by the Enrichr algorithm (Xie et al, 2021a). This additional data and prioritization method are used by the updated algorithm to generate ranked lists of targets and therapeutics that can be applicable to multiple use cases (Scott, Jensen & Pickett, 2021; Gray et al, 2022; Moreno et al, 2022; Rapier-Sharman, Clancy & Pickett, 2022). The entities in these lists can then be evaluated as candidates for condition-specific repurposing efforts based solely on the unique signaling pathway “profile” for the disease/condition of interest.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Separately, the Salmon read counts for each sample were combined into a tabular format and labeled as "resistant" or "sensitive" to treatment with Letrozole. This table was then used as input to the XGBoost algorithm [39], which uses a tree-based method to train a model from 80% of the dataset and then quantifies its performance using the remaining 20% of the data [40,41], which minimizes model overfit. For the initial analysis, the gain metric was calculated from the read counts for all detected genes across all samples.…”
Section: Target and Biomarker Prediction
Citation type: mentioning
confidence: 99%
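
The 80%/20% hold-out scheme mentioned in this statement can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the cited authors' pipeline; the function name `holdout_split`, the fixed seed, and the unstratified random shuffle are all assumptions, since the statement does not describe how samples were assigned to each partition.

```python
import random


def holdout_split(samples, labels, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into train and test partitions."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = round(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = ([samples[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([samples[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test


# Placeholder sample IDs and labels, for illustration only.
sample_ids = [f"sample_{i}" for i in range(10)]
status = ["resistant", "sensitive"] * 5
(train_x, train_y), (test_x, test_y) = holdout_split(sample_ids, status)
print(len(train_x), len(test_x))  # 8 2
```

Performance measured on the held-out 20% gives an estimate of how the model behaves on samples it was not trained on, which is how a hold-out set helps reveal overfitting.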
“…Given the gain metrics from the whole transcriptome, the number of genes/features being evaluated was reduced to the best two biomarkers from the original analysis since this is a number that is easily accommodated by qRT-PCR (or similar) molecular methods. This approach has been successfully applied previously with acceptable performance and accuracy [40,41,123]. The XGBoost algorithm was selected since prior work has shown that tree-based classifiers are faster and more accurate than other machine learning-based methods such as support vector machine, neural networks, and Bayesian approaches [126].…”
Section: Target and Biomarker Prediction
Citation type: mentioning
confidence: 99%
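
Reducing the feature set to the two highest-gain biomarkers, as this statement describes, amounts to ranking features by their gain score and keeping the top entries. The sketch below assumes the gain values are already available as a name-to-score mapping (in the XGBoost Python package such a mapping can be obtained from a trained booster with `get_score(importance_type="gain")`); the function name `top_features_by_gain` and the gene names and scores are hypothetical placeholders.

```python
def top_features_by_gain(gain_by_feature, k=2):
    """Return the k feature names with the highest gain, best first."""
    ranked = sorted(gain_by_feature.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Made-up gain scores, for illustration only.
gains = {"GENE_A": 0.10, "GENE_B": 0.50, "GENE_C": 0.30, "GENE_D": 0.05}
print(top_features_by_gain(gains))  # ['GENE_B', 'GENE_C']
```

A model would then be retrained on only the selected features and evaluated again on held-out data, since gain computed on the full transcriptome does not guarantee that two genes alone classify well.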