2023
DOI: 10.3390/bioengineering10020231

Leveraging Scheme for Cross-Study Microbiome Machine Learning Prediction and Feature Evaluations

Abstract: The microbiota has proved to be one of the critical factors for many diseases, and researchers have been using microbiome data for disease prediction. However, models trained on one independent microbiome study may not be easily applicable to other independent studies due to the high level of variability in microbiome data. In this study, we developed a method for improving the generalizability and interpretability of machine learning models for predicting three different diseases (colorectal cancer, Crohn’s d…

Citations: cited by 1 publication (2 citation statements)
References: 66 publications
“…Additionally, in order to be able to model the combined data from the TCGA dataset and the data from our study, a novel approach to data merging (and normalisation) was utilised, building upon previous ideas on combining different datasets [25,28]. As the results are promising, a further study to identify whether the method could be generalised might be interesting.…”
Section: Discussion (mentioning)
confidence: 99%
“…Combining a generally available dataset with a part of the target dataset to increase dataset size and reduce model overfitting has been described previously [25], however, with directly mergeable data. Since TCGA and study data were measured using a different approach, a normalisation process needed to be devised to allow the data to be merged.…”
Section: Merging and Normalisation (mentioning)
confidence: 99%