2021
DOI: 10.1186/s12859-021-04049-z

R.ROSETTA: an interpretable machine learning framework

Abstract: Background: Machine learning involves strategies and algorithms that may assist bioinformatics analyses in terms of data mining and knowledge discovery. In several applications, viz. in Life Sciences, it is often more important to understand how a prediction was obtained rather than knowing what prediction was made. To this end so-called interpretable machine learning has been recently advocated. In this study, we implemented an interpretable machine learning package based on the rough set theor…

Citations: cited by 18 publications (19 citation statements)
References: 58 publications
“…After selecting the most important features from high-dimensional data, interpretable machine learning was performed by using the R.ROSETTA framework [26]. Subsequently, rule-based models were constructed, and copredictive features for each data set were estimated.…”
Section: Results (mentioning)
confidence: 99%
“…To create a ranking of the most informative features (ie, genes) that distinguish between diagnosis and relapse, the Monte Carlo Feature Selection (MCFS [25]; rmcsf v.1.2.6) algorithm was applied. Subsequently, interpretable machine learning models were built by using the R.ROSETTA R-package [26] version 2.2.9. That algorithm is based on the rough set theory.…”
Section: Methods (mentioning)
confidence: 99%
“…The initial rule-based model was built with R.ROSETTA [13] using data from 629 unique patient clinical visits (observations) and the discretised gene expression value for each DA1 and DA3 patient visit (features: 33,006 probes for 629 observations; Figure 1). This initial model had an overall prediction accuracy of 71% using 10-fold cross validation (Supplementary Fig S1 online).…”
Section: Results (mentioning)
confidence: 99%
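
The 10-fold cross validation mentioned in the statement above can be illustrated with a minimal Python sketch. This is not the R.ROSETTA procedure or the citing authors' code: the function names `k_fold_indices` and `cross_validated_accuracy`, the `fit` and `predict` callables, and the choice to pool correct predictions over all folds are assumptions made for illustration only.

```python
import random


def k_fold_indices(n_observations, k=10, seed=0):
    """Shuffle observation indices and split them into k near-equal folds."""
    indices = list(range(n_observations))
    random.Random(seed).shuffle(indices)
    # Striding by k gives folds whose sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def cross_validated_accuracy(features, labels, fit, predict, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, pool the hits.

    `fit(train_features, train_labels)` returns a model and
    `predict(model, feature_row)` returns a label; both are
    hypothetical callables supplied by the caller.
    """
    correct = 0
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct += sum(predict(model, features[i]) == labels[i] for i in held_out)
    # Every observation is tested exactly once, so this is overall accuracy.
    return correct / len(labels)
```

With 629 observations and k = 10, this split yields nine folds of 63 observations and one of 62. Averaging per-fold accuracies instead of pooling would give a slightly different figure when fold sizes differ.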
“…First, expression values were subject to data discretisation, since R.ROSETTA [13] generates rules for that data form. For each gene, the control data expression mean (μ) and standard deviation (σ) were calculated, and then all DA data for that gene projected onto this threshold frame and discretised (Low ≤ μ - 2σ < Medium > High ≥ μ - 2σ; Numeric values 1, 2, 3).…”
Section: Methods (mentioning)
confidence: 99%
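
The discretisation described in the statement above can be sketched in Python as follows. This is an illustration, not the citing authors' code: the function name `discretise` is hypothetical, the use of the sample standard deviation is an assumption, and the upper cut-off is assumed to be μ + 2σ (symmetric with the lower cut-off at μ - 2σ), which the quoted inequality does not state unambiguously.

```python
from statistics import mean, stdev


def discretise(control_values, sample_values, k=2.0):
    """Map expression values for one gene to 1 (Low), 2 (Medium), 3 (High).

    Thresholds come from the control data only:
      Low    : x <= mu - k*sigma
      High   : x >= mu + k*sigma   (assumed upper cut-off)
      Medium : everything in between
    """
    mu = mean(control_values)
    sigma = stdev(control_values)  # sample standard deviation (assumption)
    low_cut = mu - k * sigma
    high_cut = mu + k * sigma

    levels = []
    for x in sample_values:
        if x <= low_cut:
            levels.append(1)
        elif x >= high_cut:
            levels.append(3)
        else:
            levels.append(2)
    return levels
```

For example, with control values 4.0, 5.0 and 6.0 (μ = 5, σ = 1), the cut-offs are 3.0 and 7.0, so a sample value of 5.0 maps to 2 (Medium) and a value of 9.1 maps to 3 (High).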