2021
DOI: 10.48550/arxiv.2111.02513
Preprint

Evaluation of Tree Based Regression over Multiple Linear Regression for Non-normally Distributed Data in Battery Performance

Abstract: Battery performance datasets are typically non-normal and multicollinear. Extrapolating from such datasets for model predictions requires attention to these characteristics. This study explores the impact of data normality on building machine learning models. In this work, tree-based regression models and multiple linear regression models are each built from a highly skewed, non-normal dataset with multicollinearity and then compared. Several techniques, such as data transformation, are necessary to achieve a good multiple …
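To make the idea of data transformation concrete, the following is a minimal sketch rather than the authors' procedure. It assumes a log transform (the abstract does not say which transformation was used) and synthetic log-normal data standing in for a skewed measurement. The helper `sample_skewness` and all variable names are hypothetical.

```python
import math
import random
import statistics


def sample_skewness(values):
    """Return the moment-based (biased) skewness of a sequence of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Synthetic, strictly positive, right-skewed values (log-normal).
# This is illustrative data only, not battery data from the paper.
rng = random.Random(0)
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]

# Assumed transformation: natural log, which maps log-normal data to normal.
logged = [math.log(v) for v in raw]

skew_raw = sample_skewness(raw)
skew_logged = sample_skewness(logged)
print(f"skewness before log transform: {skew_raw:.2f}")
print(f"skewness after log transform:  {skew_logged:.2f}")
```

A skewness near zero after the transform suggests the variable is closer to symmetric, which is one of the conditions a multiple linear regression workflow commonly checks before fitting.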

Citations: cited by 1 publication (1 citation statement)
References: 16 publications
“…Our correlation analysis of the pools shows that the SNPs cluster into five groups with minimal interdependency, aligning with the five major LD blocks in the sequenced region (Figure S2). To tackle multicollinearity (Chowdhury et al., 2021), we used nonparametric machine learning, specifically random forest and boosting techniques (Ogutu et al., 2011). From this, we identified high feature importance variants at both CDKN2A and rs4977756 sites (Figure S3).…”
mentioning
confidence: 99%