2019
DOI: 10.23919/cjee.2019.000025
A missing power data filling method based on improved random forest algorithm

Abstract: Missing data filling is a key step in power big data preprocessing, helping to improve the quality and utilization of electric power data. Due to the limitations of traditional missing-data filling methods, an improved random forest filling algorithm is proposed. Since both the horizontal and vertical directions of electric power data exhibit time-series characteristics, the improved random forest filling method combines linear interp…

Cited by 73 publications (41 citation statements)
References 7 publications
“…2 Principle of filling random forest. The bootstrap resampling technique is firstly used where multiple samples are randomly selected from the original training dataset x to generate a new training dataset [ 32 ]. Then, multiple decision trees are built to form the random forest which then finally averages the output of each decision tree to determine the final filling result y [ 33 ].…”
Section: Methods
confidence: 99%
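The filling principle quoted above (bootstrap resampling, multiple decision trees, averaging the trees' outputs) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `fit_stump` and `rf_fill` are hypothetical names, and a depth-1 regression stump stands in for a full decision tree.

```python
import numpy as np

def fit_stump(X, y):
    # Depth-1 regression tree ("stump"): choose the single-feature
    # threshold split that minimises the squared error.
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            if len(left) == 0 or len(right) == 0:
                continue
            err = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, j, t, left.mean(), right.mean())
    _, j, t, lv, rv = best
    return lambda x: lv if x[j] <= t else rv

def rf_fill(X, y, x_missing, n_trees=50, seed=0):
    # Fill one missing target value: bootstrap-resample the training set,
    # fit one tree per resample, and average the trees' predictions.
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(y), len(y))   # bootstrap resample
        tree = fit_stump(X[idx], y[idx])
        preds.append(tree(x_missing))
    return float(np.mean(preds))
```

Because each tree sees a different bootstrap resample, the averaged output is less sensitive to any single noisy record than one tree alone.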
See 1 more Smart Citation
“…2 Principle of filling random forest. The bootstrap resampling technique is firstly used where multiple samples are randomly selected from the original training dataset x to generate a new training dataset [ 32 ]. Then, multiple decision trees are built to form the random forest which then finally averages the output of each decision tree to determine the final filling result y [ 33 ].…”
Section: Methodsmentioning
confidence: 99%
“…Unlike the bagging learning algorithm, where the models are made independently, gradient boosting makes its models sequentially by iteration to minimize the error of models learned earlier [34]. The gradient boosting algorithm learns a predictive model by combining M additive tree models (T0, T1, …, Tn) to predict the results.…”
Section: Gradient Boosting Algorithm
confidence: 99%
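The sequential fitting described in this statement can be sketched for squared loss: each round fits a new tree to the residuals left by the ensemble so far, unlike bagging, where trees are fitted independently. The function names are hypothetical and a depth-1 stump again stands in for a full tree.

```python
import numpy as np

def fit_stump(X, r):
    # Depth-1 regression tree fitted to the current residuals r.
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:       # exclude max so both sides are non-empty
            m = X[:, j] <= t
            lv, rv = r[m].mean(), r[~m].mean()
            err = ((r[m] - lv) ** 2).sum() + ((r[~m] - rv) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, j, t, lv, rv)
    _, j, t, lv, rv = best
    return lambda Z: np.where(Z[:, j] <= t, lv, rv)

def gb_fit_predict(X, y, X_new, n_rounds=30, lr=0.3):
    # Gradient boosting for squared loss: start from the mean, then
    # sequentially add shrunken stumps fitted to the remaining residuals.
    f = np.full(len(y), y.mean())
    pred_new = np.full(len(X_new), y.mean())
    for _ in range(n_rounds):
        stump = fit_stump(X, y - f)      # fit to current residuals
        f += lr * stump(X)               # update training predictions
        pred_new += lr * stump(X_new)    # update new-data predictions
    return pred_new
```

Each round can only shrink the training residuals, which is the sense in which later models "minimize the error of models learned earlier".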
“…2) Random forest: Decision Trees present specific difficulties when generating the model, since creating a tree with many leaves can cause an over-fitting that may not be the most appropriate decision. Random trees are, therefore, used to achieve greater assertiveness [18]. Random trees use several trees averaging the final prediction of each tree.…”
Section: Decision Tree
confidence: 99%
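The variance-reduction argument in this statement — a single over-fitted tree is noisy, but averaging many trees is stable — can be illustrated with a toy numerical check. The "trees" here are simulated noisy estimators of a known target, purely for illustration.

```python
import numpy as np

# Hypothetical illustration: each "tree" is a noisy estimator of the same
# target value; averaging their outputs (as a random forest does) yields a
# combined prediction much closer to the target than a typical single tree.
rng = np.random.default_rng(42)
truth = 5.0
tree_preds = truth + rng.normal(0.0, 1.0, size=400)  # 400 individual tree outputs
forest_pred = tree_preds.mean()                      # averaged forest output
single_spread = tree_preds.std()                     # typical single-tree error
```

For independent trees with noise variance σ², the averaged prediction has variance σ²/n, which is why the forest is more assertive than any individual tree.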
“…The performance of the space-based model is highly dependent on the correlation between inputs and outputs. With the development of artificial intelligence (AI) technology, hundreds of space-based data imputation models have been established using AI-based methods, such as k-nearest neighbour (kNN) [15], random forest (RF) [16], cumulative linear regression (CLR) [17], and extreme learning machine (ELM) [18,19]. Compared with the time-based model, the space-based model ignores the correlation between the measured values at different times.…”
Section: Literature Review
confidence: 99%
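A space-based model of the kind cited here (e.g. the kNN approach of [15]) ignores time ordering and estimates a missing reading from the rows most similar in the other features. The following is a hedged sketch, not the cited method: `knn_impute` is a hypothetical name and Euclidean distance is an assumed choice.

```python
import numpy as np

def knn_impute(X, row, col, k=3):
    # Space-based imputation: estimate X[row, col] as the mean of that
    # column over the k rows whose remaining features are closest
    # (Euclidean distance), with no use of the time index.
    others = [j for j in range(X.shape[1]) if j != col]
    d = np.linalg.norm(X[:, others] - X[row, others], axis=1)
    d[row] = np.inf                      # exclude the incomplete row itself
    nn = np.argsort(d)[:k]               # k nearest complete rows
    return X[nn, col].mean()
```

As the statement notes, such a model's accuracy hinges entirely on the cross-sectional correlation between inputs and outputs, since the temporal correlation between successive measurements is discarded.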