The number of normal samples of wind turbine generators is much larger than the number of fault samples. To solve the problem of imbalanced classification in wind turbine generator fault detection, this paper proposes a cost-sensitive extremely randomized trees (CS-ERT) algorithm, which introduces cost-sensitive learning into the extremely randomized trees (ERT) algorithm. Based on the misclassification cost and the class distribution, the misclassification cost gain (MCG) is proposed as the score measure for the CS-ERT model growth process to improve the classification accuracy of minority classes. The Hilbert-Schmidt independence criterion lasso (HSICLasso) feature selection method is used to select strongly correlated, non-redundant features of doubly-fed wind turbine generators. The effectiveness of the method was verified by experiments on four different failure datasets of wind turbine generators. The experimental results show that the average missing detection rate, average misclassification cost, and gMean of the improved algorithm are better than those of the ERT algorithm. In addition, compared with the CSForest, AdaCost and MetaCost methods, the proposed method has better real-time fault detection performance.
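The three evaluation measures named in this abstract can be illustrated with a minimal sketch. The definitions below are the commonly used ones (missing detection rate as FN / (TP + FN), gMean as the geometric mean of the true positive and true negative rates) and may differ in detail from the paper's. The function name `imbalance_metrics` and the cost values `cost_fn` and `cost_fp` are assumptions made for illustration only.

```python
from math import sqrt


def imbalance_metrics(y_true, y_pred, cost_fn=5.0, cost_fp=1.0):
    """Return (missing detection rate, average misclassification cost, gMean).

    Labels: 1 = fault (minority class), 0 = normal (majority class).
    cost_fn / cost_fp are illustrative costs of a missed fault and a
    false alarm; they are assumptions, not values from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    # Missing detection rate: share of real faults the model failed to flag.
    mdr = fn / (tp + fn) if (tp + fn) else 0.0
    # Average cost per sample, weighting missed faults more than false alarms.
    avg_cost = (fn * cost_fn + fp * cost_fp) / len(pairs) if pairs else 0.0
    # gMean balances accuracy on the fault class and on the normal class.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return mdr, avg_cost, sqrt(tpr * tnr)


if __name__ == "__main__":
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    print(imbalance_metrics(y_true, y_pred))
```

Plain accuracy would hide a poor result on the fault class when normal samples dominate, which is why measures of this kind are typically reported for imbalanced fault data.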
It is difficult to optimize the fault model parameters when Extreme Random Forest is used to detect faults in the electric pitch system of a doubly-fed wind turbine generator set. Therefore, an Extreme Random Forest optimized by an improved grey wolf algorithm (IGWO-ERF) was proposed to solve this problem. First, IGWO-ERF introduces a cosine model to make the linearly changing convergence factor α nonlinear, which balances the global exploration and local exploitation capabilities of the algorithm. Then, in the later stage of the iteration, the α wolf generates its mirror wolf based on the lens imaging learning strategy to increase the diversity of the population and keep it from becoming trapped in a local optimum. The electric pitch system fault detection method takes the generator power of the variable pitch system as the main state parameter. First, it uses the Pearson correlation coefficient method to eliminate the features that have low correlation with the electric pitch system generator power. Then, the remaining features are ranked by RF feature importance. Finally, the top N features are selected to construct the electric pitch system fault data set. The data set is divided into a training set and a test set. The training set is used to train the proposed fault detection model, and the test set is used for testing. Compared with other parameter optimization algorithms, the proposed method has lower FNR and FPR in electric pitch system fault detection for the wind turbine generator set.
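The two modifications to the grey wolf optimizer described in this abstract can be sketched as follows. The abstract does not give the formulas, so this sketch assumes forms that are common in the improved-GWO literature: a convergence factor that follows a quarter cosine wave from 2 down to 0, and a lens-imaging mirror point with scaling factor `k`. The function names, the parameter `a_max`, and both formulas are assumptions, not the paper's definitions. The code writes the convergence factor as `a` to keep it distinct from the α (leading) wolf.

```python
import math


def linear_factor(t, t_max, a_max=2.0):
    """Standard GWO convergence factor: decreases linearly from a_max to 0."""
    return a_max * (1.0 - t / t_max)


def cosine_factor(t, t_max, a_max=2.0):
    """Assumed cosine schedule: stays high early (exploration) and drops
    faster near the end of the run (exploitation)."""
    return a_max * math.cos(math.pi * t / (2.0 * t_max))


def lens_mirror(x, lower, upper, k=1.0):
    """Assumed lens-imaging opposition of position x inside [lower, upper].

    With k = 1 this reduces to plain opposition-based learning,
    x* = lower + upper - x, in every dimension.
    """
    return [
        (lo + hi) / 2.0 + (lo + hi) / (2.0 * k) - xi / k
        for xi, lo, hi in zip(x, lower, upper)
    ]


if __name__ == "__main__":
    for t in (0, 25, 50, 75, 100):
        print(t, round(linear_factor(t, 100), 3), round(cosine_factor(t, 100), 3))
    print(lens_mirror([2.0], [0.0], [10.0], k=2.0))
```

In a full optimizer the mirror wolf would replace the leading wolf only if its fitness is better, and a mirror point that falls outside the search bounds would need to be clipped; both steps are omitted here.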
To address class imbalance in UHV converter valve fault data, a fault detection method based on an optimized cost-sensitive extreme random forest is proposed. The misclassification cost gain is integrated into the extreme random forest decision trees as a splitting index, and the inertia weight and learning factor are improved to construct an improved particle swarm optimization algorithm. First, feature extraction and data cleaning are carried out to address local data loss, large computational load, and low real-time performance of the model. Then, a classifier based on the optimized cost-sensitive extreme random forest is trained to construct the fault detection model, and the improved particle swarm optimization algorithm is used to output the optimal model parameters, giving the model fast response, high classification accuracy, good robustness, and good generalization under unbalanced data. Finally, to verify its effectiveness, the model is compared with existing optimization algorithms. It runs faster and achieves higher fault detection performance, which meets practical needs.
The marine predator algorithm (MPA) is a metaheuristic algorithm proposed in 2020. It has an outstanding optimum-seeking capability, but it converges slowly and is prone to falling into local optima. To tackle these problems, this paper proposes the flexible adaptive MPA. Based on the MPA, a flexible adaptive model is proposed and applied to each of the three stages of population iteration. Experiments on nine benchmark test functions with varying dimensions show that the flexible adaptive MPA has faster convergence speed, higher convergence accuracy, and excellent robustness. Finally, the flexible adaptive MPA is applied to feature selection experiments. The experimental results on 10 commonly used UCI high-dimensional datasets and three wind turbine (WT) fault datasets show that the flexible adaptive MPA can effectively extract the key features of high-dimensional datasets, reduce the data dimensionality, and improve the effectiveness of machine learning algorithms for WT fault diagnosis (FD).
To address class imbalance in the wind turbine blade bolt operation-monitoring dataset, a fault detection method for wind turbine blade bolts based on Gaussian Mixture Model–Synthetic Minority Oversampling Technique–Gaussian Mixture Model (GSG) combined with Cost-Sensitive LightGBM (CS-LightGBM) was proposed. Since it is difficult to obtain fault samples of blade bolts, the GSG oversampling method was constructed to increase the number of fault samples in the blade bolt dataset. The method obtains the optimal number of clusters through the Bayesian information criterion (BIC) and uses a GMM with that number of clusters to cluster the fault samples in the blade bolt dataset. According to the density distribution of fault samples across clusters, new fault samples are synthesized with SMOTE within each cluster, which retains the distribution characteristics of the original fault class samples. Then, a GMM with the same initial cluster centers is used to cluster the fault class samples augmented with the new samples, and the synthetic fault samples that are not assigned to their corresponding clusters are removed. Finally, the synthetic training set was used to train the CS-LightGBM fault detection model. Additionally, the hyperparameters of CS-LightGBM were optimized by Bayesian optimization to obtain the optimal CS-LightGBM fault detection model. The experimental results show that the proposed fault detection model outperforms six comparison models, including SMOTE-LightGBM, CS-LightGBM, and K-means-SMOTE-LightGBM, in false alarm rate, missing alarm rate, and F1-score. The method can effectively detect faults in large wind turbine blade bolts.
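The intra-cluster SMOTE step of the GSG method can be illustrated with a minimal sketch. It assumes the fault samples have already been clustered (for example by a GMM whose cluster count was chosen with BIC) and that cluster labels are supplied by the caller. The function name `smote_within_clusters`, the parameters `n_new` and `k`, and the rule that allocates synthetic samples in proportion to cluster size are all assumptions made for illustration; the abstract does not specify the allocation rule. The second GMM pass that removes misplaced synthetic samples is not shown.

```python
import random
from math import dist


def smote_within_clusters(samples, labels, n_new, k=3, seed=0):
    """Synthesize roughly n_new fault samples by SMOTE-style interpolation,
    pairing each base sample only with neighbours from its own cluster.

    Returns a list of (cluster_label, synthetic_point) tuples.
    """
    rng = random.Random(seed)
    clusters = {}
    for x, c in zip(samples, labels):
        clusters.setdefault(c, []).append(tuple(x))

    synthetic = []
    for c, members in clusters.items():
        if len(members) < 2:
            continue  # a single point has no neighbour to interpolate with
        # Assumption: allocate new samples in proportion to cluster size.
        quota = round(n_new * len(members) / len(samples))
        for _ in range(quota):
            i = rng.randrange(len(members))
            base = members[i]
            # Indices of the k nearest members of the same cluster.
            others = sorted(
                (j for j in range(len(members)) if j != i),
                key=lambda j: dist(base, members[j]),
            )[:k]
            mate = members[rng.choice(others)]
            u = rng.random()  # position along the segment base -> mate
            point = tuple(b + u * (m - b) for b, m in zip(base, mate))
            synthetic.append((c, point))
    return synthetic


if __name__ == "__main__":
    faults = [(0, 0), (1, 0), (0, 1), (1, 1),
              (10, 10), (11, 10), (10, 11), (11, 11)]
    cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
    for label, point in smote_within_clusters(faults, cluster_ids, n_new=8):
        print(label, point)
```

Because every synthetic point lies on a segment between two members of the same cluster, no sample is created in the empty region between clusters, which is the property the abstract describes as retaining the distribution of the original fault class.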