Short-term load forecasting (STLF) plays a vital role in the reliable, secure, and efficient operation of power systems. Since electric load variation results from diverse factors, accurate and stable load forecasting remains a challenging task. To increase forecasting accuracy and stability, this paper proposes a short-term load forecasting method based on the cross multi-model and a second decision mechanism. First, we combine horizontal and longitudinal training set selection methods to construct the cross training sets, which capture both the horizontal and longitudinal characteristics of the load variation. Second, to improve the generalization ability and extend the application scope, we construct forecasting multi-models by training multiple forecasting algorithms on the cross training sets. Finally, to aggregate the outputs of the forecasting multi-models, we propose a second decision mechanism based on a decision multi-model and an adaptive weight allocation strategy, which overcomes the limited learning ability of single decision models and further improves the forecasting accuracy. Case studies based on electrical load data from the state of Maine, the region of New England, Singapore, and New South Wales of Australia show that both the accuracy and the stability of the proposed method are superior to those of the compared models. INDEX TERMS Short-term load forecasting, multi-model, cross training set, second decision mechanism, model aggregation
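The aggregation step described in this abstract can be illustrated with a minimal sketch. The abstract does not state the weight allocation rule, so the inverse-error weighting below is an assumption chosen for illustration, not the paper's strategy; the function names and the toy numbers are hypothetical as well.

```python
# Hypothetical sketch: inverse-error weighting is an ASSUMPTION, not the paper's rule.

def inverse_error_weights(validation_errors, eps=1e-9):
    """Turn per-model validation errors (e.g. MAPE) into weights that sum to 1.

    Models with a smaller validation error receive a larger weight.
    """
    inverse = [1.0 / (err + eps) for err in validation_errors]
    total = sum(inverse)
    return [value / total for value in inverse]


def aggregate_forecasts(forecasts, weights):
    """Weighted average of several models' forecasts, one value per time step."""
    horizon = len(forecasts[0])
    return [
        sum(w * model[t] for w, model in zip(weights, forecasts))
        for t in range(horizon)
    ]


# Toy numbers, made up for illustration only.
errors = [2.0, 4.0, 4.0]      # validation MAPE (%) of three forecasting models
forecasts = [
    [100.0, 110.0],           # model A, two time steps
    [104.0, 114.0],           # model B
    [96.0, 106.0],            # model C
]
weights = inverse_error_weights(errors)
combined = aggregate_forecasts(forecasts, weights)
print(weights, combined)
```

Under this assumed rule the model with half the validation error gets twice the weight of each of the others. A learned decision model, as the abstract describes, would replace the fixed rule with weights fitted to data.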
Background: Artificial intelligence-based disease prediction models have a greater potential to screen COVID-19 patients than conventional methods. However, their application has been restricted because of their underlying black-box nature. Objective: To address this issue, an explainable artificial intelligence (XAI) approach was developed to screen patients for COVID-19. Methods: A retrospective study consisting of 1,737 participants (759 COVID-19 patients and 978 controls) admitted to San Raphael Hospital (OSR) from February to May 2020 was used to construct a diagnosis model. Finally, 32 key blood test indices from 1,374 participants were used for screening patients for COVID-19. Four ensemble learning algorithms were used: random forest (RF), adaptive boosting (AdaBoost), gradient boosting decision tree (GBDT), and extreme gradient boosting (XGBoost). Feature importance from the perspective of the clinical domain and visualized interpretations were illustrated by using local interpretable model-agnostic explanations (LIME) plots. Results: The GBDT model [area under the curve (AUC): 86.4%; 95% confidence interval (CI) 0.821–0.907] outperformed the RF model (AUC: 85.7%; 95% CI 0.813–0.902), AdaBoost model (AUC: 85.4%; 95% CI 0.810–0.899), and XGBoost model (AUC: 84.9%; 95% CI 0.803–0.894) in distinguishing patients with COVID-19 from those without. The cumulative feature importance of lactate dehydrogenase, white blood cells, and eosinophil counts was 0.145, 0.130, and 0.128, respectively. Conclusions: Ensemble machine learning (ML) approaches, mainly GBDT and LIME plots, are efficient for screening patients with COVID-19 and might serve as a potential tool in the auxiliary diagnosis of COVID-19. Patients with higher WBC count, higher LDH level, or higher EOT count were more likely to have COVID-19.
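The results in this abstract are reported as an AUC with a 95% confidence interval. As a minimal sketch of how such a pair of numbers can be obtained, the code below computes the AUC as the fraction of correctly ranked positive/negative pairs and estimates the interval with a percentile bootstrap. The abstract does not say how its intervals were derived, so the bootstrap is an assumption, and the labels and scores are made-up toy values.

```python
import random


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, for illustration)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes to define an AUC
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data, made up for illustration: 1 = COVID-19 patient, 0 = control.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.45, 0.70, 0.90, 0.60]
point_estimate = auc(labels, scores)
ci_low, ci_high = bootstrap_ci(labels, scores)
print(point_estimate, ci_low, ci_high)
```

With only eight samples the interval is very wide; the narrow intervals in the abstract reflect a test set of several hundred participants.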
The collection and storage of large-scale load data in a smart grid provide new approaches for the efficient, economical, and safe operation of power systems. Deep Learning (DL) has become increasingly popular for large-scale load data analytics in recent years because of its ability to extract latent features and discover complex relationships. This paper first overviews eight typical open load datasets of the grid and smart meters collected worldwide, the challenges faced by conventional machine learning, and the DL techniques applied to these challenges. A comprehensive review of the applications of DL techniques is then conducted from the perspectives of analysis, forecasting, and management, with observations presented for each application. Critical points of DL models for improving performance are further discussed. In conclusion, several pressing problems of DL in load data analytics that need special attention are identified, such as the gap between actual and expected accuracy, the generalization of hyperparameter settings, and the interpretation mechanism of DL output.
From the perspective of data science, we propose a cancer diagnosis method combining miRNA-lncRNA interaction pairs and class weight competition. First, miRNA-lncRNA interaction data is introduced into joint expression profiles, and the complex mechanism of cancer development is demonstrated in depth through the reappearance of key association information. This is an information ensemble of three carcinogenic mechanisms at the dataset construction level: classical genetics, epigenetics, and the complex interaction effect between miRNAs and lncRNAs. Then, we put forward a hybrid feature selection algorithm. By preserving the interaction relationship between miRNAs and lncRNAs, it quickly and steadily removes irrelevant and redundant features and addresses the curse of dimensionality in cancer expression profiles. This is an information ensemble of multiple feature selection algorithms and the significant association relationships found between multi-dimensional features at the feature selection level. Diversity sampling and multi-algorithm learners are used to construct multiple heterogeneous classification models, which overcomes the small number of normal samples and the local optima of a single algorithm and a single mode. This is an information ensemble of multiple classification model structures and multiple classification model state parameters at the classification modeling level. At the decision level, the proposed class weight, which does not depend on the sample size, is constructed to address the class imbalance of cancer samples. The ensemble of multi-category multi-state information at four levels (dataset construction, feature selection, classification modeling, and decision) constitutes the framework of the proposed method. We classify BRCA, LUAD and LUSC in TCGA. Compared with the state-of-the-art classification methods, the proposed method improves classification accuracy by 9.25%∼21.25%, sensitivity by 6.45%∼66.45%, and specificity by 10.11%. In addition, we find that lincRNA instead of miRNA always appears in each group of feature genes, which provides a new clue for locus target selection in cancer treatment. INDEX TERMS Cancer diagnosis, joint expression profiles, miRNA-lncRNA, feature selection embedded interaction pairs, class weight competition, locus target discovery.
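This abstract reports accuracy, sensitivity, and specificity separately because tumor samples far outnumber normal samples. The sketch below uses the standard definitions of those three metrics on made-up labels to show why accuracy alone can hide weak performance on the minority class. It does not reproduce the paper's class weight, which the abstract does not define; the function name and data are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy, imbalanced data made up for illustration: 1 = tumor, 0 = normal.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Here accuracy is 0.7 while specificity is only 0.5, because the two normal samples contribute little to the overall count. A decision rule that weights classes independently of their sample size targets exactly this imbalance.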
Building an accurate forecasting model for short-term load is still challenging due to the diverse causes of load change and the lack of information on many of these causes. In this paper, the error trend is used to reveal the trend effect caused by unknown load-affecting factors, and adaptive second learning of error trend (A-SLET) is proposed to self-adapt to the trend effect. Furthermore, the training set is classified based on the balance point temperature, and an adaptive forecaster for hot days and an adaptive forecaster for cold days are then trained and tested in parallel on the corresponding data. Combining A-SLET with parallel forecasting and training set classification, an Adaptive and Parallel forecasting strategy based on Second Learning of Error Trend (AP-SLET) is proposed. The work studied two distinct load patterns, one in the USA and the other in Australia. Considering the yearly forecasting horizon, the MAPE of the adaptive and parallel forecasting strategy is 1.87%-4.04% for ME-Maine of New England and 2.81%-4.41% for New South Wales. Compared to the state-of-the-art forecasting methods, the MAPE of the adaptive and parallel forecasting strategy is reduced by 17.03%-33.33%, and RMSE and MAE are reduced by 34.05% and 35.38%, respectively. The experimental results demonstrate that the proposed strategy can transform unknown and unavailable load-affecting factors into known forecasting features and then adapt to them to improve forecasting performance. The proposed strategy is also forecaster independent and equally applicable to almost all load scenarios regardless of geographical and seasonal differences. INDEX TERMS Adaptive and parallel forecasting, short-term load forecasting, smart grid, second learning of error trend, training set classification.
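The idea of learning from the error trend can be illustrated with a minimal sketch. The abstract does not describe the second learner, so the moving average of recent errors used below is a stand-in assumption, not the A-SLET algorithm; the function names, window size, and load values are hypothetical. MAPE follows its standard definition.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


def correct_with_error_trend(actual, base_forecast, window=3):
    """Add the mean of the most recent known errors to each base forecast.

    ASSUMPTION: a moving average stands in for the paper's second learner.
    Only errors from steps before t are used, so no future data leaks in.
    """
    corrected = []
    for t in range(len(base_forecast)):
        recent = [actual[i] - base_forecast[i] for i in range(max(0, t - window), t)]
        trend = sum(recent) / len(recent) if recent else 0.0
        corrected.append(base_forecast[t] + trend)
    return corrected


# Toy load series (made up): the base forecaster is biased low by 5 units.
actual = [100.0, 102.0, 104.0, 106.0, 108.0, 110.0]
base = [95.0, 97.0, 99.0, 101.0, 103.0, 105.0]
corrected = correct_with_error_trend(actual, base)
base_mape = mape(actual, base)
corrected_mape = mape(actual, corrected)
print(base_mape, corrected_mape)
```

In this toy case the persistent bias, caused by some factor the base forecaster cannot see, shows up as a stable error trend, so the corrected forecast matches the actual load from the second step onward.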