Nowadays, smart meters are deployed in millions of residential households to gain insights from fine-grained electricity consumption data. The information extracted from smart meter data enables utilities to identify the socio-demographic characteristics of electricity consumers and then offer them diversified services. Traditionally, this task is implemented in a centralized manner, under the assumption that utilities have access to all the smart meter data. However, smart meter data are measured and owned by different retailers in the retail market, who may not be willing to share their data. To this end, a distributed electricity consumer characteristics identification method is proposed based on federated learning, which can preserve the privacy of retailers. Specifically, privacy-preserving principal component analysis (PCA) is exploited to extract features from smart meter data. On this basis, an artificial neural network is trained in a federated manner with three weighted averaging strategies to map smart meter data to the socio-demographic characteristics of consumers. Case studies on the Irish Commission for Energy Regulation (CER) dataset verify that the proposed federated method achieves performance comparable to that of the centralized model on both balanced and unbalanced datasets.
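The weighted averaging step in federated training can be illustrated with a minimal sketch. The abstract does not specify the three weighting strategies, so the example below assumes one common choice, weighting each retailer's parameters by its number of training samples; the function name `federated_average` and the toy parameter vectors are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def federated_average(client_params: Sequence[Sequence[float]],
                      client_weights: Sequence[float]) -> List[float]:
    """Combine per-client parameter vectors into one global vector.

    Each client's contribution is scaled by its weight divided by the
    sum of all weights (assumed here to be local sample counts).
    """
    if len(client_params) != len(client_weights):
        raise ValueError("one weight is required per client")
    total = float(sum(client_weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    dim = len(client_params[0])
    averaged = [0.0] * dim
    for params, weight in zip(client_params, client_weights):
        if len(params) != dim:
            raise ValueError("all clients must share the same parameter shape")
        for i, value in enumerate(params):
            averaged[i] += (weight / total) * value
    return averaged


# Hypothetical example: two retailers with flattened model parameters.
retailer_params = [[1.0, 2.0], [3.0, 4.0]]
retailer_sample_counts = [100, 300]  # assumed weighting: local sample counts
global_params = federated_average(retailer_params, retailer_sample_counts)
print(global_params)
```

Only the parameter vectors and the weights leave each retailer in this scheme; the raw smart meter readings stay local, which is the property the federated approach relies on.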
This study analyzed gene expression messenger RNA data from cases with major depressive disorder (MDD) and controls, using supervised machine learning (ML). We built on the methodology of prior studies to obtain more generalizable and reproducible results. First, we obtained a classifier trained on gene expression data from the dorsolateral prefrontal cortex (DLPFC) of post-mortem MDD cases (n = 126) and controls (n = 103). An average area under the receiver operating characteristic curve (AUC) of 0.72 was obtained from 10-fold cross-validation, compared to an average AUC of 0.55 for a baseline classifier (p = .0048). The classifier achieved an AUC of 0.76 on a previously unused testing set. We also performed external validation using DLPFC gene expression values from an independent cohort of matched MDD cases (n = 29) and controls (n = 29), obtained from Affymetrix microarray (vs. Illumina microarray for the original cohort) (AUC: 0.62). We highlighted gene sets differentially expressed in MDD that were enriched for genes identified by the ML algorithm. Next, we assessed the ML classification performance in blood-based microarray gene expression data from MDD cases (n = 1,581) and controls (n = 369). We observed a mean AUC of 0.64 on 10-fold cross-validation, which was significantly above baseline (p = .0020). Similar performance was observed on the testing set (AUC: 0.61). Finally, we analyzed the classification performance in covariate subgroups. We identified an interaction between smoking and recall in MDD case prediction (58% accurate predictions in cases who are smokers vs. 43% accurate predictions in cases who are non-smokers). Overall, our results suggest that ML in combination with gene expression data and covariates could further our understanding of the pathophysiology of MDD.
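For readers less familiar with the AUC metric reported above, the following is a minimal sketch of how it can be computed from classifier scores, using its interpretation as the probability that a randomly chosen case is scored higher than a randomly chosen control. The function name `roc_auc` and the toy labels and scores are hypothetical; they are not the study's data or code.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (case, control) pairs ranked correctly.

    labels: 1 for a case, 0 for a control.
    scores: classifier outputs, higher meaning "more likely a case".
    Tied pairs count as half a correct ranking.
    """
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    control_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("both classes must be present")

    correct = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                correct += 1.0
            elif case == control:
                correct += 0.5
    return correct / (len(case_scores) * len(control_scores))


# Hypothetical toy fold: two cases and two controls.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)
```

In a 10-fold cross-validation, a value like this would be computed once per held-out fold and the ten values averaged; an AUC of 0.5 corresponds to chance-level ranking.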