Federated learning (FL) is currently the most widely adopted framework for collaborative training of (deep) machine learning models under privacy constraints. Despite its popularity, it has been observed that FL yields suboptimal results if the local clients' data distributions diverge. To address this issue, we present clustered FL (CFL), a novel federated multitask learning (FMTL) framework, which exploits geometric properties of the FL loss surface to group the client population into clusters with jointly trainable data distributions. In contrast to existing FMTL approaches, CFL requires no modifications to the FL communication protocol, is applicable to general nonconvex objectives (in particular, deep neural networks), does not require the number of clusters to be known a priori, and comes with strong mathematical guarantees on the clustering quality. CFL is flexible enough to handle client populations that vary over time and can be implemented in a privacy-preserving way. As clustering is only performed after FL has converged to a stationary point, CFL can be viewed as a postprocessing method that will always achieve performance greater than or equal to that of conventional FL by allowing clients to arrive at more specialized models. We verify our theoretical analysis in experiments with deep convolutional and recurrent neural networks on commonly used FL data sets.
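To make the clustering step concrete, the sketch below illustrates one way such a similarity-based client partitioning can be implemented: pairwise cosine similarities between the clients' flattened weight updates are computed, and the population is split into two groups seeded by the least similar pair. This is a minimal illustrative Python variant; the function names and the greedy assignment rule are our own, and the paper's exact bipartitioning criterion and stopping rules differ.

```python
import numpy as np

def cosine_similarity_matrix(updates):
    """Pairwise cosine similarities between flattened client weight updates."""
    U = np.stack([u / (np.linalg.norm(u) + 1e-12) for u in updates])
    return U @ U.T

def bipartition(updates):
    """Split clients into two groups by update similarity (illustrative
    greedy variant, not the paper's exact optimal bipartitioning)."""
    S = cosine_similarity_matrix(updates)
    # seed the two clusters with the least similar pair of clients
    i, j = np.unravel_index(np.argmin(S), S.shape)
    c1, c2 = [int(i)], [int(j)]
    for k in range(len(updates)):
        if k in (i, j):
            continue
        # assign each remaining client to the cluster it is more similar to
        (c1 if S[k, i] >= S[k, j] else c2).append(k)
    return c1, c2

# toy usage: two groups of clients with opposing update directions
rng = np.random.default_rng(0)
base = rng.normal(size=100)
updates = [base + 0.1 * rng.normal(size=100) for _ in range(5)] + \
          [-base + 0.1 * rng.normal(size=100) for _ in range(5)]
print(bipartition(updates))   # should separate the two groups
```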
Currently, progressively larger deep neural networks are trained on ever-growing data corpora. As this trend is only going to intensify in the future, distributed training schemes are becoming increasingly relevant. A major issue in distributed training is the limited communication bandwidth between contributing nodes, or prohibitive communication cost in general. These challenges become even more pressing as the number of computation nodes increases. To counteract this development, we propose sparse binary compression (SBC), a compression framework that allows for a drastic reduction of communication cost for distributed training. SBC combines existing techniques of communication delay and gradient sparsification with a novel binarization method and optimal weight update encoding to push compression gains to new limits. By doing so, our method also allows us to smoothly trade off gradient sparsity and temporal sparsity to adapt to the requirements of the learning task. Our experiments show that SBC can reduce the upstream communication on a variety of convolutional and recurrent neural network architectures by more than four orders of magnitude without significantly harming the convergence speed in terms of forward-backward passes. For instance, we can train ResNet50 on ImageNet to the baseline accuracy in the same number of iterations using 3531× fewer bits, or train it to 1% lower accuracy using 37208× fewer bits. In the latter case, the total upstream communication required is cut from 125 terabytes to 3.35 gigabytes for every participating client.
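As an illustration of the compression operator itself, the following Python sketch shows an SBC-style step under our reading of the abstract: the top-k entries of a gradient are kept, split by sign, and only the sign group with the larger mean magnitude survives, binarized to that shared mean. Residual accumulation, communication delay, and the optimal position encoding are omitted here.

```python
import numpy as np

def sparse_binary_compress(grad, sparsity=0.001):
    """Keep the top-k entries of `grad` by magnitude, then binarize:
    only the sign group (positive or negative) with the larger mean
    magnitude survives, with all its entries set to that mean."""
    k = max(1, int(sparsity * grad.size))
    flat = grad.ravel()
    idx = np.argpartition(np.abs(flat), -k)[-k:]      # top-k by magnitude
    vals = flat[idx]
    pos, neg = vals[vals > 0], vals[vals < 0]
    mu_pos = pos.mean() if pos.size else 0.0
    mu_neg = -neg.mean() if neg.size else 0.0
    out = np.zeros_like(flat)
    if mu_pos >= mu_neg:
        out[idx[vals > 0]] = mu_pos     # message: one scalar + positions
    else:
        out[idx[vals < 0]] = -mu_neg    # message: one scalar + positions
    return out.reshape(grad.shape)
```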
Federated learning allows multiple parties to jointly train a deep learning model on their combined data, without any of the participants having to reveal their local data to a centralized server. This form of privacy-preserving collaborative learning, however, comes at the cost of a significant communication overhead during training. To address this problem, several compression methods have been proposed in the distributed training literature that can reduce the amount of required communication by up to three orders of magnitude. These existing methods, however, are of only limited utility in the federated learning setting, as they either compress only the upstream communication from the clients to the server (leaving the downstream communication uncompressed) or perform well only under idealized conditions, such as an i.i.d. distribution of the client data, which typically cannot be found in federated learning. In this work, we propose sparse ternary compression (STC), a new compression framework that is specifically designed to meet the requirements of the federated learning environment. STC extends the existing compression technique of top-k gradient sparsification with a novel mechanism to enable downstream compression, as well as ternarization and optimal Golomb encoding of the weight updates. Our experiments on four different learning tasks demonstrate that STC distinctively outperforms federated averaging in common federated learning scenarios where (a) clients hold non-i.i.d. data, (b) clients use small batch sizes during training, or (c) the number of clients is large and the participation rate in every communication round is low. We furthermore show that, even if the clients hold i.i.d. data and use medium-sized batches for training, STC is still Pareto-superior to federated averaging in the sense that it reaches fixed target accuracies on our benchmarks within both fewer training iterations and a smaller communication budget. These results advocate for a paradigm shift in federated optimization toward high-frequency low-bitwidth communication, in particular in bandwidth-constrained learning environments.
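A minimal Python sketch of the core STC operator, as described in the abstract: the top-k entries of a weight update are kept and ternarized to {−μ, 0, +μ}, where μ (our assumed choice: the mean magnitude of the kept entries) is a single shared scalar. The Golomb encoding of the nonzero positions and the server-side application of the same operator for downstream compression are omitted.

```python
import numpy as np

def sparse_ternary_compress(delta, sparsity=0.01):
    """Keep the top-k entries of the weight update `delta` by magnitude
    and ternarize them to {-mu, 0, +mu}, with mu a shared magnitude."""
    k = max(1, int(sparsity * delta.size))
    flat = delta.ravel()
    idx = np.argpartition(np.abs(flat), -k)[-k:]   # top-k by magnitude
    mu = np.abs(flat[idx]).mean()                  # shared magnitude
    out = np.zeros_like(flat)
    out[idx] = mu * np.sign(flat[idx])             # ternary values
    return out.reshape(delta.shape)
```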
Diets of West African (WA) smallholder farmers are built on pearl millet [Pennisetum glaucum (L.) R. Br.]. Sustainable pearl millet hybrid breeding is challenging in WA, mostly due to extensive genetic diversity combined with a high degree of admixture. In the absence of natural heterotic groups, understanding combining ability patterns can enable the systematic development of heterotic groups and make sustainable hybrid breeding feasible. The objectives of this study were to evaluate heterosis and combining ability patterns and their relationship with genetic distance among WA pearl millets based on population hybrids, and to derive conclusions for future breeding programs. To this end, 17 open-pollinated varieties (OPVs) were crossed in a diallel mating design and tested together with their offspring in nine environments over 2 yr in Niger and Senegal. Genetic distances between the OPVs were evaluated with 20 microsatellite markers. Average panmictic better-parent heterosis (PBPH) was 18% (range 1–47%) for panicle yield. A principal coordinate analysis based on the genotyping results separated the parental OPVs clearly by geographic origin. Although there was no relationship between genetic distance among OPVs and PBPH, we confirmed good combining ability among selected OPVs from Niger vs. Senegal. The identified cultivars with high combining ability (the Nigerien CIVT, H80-10Gr, and Taram and the Senegalese Thialack 2 and Souna 3) are recommended for founding divergent heterotic pools targeting long-panicle pearl millet hybrids. Our study shows the benefits of population hybrids and represents an important step toward identifying combining ability patterns and initial heterotic groups for WA pearl millet hybrid breeding.
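For reference, better-parent heterosis of a population (panmictic) hybrid is conventionally expressed as the relative advantage of the hybrid over its better parent OPV; under that standard definition (our reading, not quoted from the paper), the reported figure corresponds to

```latex
\mathrm{PBPH} = \frac{\bar{y}_{\mathrm{hybrid}} - \bar{y}_{\mathrm{BP}}}{\bar{y}_{\mathrm{BP}}} \times 100\%
```

where \(\bar{y}_{\mathrm{hybrid}}\) and \(\bar{y}_{\mathrm{BP}}\) are the mean panicle yields of the population hybrid and its better parent OPV, respectively.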
Digital contact tracing approaches based on Bluetooth Low Energy (BLE) have the potential to efficiently contain and delay outbreaks of infectious diseases such as the ongoing SARS-CoV-2 pandemic. In this work, we propose a machine learning-based approach to reliably detect subjects who have spent enough time in close proximity to be at risk of being infected. Our study is an important proof of concept that will aid the battery of epidemiological policies aiming to slow down the rapid spread of COVID-19.
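The kind of classifier the abstract describes can be sketched in a few lines: summary features of each contact's BLE received-signal-strength (RSSI) time series are fed to a standard classifier that predicts whether the contact constituted a risky exposure. Everything below (the feature set, the model choice, and the synthetic stand-in data) is our own illustrative assumption, not the paper's pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def rssi_features(trace):
    """Summary statistics of one BLE RSSI time series for a contact
    event (an illustrative feature set, not the paper's exact one)."""
    r = np.asarray(trace, dtype=float)
    return [r.mean(), r.std(), r.min(), r.max(), r.size]

# Synthetic stand-in data: close contacts yield stronger (less negative)
# RSSI readings than distant ones. Real labels would come from
# ground-truth distance and duration measurements.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=300)                    # 1 = risky exposure
traces = [rng.normal(-80 + 15 * lab, 6, size=rng.integers(10, 60))
          for lab in labels]

X = np.array([rssi_features(t) for t in traces])
clf = RandomForestClassifier(n_estimators=200).fit(X, labels)
print("train accuracy:", clf.score(X, labels))
```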
To promote the utilization of West and Central African (WCA) genetic resources of pearl millet [Pennisetum glaucum (L.) R. Br.], this study aimed at the agro-morphological characterization of selected accessions from the pearl millet reference collection established by the Generation Challenge Program and the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT). A total of 81 accessions were included, comprising 78 landraces originating from 13, predominantly WCA, countries and three improved cultivars. All 81 accessions were evaluated together with 18 checks for resistance to the parasitic weed Striga hermonthica (Del.) Benth. in an artificially infested field at one location in Niger. Depending on available seed quantity, 74 accessions were characterized together with seven checks in the 2009 rainy season in field trials under low-input and fertilized conditions in Nigeria, Niger and Mali. Wide ranges were observed for various traits. Several accessions were identified as sources for specific traits of interest, i.e., long panicles, high grain density, earliness, Striga resistance and stable yields across environments. The observed yield inferiority of all genebank accessions compared with the checks may indicate lost adaptation or inbreeding depression due to an insufficient effective population size during multiplication. A principal component analysis revealed immense diversity but also strong admixture among the tested accessions, i.e., there were no clearly distinct groups. Seed of all genotypes is available from ICRISAT. The online availability of the characterization data is expected to facilitate efficient use of these pearl millet accessions by breeding programmes in WCA and worldwide.
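For readers wanting to reproduce the kind of diversity analysis mentioned above, the snippet below sketches a principal component analysis of an accessions-by-traits matrix in Python. The data here are purely synthetic stand-ins; the study's own characterization measurements would be the real input.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Stand-in data matrix: one row per accession, one column per measured
# trait (synthetic; replace with the published characterization data).
rng = np.random.default_rng(1)
traits = rng.normal(size=(81, 12))

pca = PCA(n_components=2)
coords = pca.fit_transform(StandardScaler().fit_transform(traits))
print("variance explained:", pca.explained_variance_ratio_)
# Plotting `coords` would reveal whether accessions form distinct groups.
```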