SegBlocks: Block-Based Dynamic Resolution Networks for Real-Time Segmentation

Verelst, Thomas; Tuytelaars, Tinne

doi:10.1109/tpami.2022.3162528

Cited by 9 publications

(1 citation statement)

References 65 publications

“…As a result, structured pruning frameworks are the preferred option when aiming to accelerate inference on general purpose hardware. A body of work across structured and unstructured pruning methods, attempts to induce structure in otherwise randomly sparse networks S. Gray & Kingma (2017); Ren et al (2018); Wen et al (2020); Verelst & Tuytelaars (2020). This is often referred to as block sparsity and consists in subdividing the matrix representations of inputs or weights into tiles (e.g.…”

Section: Related Work (mentioning)

confidence: 99%

ZeroFL: Efficient On-Device Training for Federated Learning with Local Sparsity

Xinchi, Fernández-Marqués, Gusmao et al. 2022

Preprint

When the available hardware cannot meet the memory and compute requirements to efficiently train high-performing machine learning models, a compromise in either the training quality or the model complexity is needed. In Federated Learning (FL), nodes are orders of magnitude more constrained than traditional server-grade hardware and are often battery powered, severely limiting the sophistication of models that can be trained under this paradigm. While most research has focused on designing better aggregation strategies to improve convergence rates and on alleviating the communication costs of FL, fewer efforts have been devoted to accelerating on-device training. This stage, which repeats hundreds of times (i.e. every round) and can involve thousands of devices, accounts for the majority of the time required to train federated models and for the totality of the energy consumption at the client side. In this work, we present the first study on the unique aspects that arise when introducing sparsity at training time in FL workloads. We then propose ZeroFL, a framework that relies on highly sparse operations to accelerate on-device training. Models trained with ZeroFL and 95% sparsity achieve up to 2.3% higher accuracy compared to competitive baselines obtained from adapting a state-of-the-art sparse training framework to the FL setting.
