In cloud computing systems, slow processing nodes, often referred to as "stragglers", can significantly extend the computation time. Recent results have shown that error correction coding can be used to reduce the effect of stragglers. In this work we introduce a scheme that, in addition to using error correction to distribute mixed jobs across nodes, is also able to exploit the work completed by all nodes, including stragglers. We first consider vector-matrix multiplication and apply maximum distance separable (MDS) codes to small blocks of sub-matrices. The worker nodes process blocks sequentially, block by block, transmitting partial per-block results to the master as they are completed. Sub-blocking allows a more continuous completion process, which allows us to exploit the work of a much broader spectrum of processors and reduces computation time. We then apply this technique to matrix-matrix multiplication using product codes. In this case, we show that the order of computing sub-tasks is a new degree of design freedom that can be exploited to reduce computation time further. We propose a novel approach to analyze the finishing time, which is different from typical order statistics. Simulation results show that the expected computation time decreases by a factor of at least two compared to previous methods.
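The sub-blocking idea in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a Vandermonde (n, k) MDS code over the rationals with evaluation points 1..n, computes a matrix-vector product rather than the vector-matrix product the abstract names, and simulates workers in a single process. The names `encode`, `solve`, `coded_matvec`, and `progress` are hypothetical. Each layer is one sub-block level, and a layer is decoded as soon as any k workers have finished it, so a worker that completed only the first layer still contributes.

```python
from fractions import Fraction


def matvec(M, x):
    """Plain matrix-vector product on lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def encode(blocks, n):
    """Vandermonde MDS encoding: coded block i = sum_j (i+1)^j * blocks[j]."""
    k = len(blocks)
    rows, cols = len(blocks[0]), len(blocks[0][0])
    coded = []
    for i in range(n):
        alpha = Fraction(i + 1)  # assumed evaluation point for worker i
        coded.append([[sum(alpha ** j * blocks[j][r][c] for j in range(k))
                       for c in range(cols)] for r in range(rows)])
    return coded


def solve(G, Y):
    """Gauss-Jordan elimination in exact arithmetic: returns Z with G Z = Y."""
    k = len(G)
    aug = [[Fraction(v) for v in G[i]] + [Fraction(v) for v in Y[i]]
           for i in range(k)]
    for col in range(k):
        piv = next(r for r in range(col, k) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def coded_matvec(A, x, n, k, layers, progress):
    """Recover A x layer by layer; progress[i] = layers worker i has finished."""
    rows_per_block = len(A) // (layers * k)
    result = []
    for layer in range(layers):
        base = layer * k * rows_per_block
        blocks = [A[base + j * rows_per_block: base + (j + 1) * rows_per_block]
                  for j in range(k)]
        coded = encode(blocks, n)
        # Any k workers that reached this layer suffice; take the first k.
        done = [i for i in range(n) if progress[i] > layer][:k]
        if len(done) < k:
            raise ValueError(f"layer {layer}: fewer than {k} results available")
        G = [[Fraction(i + 1) ** j for j in range(k)] for i in done]
        Y = [matvec(coded[i], x) for i in done]  # what the workers would send
        for z in solve(G, Y):
            result.extend(z)
    return result


A = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, 1]
# Worker 1 is a straggler that finished only the first layer.
y = coded_matvec(A, x, n=3, k=2, layers=2, progress=[2, 1, 2])
assert y == matvec(A, x)
```

In this toy run the first layer is decoded from workers 0 and 1 and the second from workers 0 and 2, so the straggler's partial work is used rather than discarded. Exact rational arithmetic is chosen here only to keep the decoding check exact; a practical system would more likely use floating point or a finite field.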
This paper considers the problem of casual heart rate tracking during intensive physical exercise using simultaneous 2-channel photoplethysmographic (PPG) and 3-dimensional (3D) acceleration signals recorded from the wrist. This is a challenging problem because the PPG signals recorded from the wrist during exercise are contaminated by strong Motion Artifacts (MAs). In this work, a novel algorithm is proposed which consists of two main steps: MA Cancellation and Spectral Analysis. The MA cancellation step cleanses the MA-contaminated PPG signals using the acceleration data, and the spectral analysis step estimates a higher resolution spectrum of the signal and selects the spectral peaks corresponding to heart rate (HR). Experimental results on datasets recorded from 12 subjects during fast running at a peak speed of 15 km/hour showed that the proposed algorithm achieves an average absolute error of 1.25 beats per minute (BPM). These experimental results also confirm that the proposed algorithm maintains high estimation accuracy even in strong MA conditions.
In distributed computing systems, slow working nodes, known as stragglers, can greatly extend finishing times. Coded computing is a technique that enables straggler-resistant computation. Most coded computing techniques presented to date provide robustness by ensuring that the time to finish depends only on a set of the fastest nodes. However, while stragglers do compute less work than non-stragglers, in real-world commercial cloud computing systems (e.g., Amazon's Elastic Compute Cloud (EC2)) the distinction is often a soft one. In this paper, we develop hierarchical coded computing that exploits the work completed by all nodes, both fast and slow, automatically integrating the potential contribution of each. We first present a conceptual framework to represent the division of work amongst nodes in coded matrix multiplication as a cuboid partitioning problem. This framework allows us to unify existing methods and motivates new techniques. We then develop three methods of hierarchical coded computing that we term bit-interleaved coded computation (BICC), multilevel coded computation (MLCC), and hybrid hierarchical coded computation (HHCC). In this paradigm, each worker is tasked with completing a sequence (a hierarchy) of ordered subtasks. The sequence of subtasks, and the complexity of each, is designed so that partial work completed by stragglers can be used, rather than ignored. We note that our methods can be used in conjunction with any coded computing method. We illustrate this by showing how we can use our methods to accelerate all previously developed coded computing techniques by enabling them to exploit stragglers. Under a widely studied statistical model of completion time, our approach realizes a 66% improvement in expected finishing time. On Amazon EC2, the gain was 28% when stragglers are simulated.
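The statement that the finishing time of most coded schemes "depends only on a set of the fastest nodes" can be made concrete with a short order-statistics sketch. It assumes i.i.d. shifted-exponential worker times, which is an assumption of this example and may not be the model the abstract refers to. The function name `expected_kth_fastest` and its parameters are hypothetical. It evaluates the standard closed form for the expected k-th smallest of n such times, which is the expected finishing time of a scheme that waits for the k fastest of n workers and ignores the rest.

```python
from fractions import Fraction


def expected_kth_fastest(n, k, rate=1, shift=0):
    """Expected k-th smallest of n i.i.d. (shift + Exponential(rate)) times.

    Uses E[X_(k)] = shift + (1/rate) * sum_{i=n-k+1}^{n} 1/i.
    """
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    tail = sum(Fraction(1, i) for i in range(n - k + 1, n + 1))
    return Fraction(shift) + tail / Fraction(rate)


# Waiting for 2 of 3 workers versus waiting for all 3 (uncoded).
coded = expected_kth_fastest(3, 2)    # 1/3 + 1/2
uncoded = expected_kth_fastest(3, 3)  # 1/3 + 1/2 + 1
assert coded == Fraction(5, 6)
assert uncoded == Fraction(11, 6)
```

This baseline counts only the k fastest workers. Any work the remaining n - k workers completed before the k-th finished is discarded, which is the gap that hierarchical schemes of the kind described above aim to close.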
Coded distributed computing was recently introduced to mitigate the effect of stragglers on distributed computing. This paper combines ideas of approximate computing with coded computing to further accelerate computation. We propose successive approximation coding (SAC) techniques that realize a tradeoff between accuracy and speed, allowing the distributed computing system to produce approximations that increase in accuracy over time. If a sufficient number of compute nodes finish their tasks, SAC exactly recovers the desired computation. We theoretically provide design guidelines for our SAC techniques, and numerically show that SAC achieves a better accuracy-speed tradeoff in comparison with previous methods.