Luanzheng Guo scite author profile

Luanzheng Guo

5 Publications

43 Citation Statements Received

149 Citation Statements Given

How they've been cited: 151

How they cite others: 149

Affiliations

Pacific Northwest National Laboratory, University of California, Merced, Nanchang Hangkong University

Publications

Reinit++: Evaluating the Performance of Global-Restart Recovery Methods for MPI Fault Tolerance

Georgakoudis, Guo, Laguna (2020)

Scaling supercomputers comes with an increase in failure rates due to the increasing number of hardware components. In standard practice, applications are made resilient through checkpointing data and restarting execution after a failure occurs to resume from the latest checkpoint. However, redeploying an application incurs overhead by tearing down and reinstating execution, and possibly limiting checkpointing retrieval from slow permanent storage. In this paper we present Reinit++, a new design and implementation of the Reinit approach for global-restart recovery, which avoids application re-deployment. We extensively evaluate Reinit++ contrasted with the leading MPI fault-tolerance approach of ULFM, implementing global-restart recovery, and the typical practice of restarting an application to derive new insight on performance. Experimentation with three different HPC proxy applications made resilient to withstand process and node failures shows that Reinit++ recovers much faster than restarting, up to 6×, or ULFM, up to 3×, and that it scales excellently as the number of MPI processes grows.

FlipTracker: Understanding Natural Error Resilience in HPC Applications

Guo, Liu, Laguna, et al. (2018)

As high-performance computing systems scale in size and computational power, the danger of silent errors, i.e., errors that can bypass hardware detection mechanisms and impact application state, grows dramatically. Consequently, applications running on HPC systems need to exhibit resilience to such errors. Previous work has found that, for certain codes, this resilience can come for free, i.e., some applications are naturally resilient, but few studies have shown the code patterns (combinations or sequences of computations) that make an application naturally resilient. In this paper, we present FlipTracker, a framework designed to extract these patterns using fine-grained tracking of error propagation and resilience properties, and we use it to present a set of computation patterns that are responsible for making representative HPC applications naturally resilient to errors. This not only enables a deeper understanding of resilience properties of these codes, but also can guide future application designs towards patterns with natural resilience.

A High Performance Sparse Tensor Algebra Compiler in MLIR

Tian, Guo, et al. (2021)

Chessboard corner detection under image physical coordinate

Chu, Guo, Wang (2013)

Optics & Laser Technology

Indoor frame recovering via line segments refinement and voting

Chu, Guo, Wang, et al. (2013)

Frame structure estimation from line segments is an important yet challenging problem in understanding indoor scenes. In practice, line segment extraction can be affected by occlusions, illumination variations, and weak object boundaries. To address this problem, an approach for frame structure recovery based on line segment refinement and voting is proposed. We refine line segments through revising, connecting, and adding operations. We then propose an iterative voting mechanism for selecting refined line segments, where a cross ratio constraint is enforced to build crab-like models. Our algorithm outperforms state-of-the-art approaches, especially when considering complex indoor scenes.
