Shizhao Chen scite author profile

Sparse matrix-vector multiplication (SpMV) is one of the most common operations in scientific and high-performance applications, and is often the application's performance bottleneck. While the sparse matrix representation has a significant impact on the resulting application performance, choosing the right representation typically relies on expert knowledge and trial and error. This paper provides the first comprehensive study on the impact of sparse matrix representations on two emerging many-core architectures: Intel's Knights Landing (KNL) Xeon Phi and the ARM-based FT-2000Plus (FTP). Our large-scale experiments involved over 9,500 distinct profiling runs performed on 956 sparse datasets and five mainstream SpMV representations. We show that the best sparse matrix representation depends on the underlying architecture and the program input. To help developers choose the optimal matrix representation, we employ machine learning to develop a predictive model. Our model is first trained offline using a set of training examples. The learned model can then be used to predict the best matrix representation for any unseen input on a given architecture. We show that our model delivers on average 95% and 91% of the best available performance on KNL and FTP respectively, and it achieves this with no runtime profiling overhead.
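As a rough illustration of what "sparse matrix representation" means here, the sketch below stores one small matrix in two common formats, coordinate (COO) and compressed sparse row (CSR), and multiplies it by a vector in each. The abstract does not name its five representations, so the choice of COO and CSR is an assumption, and the function names and the example matrix are hypothetical.

```python
# Minimal sketch: the same sparse matrix in two storage formats (COO and CSR).
# The formats, function names, and data are illustrative assumptions,
# not taken from the paper.

def to_coo(dense):
    """Return (rows, cols, vals) listing every nonzero entry."""
    rows, cols, vals = [], [], []
    for i, row in enumerate(dense):
        for j, v in enumerate(row):
            if v != 0:
                rows.append(i)
                cols.append(j)
                vals.append(v)
    return rows, cols, vals

def to_csr(dense):
    """Return (row_ptr, col_idx, vals); row i owns vals[row_ptr[i]:row_ptr[i+1]]."""
    row_ptr, col_idx, vals = [0], [], []
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                col_idx.append(j)
                vals.append(v)
        row_ptr.append(len(vals))
    return row_ptr, col_idx, vals

def spmv_coo(n_rows, rows, cols, vals, x):
    """y = A @ x, scattering each nonzero into its output row."""
    y = [0.0] * n_rows
    for i, j, v in zip(rows, cols, vals):
        y[i] += v * x[j]
    return y

def spmv_csr(row_ptr, col_idx, vals, x):
    """y = A @ x, accumulating one output row at a time."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]
        y.append(acc)
    return y

A = [[4.0, 0.0, 0.0],
     [0.0, 0.0, 2.0],
     [1.0, 0.0, 3.0]]
x = [1.0, 2.0, 3.0]

y_coo = spmv_coo(len(A), *to_coo(A), x)
y_csr = spmv_csr(*to_csr(A), x)
print(y_coo, y_csr)
```

Both routines compute the same product but traverse memory differently, which is the kind of difference that can make one format faster than another for a given matrix and machine.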


FlowGAN: A Conditional Generative Adversarial Network for Flow Prediction in Various Conditions

Chen

Gao

et al. 2020


Characterizing Scalability of Sparse Matrix–Vector Multiplications on Phytium FT-2000+

Chen

Fang

et al. 2019

Int J Parallel Prog


Understanding the scalability of parallel programs is crucial for software optimization and hardware architecture design. As HPC hardware moves towards many-core designs, it becomes increasingly difficult for a parallel program to make effective use of all available processor cores. This makes scalability analysis increasingly important. This paper presents a quantitative study characterizing the scalability of sparse matrix-vector multiplications (SpMV) on Phytium FT-2000+, an ARM-based HPC many-core architecture. We choose SpMV as it is a common operation in scientific and HPC applications. Because ARM-based many-core architectures are new, there is little work on understanding SpMV scalability on such hardware designs. To close the gap, we carry out a large-scale empirical evaluation involving over 1,000 representative SpMV datasets. We show that, while many computation-intensive SpMV applications contain extensive parallelism, achieving a linear speedup is nontrivial on Phytium FT-2000+. To better understand which software and hardware parameters are most important for determining the scalability of a given SpMV kernel, we develop an analytical performance model based on a regression tree. We show that our model is highly effective in characterizing SpMV scalability, offering useful insights to help application developers better optimize SpMV on an emerging HPC architecture.
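For readers unfamiliar with the term, "linear speedup" can be made concrete with the standard definitions of speedup (serial time divided by parallel time) and parallel efficiency (speedup divided by core count). The sketch below applies them to made-up timings; the numbers are hypothetical and are not measurements from the paper.

```python
# Minimal sketch of speedup and parallel efficiency.
# The timings below are invented for illustration only.

def speedup(t_serial, t_parallel):
    """Speedup S(p) = T(1) / T(p)."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, cores):
    """Efficiency E(p) = S(p) / p; 1.0 corresponds to linear speedup."""
    return speedup(t_serial, t_parallel) / cores

# Hypothetical run times in seconds, keyed by core count.
timings = {1: 8.0, 2: 4.0, 4: 2.5, 8: 2.0}

report = {
    p: (speedup(timings[1], t), efficiency(timings[1], t, p))
    for p, t in timings.items()
}

for p, (s, e) in report.items():
    print(f"cores={p} speedup={s:.2f} efficiency={e:.2f}")
```

In this invented example the kernel scales linearly up to two cores and then falls away from the ideal, which is the general pattern a scalability study tries to explain.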


Optimizing Sparse Matrix–Vector Multiplications on an ARMv8-based Many-Core Architecture

Chen

Fang

Chen

et al. 2019

Int J Parallel Prog


Sparse matrix-vector multiplications (SpMV) are common in scientific and HPC applications but are hard to optimize. While the ARMv8-based processor IP is emerging as an alternative to the traditional x64 HPC processor design, there has been little study of SpMV performance on such new many-cores. To design efficient HPC software and hardware, we need to understand how well SpMV performs. This work develops a quantitative approach to characterize SpMV performance on a recent ARMv8-based many-core architecture, Phytium FT-2000 Plus (FTP). We perform extensive experiments involving over 9,500 distinct profiling runs on 956 sparse datasets and five mainstream sparse matrix storage formats, and compare FTP against the Intel Knights Landing many-core. We experimentally show that picking the optimal sparse matrix storage format and parameters is non-trivial, as the correct decision requires expert knowledge of the input matrix and the hardware. We address the problem by proposing a machine-learning-based model that predicts the best storage format and parameters using input matrix features. The model automatically specializes to the many-core architectures we considered. The experimental results show that our approach achieves on average 93% of the best-available performance without incurring runtime profiling overhead.


FlowDNN: a physics-informed deep neural network for fast and accurate flow prediction

Chen

Gao

et al. 2022

Front Inform Technol Electron Eng


Flexible ranking extreme learning machine based on matrix-centering transformation

Chen

et al. 2018


Linear maps between operator algebras preserving certain spectral functions

Cao

Chen

2014

Banach J. Math. Anal.


Let H be an infinite-dimensional complex Hilbert space and let φ be a surjective linear map on B(H) with φ(I)−I ∈ K(H), where K(H) denotes the closed ideal of all compact operators on H. If φ preserves the set of upper semi-Weyl operators and the set of all normal eigenvalues in both directions, then φ is an automorphism of the algebra B(H). The relation between the linear maps preserving the set of upper semi-Weyl operators and the linear maps preserving the set of left invertible operators is also considered.


A Novel Un-Supervised GAN for Fundus Image Enhancement with Classification Prior Loss

2022


Fundus images captured for clinical diagnosis usually suffer from degradation due to variation in equipment, operators, or environment. These degraded fundus images need to be enhanced to achieve better diagnosis and improve the results of downstream tasks. As no paired low- and high-quality fundus images are available, existing methods mainly focus on supervised or semi-supervised learning for color fundus image enhancement (CFIE) tasks by utilizing synthetic image pairs. Consequently, domain gaps between real images and synthetic images arise. With existing unsupervised methods, the most important low-scale pathological features and structural information in degraded fundus images are prone to being erased after enhancement. To solve these problems, an unsupervised GAN is proposed for CFIE tasks, utilizing adversarial training to enhance low-quality fundus images. Synthetic image pairs are no longer required during training. A specially designed U-Net with skip connections in our enhancement network can effectively remove degradation factors while preserving pathological features and structural information. Global and local discriminators adopted in the GAN lead to better illumination uniformity in the enhanced fundus image. To further improve the visual quality of enhanced fundus images, a novel non-reference loss function based on a pretrained fundus image quality classification network was designed to guide the enhancement network to produce high-quality images. Experiments demonstrated that our method could effectively remove degradation factors in low-quality fundus images and produce competitive results compared with previous methods in both quantitative and qualitative metrics.



Shizhao Chen

Adaptive Optimization of Sparse Matrix-Vector Multiplication on Emerging Many-Core Architectures
