Yusuke Tsuzuku scite author profile

Yusuke Tsuzuku

4 Publications

69 Citation Statements Received

71 Citation Statements Given

Affiliations

The University of Tokyo, RIKEN

Publications

On the Structural Sensitivity of Deep Convolutional Networks to the Directions of Fourier Basis Functions

Tsuzuku, Sato

2019

Data-agnostic, quasi-imperceptible perturbations of inputs are known to severely degrade the recognition accuracy of deep convolutional networks. This phenomenon is considered a potential security issue. Moreover, some results on statistical generalization guarantees indicate that the phenomenon can be a key to improving the networks' generalization. However, the characteristics of the shared directions of such harmful perturbations remain unknown. Our primary finding is that convolutional networks are sensitive to the directions of Fourier basis functions. We derived this property by specializing a hypothesis about the cause of the sensitivity, known as the linearity of neural networks, to convolutional networks, and we validated it empirically. As a by-product of the analysis, we propose an algorithm to create shift-invariant universal adversarial perturbations that are available in black-box settings.
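To make "the direction of a Fourier basis function" concrete, here is a minimal sketch in plain Python. It is not the algorithm proposed in the paper; it only builds the real (cosine) part of a single 2D Fourier basis pattern and adds it to an image at a small max-norm budget. The function names, the 8x8 grayscale image, and the budget `eps` are illustrative assumptions.

```python
import math


def fourier_basis(height, width, u, v):
    """Real part of the 2D Fourier basis function with frequency (u, v)."""
    return [
        [math.cos(2.0 * math.pi * (u * x / height + v * y / width))
         for y in range(width)]
        for x in range(height)
    ]


def perturb(image, basis, eps):
    """Add the basis pattern, rescaled to max-norm eps, and clip to [0, 1]."""
    peak = max(abs(value) for row in basis for value in row)
    return [
        [min(1.0, max(0.0, pixel + eps * b / peak))
         for pixel, b in zip(image_row, basis_row)]
        for image_row, basis_row in zip(image, basis)
    ]


# Hypothetical 8x8 gray image with every pixel at 0.5.
image = [[0.5] * 8 for _ in range(8)]
# One full cosine period along the row index, constant along columns.
pattern = fourier_basis(8, 8, u=1, v=0)
noisy = perturb(image, pattern, eps=0.1)
```

Because the pattern depends on the image only through the clipping step, the same `pattern` can be reused for any input, which is the sense in which such a perturbation is data-agnostic.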

Variance-based Gradient Compression for Efficient Distributed Deep Learning

Tsuzuku, Imachi, Akiba

2018

Preprint

Due to the substantial computational cost, training state-of-the-art deep neural networks on large-scale datasets often requires distributed training using multiple computation workers. However, workers by nature need to communicate gradients frequently, causing severe bottlenecks, especially on lower-bandwidth connections. A few methods have been proposed to compress gradients for efficient communication, but they either suffer from a low compression ratio or significantly harm the resulting model accuracy, particularly when applied to convolutional neural networks. To address these issues, we propose a method to reduce the communication overhead of distributed deep learning. Our key observation is that gradient updates can be delayed until an unambiguous (high-amplitude, low-variance) gradient has been calculated. We also present an efficient algorithm to compute the variance with negligible additional cost. We experimentally show that our method can achieve a very high compression ratio while maintaining the resulting model accuracy. We also analyze the efficiency using computation and communication cost models and provide evidence that this method enables distributed deep learning in many scenarios with commodity environments.
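The idea of delaying an update until the gradient is unambiguous can be sketched as follows. This is a toy illustration under stated assumptions, not the method from the paper: each parameter keeps a running sum of its gradients and of their squares, and an element is sent only once the squared sum exceeds `alpha` times the sum of squares. The test itself, the threshold `alpha`, and all names are hypothetical.

```python
def accumulate_and_select(grad_sum, sq_sum, new_grads, alpha=2.0):
    """Accumulate gradients in place; return {index: value} for elements to send.

    An element is sent when its accumulated gradient is large relative to
    the accumulated squared gradients (high amplitude, low variance).
    Sent elements are reset; the others stay delayed.
    """
    to_send = {}
    for i, g in enumerate(new_grads):
        grad_sum[i] += g
        sq_sum[i] += g * g
        if grad_sum[i] ** 2 > alpha * sq_sum[i]:
            to_send[i] = grad_sum[i]
            grad_sum[i] = 0.0
            sq_sum[i] = 0.0
    return to_send


# Two parameters: the first sees a consistent gradient sign,
# the second sees a sign flip and therefore stays ambiguous longer.
grad_sum = [0.0, 0.0]
sq_sum = [0.0, 0.0]
steps = [[1.0, 1.0], [1.0, -1.0], [1.0, 1.0]]
sent_per_step = [accumulate_and_select(grad_sum, sq_sum, g) for g in steps]
```

In this run nothing is sent during the first two steps; on the third step only the first parameter is sent, carrying its accumulated value, while the second remains delayed.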

Normalized Flat Minima: Exploring Scale Invariant Definition of Flat Minima for Neural Networks using PAC-Bayesian Analysis

Tsuzuku, Sato, Sugiyama

2019

Preprint

The notion of flat minima has played a key role in generalization studies of deep learning models. However, existing definitions of flatness are known to be sensitive to the rescaling of parameters. This issue suggests that the previous definitions of flatness might not be good measures of generalization, because generalization is invariant to such rescalings. In this paper, from the PAC-Bayesian perspective, we scrutinize the discussion concerning flat minima and introduce the notion of normalized flat minima, which is free from the known scale dependence issues. Additionally, we highlight that existing matrix-norm based generalization error bounds suffer from a scale dependence similar to that of the existing flat minima definitions. Our modified notion of flatness does not suffer from this insufficiency either, suggesting that it might provide a better hierarchy in the hypothesis class.
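The rescaling issue mentioned above can be checked numerically on a toy model. The sketch below is an illustration only and does not implement the normalized measure from the paper: it uses a hypothetical one-hidden-unit ReLU network, rescales the two weights by `alpha` and `1 / alpha`, and compares the curvature of the squared loss with respect to the first weight (estimated by central finite differences) before and after.

```python
def relu(z):
    return max(0.0, z)


def predict(w1, w2, x):
    """One-hidden-unit ReLU network: f(x) = w2 * relu(w1 * x)."""
    return w2 * relu(w1 * x)


def loss(w1, w2, x, y):
    return (predict(w1, w2, x) - y) ** 2


def curvature_w1(w1, w2, x, y, h=1e-4):
    """Second derivative of the loss in w1, by central finite differences."""
    return (loss(w1 + h, w2, x, y) - 2.0 * loss(w1, w2, x, y)
            + loss(w1 - h, w2, x, y)) / (h * h)


x, y = 1.5, 3.0
w1, w2 = 1.0, 2.0            # a minimum: predict(w1, w2, x) equals y
alpha = 10.0                 # hypothetical rescaling factor
w1_scaled, w2_scaled = w1 * alpha, w2 / alpha

same_output = abs(predict(w1, w2, x) - predict(w1_scaled, w2_scaled, x))
sharp_original = curvature_w1(w1, w2, x, y)
sharp_rescaled = curvature_w1(w1_scaled, w2_scaled, x, y)
```

Both parameter settings compute the same function, yet the curvature along `w1` shrinks by a factor of `alpha ** 2` after rescaling, so a flatness measure built directly on this curvature would rank the two identical predictors differently.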

On the Structural Sensitivity of Deep Convolutional Networks to the Directions of Fourier Basis Functions

Tsuzuku, Sato

2018

Preprint
