2020
DOI: 10.48550/arXiv.2007.05558
Preprint

The Computational Limits of Deep Learning

Abstract: Deep learning's recent history has been one of achievement: from triumphing over humans in the game of Go to world-leading performance in image recognition, voice recognition, translation, and other tasks. But this progress has come with a voracious appetite for computing power. This article reports on the computational demands of Deep Learning applications in five prominent application areas and shows that progress in all five is strongly reliant on increases in computing power. Extrapolating forward this reliance reveals that progress along current lines is rapidly becoming economically, technically, and environmentally unsustainable. Thus, continued progress in these applications will require dramatically more computationally-efficient methods, which will either have to come from changes to deep learning or from moving to other machine learning methods.
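The extrapolation the abstract describes can be made concrete with a short back-of-the-envelope sketch. The snippet below is not code from the paper: it assumes a generic power-law link between error and compute (error ≈ a · compute^(−α)), and the exponent value used is a hypothetical illustration, not a fitted result.

```python
# A minimal sketch of the abstract's extrapolation argument: if error falls
# as a power law in compute, error ≈ a * compute**(-alpha), then reaching a
# lower error target implies a multiplicative jump in compute.

def compute_multiplier(error_now: float, error_target: float, alpha: float) -> float:
    """Factor by which compute must grow to move from error_now to error_target,
    assuming error = a * compute**(-alpha)."""
    return (error_now / error_target) ** (1.0 / alpha)

# Example: with a hypothetical exponent alpha = 0.1, merely halving the error
# requires 2**(1/0.1) = 1024x more compute. Small exponents make the
# multiplier explode, which is the unsustainability argument in miniature.
print(compute_multiplier(error_now=10.0, error_target=5.0, alpha=0.1))  # ~1024
```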

Cited by 114 publications (115 citation statements)
References 44 publications (56 reference statements)
“…Unfortunately, despite all of the recent success, modern hardware still greatly restricts the practicality of certain machine learning models. Machine learning, deep learning in particular, can be very computationally expensive, sometimes requiring hours, days, or even months of training time on today's computers [4]. Moreover, conventional computers are beginning to approach physical limitations that will slow their improvements in years to come [5].…”
Section: Introduction
Mentioning confidence: 99%
“…It is estimated that inference accounts for up to 90% of the costs [Thomas, 2020]. There are several studies about training computation and its environmental impact [Amodei and Hernandez, 2018; Gholami et al., 2021a; Canziani et al., 2017; Li et al., 2016; Anthony et al., 2020; Thompson et al., 2020], but there are very few focused on inference costs and their associated energy consumption.…”
Section: Introduction
Mentioning confidence: 99%
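The 90% figure in this statement is easy to sanity-check with amortization arithmetic. The sketch below uses entirely invented numbers (none come from [Thomas, 2020] or the other cited studies) to show how per-query inference costs can swamp a one-time training cost over a long deployment.

```python
# A hypothetical split between one-time training cost and cumulative
# inference cost; all figures below are invented for illustration.

train_cost = 1_000_000.0         # one-time training cost (arbitrary units)
cost_per_query = 0.0001          # cost of serving a single inference request
queries_per_day = 100_000_000    # hypothetical deployment traffic
days_deployed = 3 * 365          # three years in production

inference_cost = cost_per_query * queries_per_day * days_deployed
total = train_cost + inference_cost
print(f"inference share of lifetime cost: {inference_cost / total:.1%}")
# With these made-up numbers, inference accounts for ~92% of lifetime cost,
# illustrating how a long-lived, high-traffic deployment can push inference
# toward 90%+ of total costs.
```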
“…On the other hand, it is understood that artificial intelligence systems need not mimic the low-level architecture of brain cells, but rather can draw inspiration from abstract properties of human intelligence [15]. This becomes especially important when considering that adopting black-box deep neural network architectures results in using overly complex models with a great many parameters, at the expense of time, energy, data, memory, and computational resources [16], [17]. Even in applications where complexity is not an issue, the lack of interpretability and mathematical understanding, and the vulnerability to small perturbations and adversarial attacks [18]-[20], have led to an emerging hesitation about using deep learning models outside common benchmark datasets [21], [22], and especially in security-critical applications. These models are hard to analyze with current mathematical tools, hard to train with current optimization methods, and their design relies solely on experimental heuristics.…”
Section: Introduction
Mentioning confidence: 99%
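The "vulnerability to small perturbations" this statement cites has a standard worked illustration: the fast gradient sign method (FGSM). The toy linear model below is a minimal construction of my own for illustration, not code from references [18]-[20].

```python
# A minimal FGSM-style sketch of adversarial vulnerability on a toy linear
# classifier. This is a standard textbook construction, not the cited works.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=50)            # toy model weights
x = rng.normal(size=50)            # a toy input
if w @ x < 0:                      # flip so the clean input is classified +1
    x = -x

def score(v):
    """Linear decision score: the model predicts +1 if score > 0, else -1."""
    return float(w @ v)

# For a linear score, the gradient w.r.t. the input is w itself, so the
# worst-case perturbation under an L-infinity budget eps is -eps * sign(w),
# which lowers the score by exactly eps * ||w||_1.
margin = score(x)
eps = 1.1 * margin / np.abs(w).sum()   # just enough budget to flip the label
x_adv = x - eps * np.sign(w)

print(f"per-coordinate budget eps = {eps:.3f}")
print(f"clean score = {score(x):.3f}, adversarial score = {score(x_adv):.3f}")
# The prediction flips even though each coordinate moved by only eps.
```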