Sharp bounds for the number of regions of maxout networks and vertices of Minkowski sums

Montúfar, Guido; Ren, Ya-Tao; Zhang, Leon

doi:10.48550/arxiv.2104.08135

Cited by 4 publications

(9 citation statements)

References 40 publications


“…In fact, there is a one-to-one correspondence between elements of $\mathrm{CCPWL}_n$ and $\mathrm{Newt}_n$, which is nicely compatible with some (functional and polyhedral) operations. This correspondence has been studied before in the tropical geometry [Maclagan and Sturmfels, 2015, Joswig, 2022], convex geometry [Hiriart-Urruty and Lemaréchal, 1993], as well as neural network literature [Zhang et al, 2018, Charisopoulos and Maragos, 2018, Alfarra et al, 2020, Montúfar et al, 2021]. We summarize the key findings of this correspondence relevant to our work in the following proposition: Proposition 4.5.…”

Section: Extended Newton Polyhedra Of Convex CPWL Functions (mentioning)

confidence: 62%

“…The purpose of this section is to prove that for fixed dimension $n$, the required width for exact, depth-minimal representation of a CPWL function can be polynomially bounded in the number $p$ of affine pieces; in particular $p^{O(n^2)}$. This is closely related to works that bound the number of linear pieces of an NN as a function of the size , Raghu et al, 2017, Montúfar et al, 2021. It can also be seen as a counterpart, in the context of exact representations, to quantitative universal approximation theorems that bound the number of neurons required to achieve a certain approximation guarantee; see, e.g., Barron [1993, 1994], Mhaskar [1993], Pinkus [1999], Mhaskar [1996], Mhaskar and Micchelli [1995].…”

Section: A Width Bound For NNs With Small Depth (mentioning)

confidence: 78%

“…Linear regions of these functions correspond to vertices of so-called Newton polytopes associated with these tropical polynomials. Applications of this correspondence include bounding the number of linear regions of a neural network [Zhang et al, 2018, Charisopoulos and Maragos, 2018, Montúfar et al, 2021] and understanding decision boundaries [Alfarra et al, 2020]. In Section 4 we present a novel application of tropical concepts to understand neural networks.…”

Section: Related Work (mentioning)

confidence: 99%
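
The correspondence quoted above can be illustrated in one variable. This is a minimal sketch, not taken from any of the cited papers: it assumes a tropical polynomial of the form f(x) = max_i (a_i x + b_i), lifts each term to the point (a_i, b_i), and checks that the terms that are maximal on an open interval are exactly the vertices of the upper convex hull of those lifted points. The names `upper_hull`, `active_terms`, `terms`, and `samples` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def upper_hull(terms):
    """Return the upper-hull vertices of the lifted points (slope, intercept)."""
    # For equal slopes only the largest intercept can ever attain the maximum.
    best = {}
    for a, b in terms:
        if a not in best or b > best[a]:
            best[a] = b
    hull = []
    for p in sorted(best.items()):
        # Drop the previous vertex while it lies on or below the chord to p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross < 0:
                break
            hull.pop()
        hull.append(p)
    return hull


def active_terms(terms, xs):
    """Brute force: terms that are the unique maximiser at some sample point."""
    active = set()
    for x in xs:
        values = [a * x + b for a, b in terms]
        top = max(values)
        if values.count(top) == 1:
            active.add(terms[values.index(top)])
    return active


# f(x) = max(0, x + 1, 2x, x - 3); the last term is never maximal.
terms = [(0, 0), (1, 1), (2, 0), (1, -3)]
samples = [Fraction(k, 4) for k in range(-40, 41)]  # x in [-10, 10], step 1/4
vertices = upper_hull(terms)
pieces = active_terms(terms, samples)
```

Here `vertices` and `pieces` describe the same three terms, so the number of linear pieces of f equals the number of upper-hull vertices. The brute-force check depends on the sample grid being fine enough to hit every piece, which holds for this small example but is an assumption in general.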


Towards Lower Bounds on the Depth of ReLU Neural Networks

Hertrich, Basu, Summa, et al. 2021

Preprint

We contribute to a better understanding of the class of functions that is represented by a neural network with ReLU activations and a given architecture. Using techniques from mixed-integer optimization, polyhedral theory, and tropical geometry, we provide a mathematical counterbalance to the universal approximation theorems which suggest that a single hidden layer is sufficient for learning tasks. In particular, we investigate whether the class of exactly representable functions strictly increases by adding more layers (with no restrictions on size). This problem has potential impact on algorithmic and statistical aspects because of the insight it provides into the class of functions represented by neural hypothesis classes. However, to the best of our knowledge, this question has not been investigated in the neural network literature. We also present upper bounds on the sizes of neural networks required to represent functions in these neural hypothesis classes.


“…However, deeper networks require much less neurons to reach the same expressive power, yielding a potential theoretical explanation of the dominance of deep networks in practice [7,29,42,44,53,62,65,68,79,80,83]. Other related work includes counting and bounding the number of linear regions [43,59,60,64,65,74], classifying the set of functions exactly representable by different architectures [7,23,46,47,61,86], or analyzing the memorization capacity of ReLU networks [82,84,85].…”

Section: Neural Network (mentioning)

confidence: 99%

Training Fully Connected Neural Networks is $\exists\mathbb{R}$-Complete

Bertschinger, Hertrich, Jungeblut, et al. 2022

Preprint

We consider the algorithmic problem of finding the optimal weights and biases for a two-layer fully connected neural network to fit a given set of data points. This problem is known as empirical risk minimization in the machine learning community. We show that the problem is $\exists\mathbb{R}$-complete. This complexity class can be defined as the set of algorithmic problems that are polynomial-time equivalent to finding real roots of a polynomial with integer coefficients. Our results hold even if the following restrictions are all added simultaneously.

• There are exactly two output neurons.
• There are exactly two input neurons.
• The data has only 13 different labels.
• The number of hidden neurons is a constant fraction of the number of data points.
• The target training error is zero.
• The ReLU activation function is used.


“…Several research directions have been explored at the interface between tropical geometry, probability theory and machine learning. These include studies of the tropicalization of stochastic processes (Akian et al, 1994) or of Gaussian measures (Tran, 2020), tropical support vector machines (Yoshida et al, 2021), tropical principal component analysis (Yoshida et al, 2019) inspired by phylogenetic studies, quantification of the expressivity of deep neural networks (Zhang et al, 2018; Montúfar et al, 2021) or their approximation (Calafiore et al, 2020) through tropical methods. A survey of some of these approaches can be found in Maragos et al (2021).…”

Section: Introduction (mentioning)

confidence: 99%

Tropical reproducing kernels and optimization

Aubin-Frankowski, Gaubert 2022

Preprint

Hilbertian kernel methods and their positive semidefinite kernels have known an extensive use in various fields of applied mathematics and machine learning, owing to their several equivalent characterizations. We here unveil an analogy with concepts from tropical geometry, proving that tropical positive semidefinite kernels are also endowed with equivalent viewpoints, stemming from Fenchel-Moreau conjugations. We give a tropical analogue of Aronszajn theorem, showing that these kernels correspond to a feature map, define monotonous operators, and generate max-plus function spaces endowed with a reproducing property. They furthermore include all the Hilbertian kernels classically studied as well as Monge arrays. However, two relevant notions of tropical reproducing kernels must be distinguished, based either on linear or sesquilinear interpretations. The sesquilinear interpretation is the most expressive one, since reproducing spaces then encompass classical max-plus spaces, such as those of (semi)convex functions. In contrast, in the linear interpretation, the reproducing kernels are characterized by a restrictive condition, von Neumann regularity.
