2021
DOI: 10.48550/arxiv.2104.14672
Preprint
Analytical bounds on the local Lipschitz constants of ReLU networks

Abstract: In this paper, we determine analytical upper bounds on the local Lipschitz constants of feedforward neural networks with ReLU activation functions. We do so by deriving Lipschitz constants and bounds for ReLU, affine-ReLU, and max pooling functions, and combining the results to determine a network-wide bound. Our method uses several insights to obtain tight bounds, such as keeping track of the zero elements of each layer, and analyzing the composition of affine and ReLU functions. Furthermore, we employ a caref…
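As a rough illustration of the compositional idea in the abstract (not the paper's method), the sketch below computes a naive network-wide Lipschitz upper bound as the product of per-layer spectral norms, treating ReLU as 1-Lipschitz. The weight matrices and their sizes are placeholders; the paper's affine-ReLU and zero-tracking analysis yields tighter, input-region-dependent bounds than this simple product.

```python
# A minimal sketch, not the paper's method: a naive network-wide Lipschitz
# upper bound as the product of per-layer bounds. Weights are random
# placeholders; ReLU is 1-Lipschitz, so each affine-ReLU layer is bounded by
# the spectral norm of its weight matrix.
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((64, 32))   # hypothetical first-layer weights
W2 = rng.standard_normal((10, 64))   # hypothetical second-layer weights

def spectral_norm(W):
    # Largest singular value = Lipschitz constant of x -> W @ x in the 2-norm.
    return np.linalg.norm(W, ord=2)

# ReLU contributes a factor of 1, so the composition W2 @ relu(W1 @ x + b1) + b2
# is bounded by the product of the two spectral norms.
naive_bound = spectral_norm(W1) * spectral_norm(W2)
print(f"naive network-wide Lipschitz upper bound: {naive_bound:.3f}")
```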

Cited by 2 publications (5 citation statements); references 5 publications (7 reference statements).
“…However, the bound is not strict since for the first architecture the bound is 34, 80 and the quotient lies between 1 and Hence, the constant is a worst-case scenario of the dilation caused by the function. Notice that this is consistent with the (strict) bounds of [AM21], that are of order 10^9. Furthermore, there seems to be little to no difference in distance dilation among the four different architectures, see table 2, even though the constant of the algorithm does change.…”
Section: Boston Dataset (supporting; confidence: 86%)
“…The results in [AM21] suggest that learners can be seen as Lipschitz functions from one (pseudo)metric space to another. Thus, the answer to the first question of this section seems to be in the very definition of the Lipschitz function: the constant.…”
Section: Enriched Datasets (mentioning; confidence: 94%)
“…However, again this approach does not scale well with the number of layers. While these works estimate global Lipschitz constants, it was shown in [20] that estimating local Lipschitz constants can be done more efficiently.…”
Section: A. Related Work (mentioning; confidence: 99%)
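The quoted remark contrasts global with local Lipschitz estimation. As a hedged illustration (not the approach of [20] or of the paper above), the sketch below estimates a local Lipschitz constant of a small ReLU network at one input via the spectral norm of its Jacobian there; the network architecture, sizes, and input are placeholders, and this Jacobian norm is a lower bound on the local Lipschitz constant over any neighbourhood of that input, whereas the analytical results discussed here are upper bounds.

```python
# A minimal sketch, not the method of [20] or of the paper above: estimate a
# local Lipschitz constant of a small ReLU network at a single input x0 via
# the spectral norm of its Jacobian at x0. Network and input are placeholders.
import torch

net = torch.nn.Sequential(
    torch.nn.Linear(32, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 10),
)

x0 = torch.randn(32)
J = torch.autograd.functional.jacobian(net, x0)      # shape (10, 32)
local_estimate = torch.linalg.matrix_norm(J, ord=2)  # largest singular value
print(f"local Lipschitz estimate at x0: {local_estimate.item():.3f}")
```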