2021
DOI: 10.48550/arxiv.2110.06081
Preprint

On Expressivity and Trainability of Quadratic Networks

Abstract: Inspired by the diversity of biological neurons, quadratic artificial neurons can play an important role in deep learning models. The type of quadratic neuron of interest here replaces the inner-product operation of the conventional neuron with a quadratic function. Despite the promising results achieved so far by networks of quadratic neurons, important issues remain unaddressed. Theoretically, the superior expressivity of a quadratic network over either a conventional network or a conventional network via …
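The abstract describes a quadratic neuron that swaps the inner product of a conventional neuron for a quadratic function of the input. A minimal sketch of one such neuron follows; the exact functional form is an assumption inferred from the parameter names (w_g, b_g, w_b, b_b) quoted in the citation statements further down this page, not taken verbatim from the paper.

```python
# Sketch of a quadratic neuron of the kind the paper studies: the usual
# pre-activation w . x + b is replaced by a quadratic function of x.
# The form below ((w_r . x + b_r)(w_g . x + b_g) + w_b . (x * x) + b_b)
# is an assumption based on the parameter names quoted later on this page.

def linear_neuron(x, w, b):
    """Conventional neuron pre-activation: inner product plus bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def quadratic_neuron(x, w_r, b_r, w_g, b_g, w_b, b_b):
    """Quadratic pre-activation combining two linear maps and a
    power term: (w_r . x + b_r) * (w_g . x + b_g) + w_b . (x*x) + b_b."""
    lin_r = sum(wi * xi for wi, xi in zip(w_r, x)) + b_r
    lin_g = sum(wi * xi for wi, xi in zip(w_g, x)) + b_g
    quad = sum(wi * xi * xi for wi, xi in zip(w_b, x)) + b_b
    return lin_r * lin_g + quad

x = [1.0, 2.0]
print(quadratic_neuron(x, [0.5, -1.0], 0.1, [0.2, 0.3], 1.0, [0.0, 0.1], 0.0))
```

A single quadratic neuron can already represent functions such as XOR-like decision boundaries that a single conventional neuron cannot, which is the expressivity angle the abstract alludes to.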

Cited by 2 publications (5 citation statements)
References 32 publications (39 reference statements)
“…The number of training epochs is 200. For quadratic networks, we adopt the ReLinear training strategy [9], where the learning rate of quadratic terms is set to 1 × 10⁻⁴. We adopt Adam as an optimizer.…”
Section: Gaussian Mixture Data
confidence: 99%
“…We use Adam as an optimizer for training all methods. In particular, for QUNet, we adopt the ReLinear strategy as [9], where the learning rate of quadratic terms is set to 1 × 10⁻⁴. We also employ the gradient clip norm method with a maximum norm value of 0.01 to constrain the over-growth of weights.…”
Section: Efficiency on the Cell Dataset
confidence: 99%
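The gradient-clip-norm method mentioned in the quote caps the global L2 norm of the gradients: if the combined norm exceeds the maximum (0.01 here), every gradient is rescaled by max_norm / total_norm. A minimal sketch:

```python
import math

# Sketch of gradient clipping by global norm, as in the quote above:
# if the L2 norm of the gradient vector exceeds max_norm, rescale all
# components by max_norm / total_norm so the clipped norm equals max_norm.

def clip_grad_norm(grads, max_norm):
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads

clipped = clip_grad_norm([3.0, 4.0], max_norm=0.01)  # original norm was 5.0
print(math.sqrt(sum(g * g for g in clipped)))  # clipped norm is max_norm
```

With such a tight cap (0.01), clipping acts less as a safety valve and more as a hard ceiling on per-step weight movement, complementing the small learning rate on the quadratic terms.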
“…). Furthermore, to facilitate the training of the quadratic parameters and guarantee model convergence, Fan et al. introduced the ReLinear strategy, which includes a special initialization of the quadratic terms (w_g = 0, b_g = 1, w_b = 0, and b_b = 0) assisted by a shrunken learning rate to prevent magnitude explosion [31].…”
confidence: 99%
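The initialization just quoted has a clean interpretation: with w_g = 0, b_g = 1, w_b = 0, and b_b = 0, the quadratic neuron collapses exactly to a conventional linear neuron, so training starts from a well-understood linear model and lets the quadratic terms grow in gradually. The sketch below demonstrates this; the neuron form is an assumption inferred from the parameter names in the quote.

```python
# Sketch of the ReLinear initialization quoted above: with w_g = 0, b_g = 1,
# w_b = 0, b_b = 0, the quadratic pre-activation
#   (w_r . x + b_r)(w_g . x + b_g) + w_b . (x * x) + b_b
# collapses to the conventional neuron w_r . x + b_r. (The neuron form is
# an assumption based on the parameter names in the quote.)

def quadratic_neuron(x, w_r, b_r, w_g, b_g, w_b, b_b):
    lin_r = sum(wi * xi for wi, xi in zip(w_r, x)) + b_r
    lin_g = sum(wi * xi for wi, xi in zip(w_g, x)) + b_g
    quad = sum(wi * xi * xi for wi, xi in zip(w_b, x)) + b_b
    return lin_r * lin_g + quad

x, w_r, b_r = [1.0, 2.0, 3.0], [0.3, -0.2, 0.5], 0.7
n = len(x)

relinear = quadratic_neuron(x, w_r, b_r,
                            w_g=[0.0] * n, b_g=1.0,   # ReLinear init
                            w_b=[0.0] * n, b_b=0.0)
linear = sum(wi * xi for wi, xi in zip(w_r, x)) + b_r

print(relinear == linear)  # True: the quadratic neuron starts out linear
```

Since the factor (w_g · x + b_g) starts at exactly 1 and the power term at exactly 0, gradients initially flow through the neuron as if it were linear, which is why the strategy pairs this initialization with the small learning rate on the quadratic terms.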