Proceedings of the 2019 ACM Southeast Conference
DOI: 10.1145/3299815.3314450
An Empirical Study on Generalizations of the ReLU Activation Function

Cited by 58 publications (40 citation statements)
References 7 publications
“…Though the conventional “s-functions” such as “logsig” or “tansig” are widely used, they suffer from the “data-saturation” issue (Banerjee et al., 2019). In other words, when the range of the input for testing is significantly larger than that for the network training, the output of the “s-functions” will be close to 1 and exhibit negligible changes in response to input variations.…”
Section: Methods
confidence: 99%
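To make the “data-saturation” point concrete, here is a minimal sketch (my illustration, not code from the cited paper) showing that once test inputs exceed the training range, logsig and tansig outputs flatten near 1 and stop tracking the input:

# Minimal sketch of sigmoid-type saturation; values are arbitrary examples.
import numpy as np

def logsig(x):
    return 1.0 / (1.0 + np.exp(-x))   # logistic sigmoid ("logsig")

def tansig(x):
    return np.tanh(x)                  # hyperbolic tangent ("tansig")

in_range  = np.array([0.5, 1.0, 2.0])     # inputs of the size seen in training
out_range = np.array([8.0, 16.0, 32.0])   # much larger inputs at test time

print(logsig(in_range))    # clearly distinct outputs
print(logsig(out_range))   # all ~1.0: output barely responds to the input
print(tansig(out_range))   # likewise saturated near 1.0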
“…Therefore, functions with an unbounded output range should be selected. Noting that the first-order derivative of the “voltage vs. time” trajectory is important for calculating the IC trajectory, some widely used functions such as the Rectified Linear Unit (ReLU) (Banerjee et al., 2019) are not suitable, as they are not smooth at x = 0. In short, the activation function should be continuous, differentiable, and exhibit no data-saturation effect.…”
Section: Methods
confidence: 99%
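The non-smoothness argument can be seen directly by comparing the derivative of ReLU with that of a smooth, unbounded alternative such as softplus. A minimal sketch (my illustration; softplus is not prescribed by the cited work):

# ReLU's derivative jumps from 0 to 1 at x = 0; softplus's derivative
# (the sigmoid) is continuous everywhere, so it stays differentiable.
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    return (x > 0).astype(float)       # undefined at 0; jumps 0 -> 1

def softplus(x):
    return np.log1p(np.exp(x))         # smooth, unbounded approximation of ReLU

def softplus_grad(x):
    return 1.0 / (1.0 + np.exp(-x))    # sigmoid: continuous everywhere

xs = np.array([-1e-3, 0.0, 1e-3])
print(relu_grad(xs))       # [0. 0. 1.]  -- discontinuous at x = 0
print(softplus_grad(xs))   # ~[0.4998, 0.5, 0.5003] -- varies smoothly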
“…The role of Equation (5) is to sum all the maps produced by kernels at the same spatial position, where the notation N is the number of kernels in a layer. Therefore, unlike the standard model of the algorithm, it works with a much smaller number of classes, even among those of interest such as cars, buses, and bicycles [76,77]. In this direction, there are studies and niche systems, such as Cyber-Physical systems, dedicated to the analysis of all types of bicycles, mopeds, or other devices in this category.…”
Section: Description of External Environment and Practical Scenarios
confidence: 99%
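Read this way, the summation amounts to collapsing N per-kernel feature maps into one map by adding their values at each spatial position. A rough sketch of that interpretation (Equation (5) itself is not reproduced in this excerpt, so the shapes and names below are assumptions):

# Assumed illustration: sum N per-kernel feature maps element-wise.
import numpy as np

N, H, W = 3, 4, 4                                  # hypothetical layer sizes
per_kernel_maps = np.random.default_rng(0).random((N, H, W))

summed = per_kernel_maps.sum(axis=0)               # one (H, W) map per position
print(summed.shape)                                # (4, 4)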
“…ReLU activation functions are used to introduce non-linearity into the ANN model so that the ANN can progressively learn more effective feature representations. ReLU is the most commonly used activation in convolutional neural networks and deep learning models [31]. ReLU is a non-linear function through which errors can be backpropagated easily.…”
Section: Deep Stacked Multilayered Perceptron (DS-MLP)
confidence: 99%
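A minimal sketch (assumed illustration, not the DS-MLP model itself) of both points: ReLU makes a dense layer non-linear, and its 0-or-1 gradient keeps the backward pass simple:

# Tiny dense layer with ReLU: forward pass plus the ReLU gradient rule.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))        # hypothetical weights: 3 inputs -> 4 units
x = rng.standard_normal(3)             # one input sample

z = W @ x                              # linear pre-activation
a = np.maximum(0.0, z)                 # ReLU keeps the layer non-linear

upstream = np.ones_like(a)             # pretend gradient from the next layer
dz = upstream * (z > 0)                # ReLU gradient is 0 or 1 per unit
dW = np.outer(dz, x)                   # gradient w.r.t. the weights
print(a, dz, sep="\n")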