2021
DOI: 10.1007/s11760-021-01863-z

Optimizing nonlinear activation function for convolutional neural networks

Cited by 40 publications (29 citation statements)
References 18 publications
“…In machine learning, saturated activation functions such as the sigmoid and tanh have been used. The newer unsaturated variant, the ReLU, has become prominent in CNNs owing to its superior outcomes [18]. The voluminous literature applying the same activation function, ReLU, invariably to all types of data (from text-based data mining to computer vision) and to all applications indicates a research gap.…”
Section: Literature Survey
confidence: 99%
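The contrast the quoted survey draws between saturated (sigmoid, tanh) and unsaturated (ReLU) activations can be made concrete with a minimal NumPy sketch (not taken from the cited paper; the sample inputs are arbitrary): the gradients of sigmoid and tanh collapse toward zero for large |x|, while ReLU's gradient stays at 1 for any positive input.

```python
import numpy as np

# Saturated activations: gradients shrink toward 0 for large |x|.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2

# Unsaturated ReLU: gradient is exactly 1 for every positive input.
def relu_grad(x):
    return (x > 0).astype(float)

x = np.array([-10.0, -2.0, 0.5, 2.0, 10.0])
print("sigmoid grad:", np.round(sigmoid_grad(x), 5))  # ~0 at |x| = 10
print("tanh grad:   ", np.round(tanh_grad(x), 5))     # ~0 at |x| = 10
print("relu grad:   ", relu_grad(x))                  # 1 wherever x > 0
```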
“…Since there is a stack of spatial-domain images and wavelet sub-bands (comprising LF scaling functions and HF wavelet functions), a need arises to formulate an activation function for each domain and sub-band. In the frequency domain, both positive and negative coefficients are significant [17]. Varshney et al. [18] emphasize the negative spectral coefficients and the importance of saving them from dismissal by activations. The bottleneck with conventional spatial-domain activation functions is that they nullify the negative coefficients.…”
Section: Introduction
confidence: 99%
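A minimal sketch of the bottleneck this excerpt describes, under stated assumptions: a 1-D Haar-style difference filter stands in for a wavelet high-frequency sub-band, and the toy signal is arbitrary. Applying a spatial-domain ReLU to the resulting coefficients discards every negative one, whereas a sign-preserving nonlinearity (tanh here, purely for illustration) retains them.

```python
import numpy as np

# Toy signal and a Haar-style high-pass (detail) filter: the resulting
# "sub-band" contains both positive and negative coefficients.
signal = np.array([4.0, 1.0, 3.0, 7.0, 2.0, 6.0])
detail = (signal[0::2] - signal[1::2]) / np.sqrt(2.0)
print("detail coefficients:", np.round(detail, 3))  # mixed signs

# ReLU nullifies every negative coefficient -> that sub-band information is lost.
relu_out = np.maximum(detail, 0.0)
print("after ReLU:", np.round(relu_out, 3))

# A sign-preserving nonlinearity keeps the negative coefficients,
# at the cost of saturating large magnitudes.
tanh_out = np.tanh(detail)
print("after tanh:", np.round(tanh_out, 3))
```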
“…In the first batch of hidden layers, a Rectified Linear Unit (ReLU; Agarap, 2018) is used as the activation function for all layers, as is common in many contemporary neural networks (Varshney and Singh, 2021). In the second batch of hidden layers, Leaky ReLUs (Clevert et al., 2015) are used to combat the fairly common dying-ReLU problem (Lu et al., 2019) that was encountered when initially attempting to build the network with standard ReLUs.…”
Section: A1 Activation Functions
confidence: 99%
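A hedged sketch of the ReLU versus Leaky ReLU behaviour this excerpt refers to; the 0.01 negative slope is an assumed default, not a value from the cited network. For a unit whose pre-activations have drifted negative, ReLU outputs zero almost everywhere and passes no gradient, which is the dying-ReLU failure mode, while Leaky ReLU keeps a small non-zero response.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def leaky_relu(x, negative_slope=0.01):
    # A small non-zero slope for x < 0 keeps a gradient flowing, which is
    # the usual remedy for "dying ReLU" units stuck at zero output.
    return np.where(x >= 0.0, x, negative_slope * x)

# A unit whose pre-activations have drifted mostly negative:
pre_act = np.array([-3.0, -1.5, -0.2, 0.4])
print("ReLU output:      ", relu(pre_act))        # mostly zeros, no gradient there
print("Leaky ReLU output:", leaky_relu(pre_act))  # small but non-zero signal
```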
“…The number of parameters can be reduced to a certain extent to avoid redundant convolution kernels. Pooling [21] downsamples the image obtained after convolution to shrink its spatial size, help prevent overfitting, and improve computational efficiency. The processing flow of the convolutional neural network is shown in Fig.…”
Section: Convolutional Neural Network
confidence: 99%
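A minimal NumPy sketch of the pooling step described above, assuming non-overlapping 2×2 max pooling with stride 2 (window size and stride chosen for illustration): the feature map's spatial size is halved in each dimension while the strongest responses are kept.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling (stride 2) on a 2-D feature map."""
    h, w = feature_map.shape
    # Trim to even dimensions, then group pixels into 2x2 blocks.
    blocks = feature_map[: h // 2 * 2, : w // 2 * 2].reshape(h // 2, 2, w // 2, 2)
    # Take the maximum inside each 2x2 block.
    return blocks.max(axis=(1, 3))

fmap = np.array([[1, 3, 2, 0],
                 [4, 6, 1, 5],
                 [7, 2, 9, 8],
                 [0, 1, 3, 4]], dtype=float)

pooled = max_pool_2x2(fmap)
print(pooled)                           # [[6. 5.] [7. 9.]]
print(fmap.shape, "->", pooled.shape)   # (4, 4) -> (2, 2)
```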