2021
DOI: 10.1109/tip.2021.3101395
WaveCNet: Wavelet Integrated CNNs to Suppress Aliasing Effect for Noise-Robust Image Classification

Abstract: Though widely used in image classification, convolutional neural networks (CNNs) are prone to noise interruption, i.e., the CNN output can be drastically changed by small image noise. To improve noise robustness, we integrate CNNs with wavelets by replacing the common down-sampling operations (max-pooling, strided convolution, and average pooling) with the discrete wavelet transform (DWT). We first propose general DWT and inverse DWT (IDWT) layers applicable to various orthogonal and biorthogonal discrete wavelets…


Cited by 58 publications (25 citation statements)
References 57 publications (98 reference statements)
“…The pooling layer generally follows the convolution layer to reduce the width and height of the feature map but does not alter the depth of the feature map. Common pooling layers include average pooling and max pooling layers. 48 …”
Section: Methods
confidence: 99%
“…Common pooling layers include average pooling and max pooling layers. 48 “Fully connected” means that the output of the previous layer is distinctly connected to individual neurons in the next layer. 49 The aim of the fully connected layer is to use the high-level features of the input image, produced by the convolutional and pooling layers, to classify the input image based on the training dataset.…”
Section: Pooling Layer
confidence: 99%
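The pooling operations described in these excerpts can be sketched in a few lines of NumPy. This is an illustrative implementation, not code from the cited papers; `pool2d` is a hypothetical helper name:

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping 2-D pooling: halves width and height of a single
    feature map while leaving its depth (handled per-channel) unchanged."""
    h, w = x.shape[0] // size, x.shape[1] // size
    # Group the map into size x size blocks, then reduce each block.
    blocks = x[:h * size, :w * size].reshape(h, size, w, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))   # max pooling
    return blocks.mean(axis=(1, 3))      # average pooling

fmap = np.array([[1., 2., 5., 6.],
                 [3., 4., 7., 8.],
                 [0., 1., 2., 3.],
                 [1., 0., 3., 4.]])
pool2d(fmap, mode="max")   # → [[4., 8.], [1., 4.]]
pool2d(fmap, mode="avg")   # → [[2.5, 6.5], [0.5, 3.]]
```

Both modes discard spatial detail indiscriminately, which is exactly the property the WaveCNet paper targets.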
“…They modified and combined the WLFD by generating wavelet pyramids, keypoint localization, and descriptors. Another study [25] combined wavelets with a convolutional neural network (WaveCNet) to improve noise robustness. In [26], a scale- and rotation-invariant wavelet feature transform was proposed using a biorthogonal wavelet, combining only two sub-bands with SIFT.…”
Section: A Literature Review
confidence: 99%
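The WaveCNet idea referenced in [25] — replacing pooling with a DWT and keeping only the low-frequency sub-band — can be sketched with the simplest case, the orthonormal Haar wavelet. This is a minimal sketch, not the paper's general DWT/IDWT layers (which support many orthogonal and biorthogonal wavelets):

```python
import numpy as np

def haar_ll_downsample(x):
    """Drop-in replacement for 2x2 pooling: keep only the Haar low-pass
    (LL) sub-band, which halves resolution while discarding the
    high-frequency components where noise tends to concentrate."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]   # 2x2 block corners
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    return (a + b + c + d) / 2.0          # orthonormal Haar LL coefficients

fmap = np.arange(16.0).reshape(4, 4)
haar_ll_downsample(fmap)   # → [[ 5.,  9.], [21., 25.]]
```

Unlike max pooling, this is a proper anti-aliasing filter followed by decimation, which is why it suppresses the aliasing effect the paper's title refers to.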
“…The set of all unit vectors in the tangent plane of S is a circle, hence a function of κ in (25). The maximum and minimum values κ 1 and κ 2 of κ are the principal curvatures of surface S at p. The mean curvature H (26) and Gaussian curvature K (27) can be calculated from the principal curvatures κ 1 and κ 2 [45].…”
Section: B Surface Curvature
confidence: 99%
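The relations referenced as (26) and (27) are the standard definitions of mean and Gaussian curvature in terms of the principal curvatures:

```latex
H = \frac{\kappa_1 + \kappa_2}{2}, \qquad K = \kappa_1 \, \kappa_2
```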
“…Chang et al. 9 added depth information to the loss function and used the change of depth value at the edges of classified objects to constrain network training. However, the above approaches still face problems. On the one hand, indoor noise resides mostly in the high-frequency components of the image, 10 while the down-sampling operations of traditional convolutional neural networks, such as average pooling and max pooling, do not separate the different frequency components; high-frequency noise is therefore preserved and accumulates as the network deepens, and the aliasing it introduces in the basic residual structure destroys image features, making the segmentation task harder. On the other hand, fusing the depth image with the color image as a fourth channel does not make full use of the complementarity of RGB color information and depth information.…”
Section: Introduction
confidence: 99%
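The frequency-separation argument above can be illustrated with a toy example: a Haar DWT splits an image into four sub-bands, and for a smooth image the diagonal high-frequency sub-band (HH) contains essentially nothing but the noise. A minimal sketch, using a linear ramp as a stand-in for a smooth "clean" image (the function name and setup are illustrative, not from the cited work):

```python
import numpy as np

rng = np.random.default_rng(0)
ramp = np.add.outer(np.arange(64.0), np.arange(64.0))   # smooth "clean" image
noisy = ramp + rng.normal(0.0, 1.0, ramp.shape)         # add white noise

def haar_subbands(x):
    """One level of the 2-D orthonormal Haar DWT as four sub-bands."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    return {"LL": (a + b + c + d) / 2, "LH": (a + b - c - d) / 2,
            "HL": (a - b + c - d) / 2, "HH": (a - b - c + d) / 2}

clean_sb = haar_subbands(ramp)
noisy_sb = haar_subbands(noisy)
# A linear ramp has zero second differences, so its HH sub-band vanishes
# exactly; whatever shows up in HH of the noisy image is pure noise.
print(np.abs(clean_sb["HH"]).max())   # → 0.0
```

Pooling, by contrast, mixes these components back together at every stage, which is the "aliasing" problem the quoted passage attributes to standard down-sampling.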