A Discriminative Deep Model With Feature Fusion and Temporal Attention for Human Action Recognition

Yu, Jiahui; Gao, Hongwei; Yang, Wei; Jiang, Yueqiu; Chin, Wei Hong; Kubota, Naoyuki; Ju, Zhaojie

doi:10.1109/access.2020.2977856

Cited by 38 publications

(17 citation statements)

References 56 publications


“…In future work, we will improve the feature extraction by introducing the multiscale features segmentation method and the bionic mechanism, such as [42], [43] and [44]. Besides, we will collect more dataset with data preprocessing, which is expected to be a benchmark dataset for the metal surface defects detection.…”

Section: Discussion (mentioning)

confidence: 99%

Automatic and Efficient Metallic Surface Defect Detection Based on Key Pixel Point Locations

Gao, Sun, et al. 2021

IEEE Sensors J.

Self Cite


Surface defect detection aims to accurately recognize and distinguish types of defects and plays a key role in many applications. However, most of the recent studies focus on specific scenario detection and do not fairly consider the balance between the speed and accuracy. In the paper, we propose a key pixel points location-oriented method to identify multiscale defects, with several important properties: 1) A real-time template matching-based model is designed to speed up the process by introducing the Gaussian operator; 2) An improved Hough-based model is used to achieve a higher detection precision by deep mining both incremental properties and parallel properties; 3) An adaptive filtering-based image preprocessing method is proposed to eliminate the interference of multiple types of clutters and noises. In the experiments, a mean average rate of 96% was achieved to detect and classify four types of common defects and the average time was reduced to 0.149s. Furthermore, we fully evaluate the proposed method on two public datasets collected in real production lines and compare the results with other state-of-the-art methods. The results show that the proposed method achieved better balanced performance in many real application scenarios.


“…Hence, many deep learning methods are proposed for facial expression recognition and achieve state-of-the-art results. Many deep networks are very effective for key feature extraction, such as the EmotiW, the DBNs, the AUDNs, the E3D-LSTM, 3D-MM, and the D-ConvLSTM [38]-[43]. Deep learning-based models focus on the following research aspects: 1) Improve the deep model by re-designing the network structure and random weight initialization; 2) Mining more discriminative facial features; 3) Reduce the dependence of the deep model on a large number of data samples; 4) Publish datasets that have high-quality labels and are collected under actual world.…”

Section: B. Feature Extraction (mentioning)

confidence: 99%

Facial Expression Recognition Using Pose-Guided Face Alignment and Discriminative Features Based on Deep Learning

Feng, Wang, 2021

IEEE Access


Facial expression recognition is a key technology of robot vision, which can help robots understand human emotions. However, interference from the real world, such as light changes, face occlusion, and pose variation, reduces the recognition rate of the model. To solve the above problems, this paper proposes a novel deep model to improve the classification accuracy of facial expressions. The proposed model has the following merits: 1) A pose-guided face alignment method is proposed to reduce the intra-class difference, which can overcome the impact of environmental noise; 2) A hybrid feature representation method is proposed to obtain high-level discriminative facial features that achieve better results in classification networks; 3) A lightweight fusion backbone is designed, which combines the VGG-16 and the ResNet to achieve low-data and low-calculation training. Finally, to evaluate the proposed model, we conduct a series of experiments on four benchmark datasets, including the CK+, the JAFFE, the Oulu-CASIA, and the AR. The results show that the proposed model achieves state-of-the-art recognition rates, that is, 98.9%, 96.8%, 94.5%, and 98.7%, respectively. Compared with the traditional methods and other advanced deep learning methods, the proposed model achieves comparable performance in a variety of tasks.


“…DenseNet also made the network narrower, the number of parameters compared to other models was significantly reduced, and the training efficiency is improved at the same time. In 2020, J. Yu et al [27] used improved Inception ResNet layers for automatic recognition, which further improved the performance of the DCNN algorithm, but it could not merge the multilevel and multiscale features. Thus, we proposed the CFPN architecture to further improve the performance of the FER algorithm.…”

Section: Convolutional Neural Network For Classification (mentioning)

confidence: 99%

“…However, end-to-end learning tasks often require a large dataset, which poses a great challenge to the field of expression recognition lacking label data; additional depth information could potentially improve the performance. These aspects will be the main focus in the future research [40], [41]. Alexnet [39] 0.243178s 0.74 Inception v2 [22] 0.565114s 0.81…”

Section: Conclusion and Feature Work (mentioning)

confidence: 99%

A Cascaded Feature Pyramid Network With Non-Backward Propagation for Facial Expression Recognition

et al. 2021

Self Cite


In this work we propose a novel cascaded feature pyramid network with non-backward propagation (CFPN-NBP) for facial expression recognition (FER) that addresses the problems inherent in traditional backward propagation (BP) algorithms in the training process by using the Hilbert-Schmidt independence criterion (HSIC) bottleneck. The proposed algorithm is developed at two different levels. At the first level, a novel training method, the HSIC bottleneck, is considered as an alternative to traditional BP optimization, where the correlation between the output of the hidden layers and the input, and the correlation between the output of the hidden layers and its label are calculated to reduce redundant information; hence, the least information is used to predict the results. At the second level, a novel architecture is designed in the feature extraction process. The convolutional layers with the same resolutions are densely connected and introduced into the attention mechanism, so that the model can focus on more important information. The convolutional layers with different resolutions are combined by three cascaded pyramid networks; in this way, the shallow features and the deep features can be further fused, and therefore the semantic information and the content information can both be preserved. To further reduce the number of parameters, separable convolution is used instead of traditional convolution. Experiments on the challenging FER2013 dataset show that the proposed CFPN-NBP algorithm improves the accuracy of the FER task and outperforms the related state-of-the-art methods.
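The parameter saving that the abstract attributes to separable convolution can be illustrated with a short count. This is only a sketch: it assumes the depthwise separable form (a k x k depthwise filter per input channel followed by a 1 x 1 pointwise convolution), which the abstract does not specify, and the layer sizes and function names below are illustrative rather than taken from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise separable convolution (bias terms ignored).

    Assumed form: one k x k depthwise filter per input channel,
    followed by a 1 x 1 pointwise convolution mixing the channels.
    """
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


if __name__ == "__main__":
    # Illustrative layer sizes, not values from the CFPN-NBP paper.
    k, c_in, c_out = 3, 64, 128
    std = standard_conv_params(k, c_in, c_out)
    sep = separable_conv_params(k, c_in, c_out)
    print(std, sep, round(std / sep, 2))  # prints: 73728 8768 8.41
```

For this hypothetical 3 x 3 layer with 64 input and 128 output channels, the separable form needs roughly one eighth of the weights; the actual saving in any given network depends on its kernel sizes and channel widths.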

