Given that facial features contain a wide range of identification information and cannot be completely represented by a single feature, the fusion of multiple features is particularly significant for achieving robust face recognition performance, especially when there is a large difference between the test sets and the training sets. This has been proven in both traditional and deep learning approaches. In this work, we propose a novel method named C2D-CNN (color two-dimensional principal component analysis (2DPCA)-convolutional neural network). C2D-CNN combines features learnt from the original pixels with the image representation learnt by a CNN and then performs decision-level fusion, which can significantly improve face recognition performance. Furthermore, a new CNN model is proposed: first, we introduce a normalization layer into the CNN to speed up network convergence and shorten training time. Second, a layered activation function is introduced to make the activation function adaptive to the normalized data. Finally, probabilistic max-pooling is applied so that feature information is preserved to the maximum extent while maintaining feature invariance. Experimental results show that, compared with state-of-the-art methods, our method performs better and alleviates the low recognition accuracy caused by the difference between the test and training datasets.
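The decision-level fusion step described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the function name, the score-normalisation choice, and the weighting parameter `alpha` are all assumptions.

```python
import numpy as np

def decision_level_fusion(scores_2dpca, scores_cnn, alpha=0.5):
    """Fuse per-identity similarity scores from two recognisers.

    Each input is a 1-D array of per-identity scores (one from the
    pixel/2DPCA branch, one from the CNN branch). Scores are scale-
    normalised so neither branch dominates, then combined by a
    weighted sum; the fused decision is the argmax.
    """
    s1 = scores_2dpca / (np.linalg.norm(scores_2dpca) + 1e-12)
    s2 = scores_cnn / (np.linalg.norm(scores_cnn) + 1e-12)
    fused = alpha * s1 + (1 - alpha) * s2
    return int(np.argmax(fused)), fused
```

With `alpha=0.5` the two branches contribute equally; in practice the weight would be tuned on a validation set.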
Identification based on biological characteristics has been a hot topic in the recent past. However, numerous spoofing techniques have emerged alongside the rising sophistication of recognition technology, especially in face detection and recognition. To address this problem, more robust and accurate face spoofing detection schemes have been put forward. Convolutional neural networks (CNNs) have recently demonstrated extraordinary success in face liveness detection. In this study, an effective face anti-spoofing detection method based on a CNN and rotation-invariant local binary patterns (RI-LBP) is proposed. First, the authors use the CNN to extract deep features and RI-LBP to extract colour texture features. In addition, principal component analysis is employed to reduce the dimensionality of the deep features. Moreover, the two feature types are fused before being fed to a support vector machine (SVM). Finally, the SVM classifier is adopted to distinguish genuine faces from fake faces. The authors conducted extensive experiments to obtain a scheme with better generalisation capability for face anti-spoofing detection. The results indicate that the proposed approach achieves stronger generalisation than other state-of-the-art approaches in both intra-database and cross-database tests.
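The rotation-invariant LBP descriptor used above can be illustrated with a small numpy-only sketch. This is one standard formulation (8 neighbours, invariance via the minimum over circular bit rotations); the paper's exact radius, neighbour count, and colour-channel handling are not specified here.

```python
import numpy as np

def ri_lbp_codes(img):
    """Rotation-invariant LBP codes for the interior pixels of a
    grayscale image.

    Each pixel's 8 neighbours are thresholded against the centre;
    the resulting 8-bit pattern is mapped to the minimum value over
    all circular bit rotations, which makes the code invariant to
    image rotation.
    """
    # Neighbour offsets in circular order around the centre pixel.
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = img.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            bits = [1 if img[i + di, j + dj] >= img[i, j] else 0
                    for di, dj in offs]
            # Minimum over all 8 circular rotations of the bit string.
            codes[i - 1, j - 1] = min(
                sum(b << k for k, b in enumerate(bits[r:] + bits[:r]))
                for r in range(8))
    return codes
```

A histogram of these codes is what would typically be fed, alongside the PCA-reduced deep features, into the SVM.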
Medical image quality requirements have become increasingly stringent with recent developments in medical technology. To meet clinical diagnosis needs, an effective medical image enhancement method based on convolutional neural networks (CNNs) and frequency band broadening (FBB) is proposed. The curvelet transform is used to process medical data by obtaining the curvelet coefficients in each scale and direction, and generalised cross-validation is implemented to select the optimal threshold for denoising. Meanwhile, a cycle spinning scheme is used to remove visible ringing artefacts along the edges of medical images. Then, FBB and a new CNN model based on the retinex model are used to improve the resolution of the processed image. Finally, pixel-level fusion is performed between the two enhanced medical images produced by the CNN and FBB. In the authors' study, 50 groups of medical magnetic resonance imaging, X-ray, and computed tomography images in total were studied. The experimental results indicate that the final enhanced image produced by the proposed method outperforms those of other methods. The resolution and the edge details of the processed image are significantly enhanced, providing a more effective and accurate basis for medical workers to diagnose diseases.
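The final pixel-level fusion step can be sketched very simply. A weighted average is one common pixel-level fusion rule; the paper's exact rule is not given here, so the function name and the blending weight `w` are illustrative assumptions.

```python
import numpy as np

def pixel_level_fusion(img_cnn, img_fbb, w=0.5):
    """Fuse two enhanced versions of the same image pixel by pixel.

    `img_cnn` and `img_fbb` are same-shaped 8-bit images; each output
    pixel is the weighted average of the two inputs, clipped back to
    the valid [0, 255] range.
    """
    fused = w * img_cnn.astype(np.float64) + (1 - w) * img_fbb.astype(np.float64)
    return np.clip(fused, 0, 255).astype(np.uint8)
```

More sophisticated rules (e.g. per-pixel weights driven by local contrast) follow the same shape: compute weights, blend, clip.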
In order to solve the problem that face recognition in complex environments is vulnerable to illumination change, object rotation, occlusion, and so on, which leads to imprecise target positioning, a face recognition algorithm with multi-feature fusion is proposed. This study presents a new robust face-matching method named SR-CNN, combining the rotation-invariant texture feature (RITF) vector, the scale-invariant feature transform (SIFT) vector, and a convolutional neural network (CNN). Furthermore, a graphics processing unit (GPU) is used to parallelize the model for optimal computational performance. The Labeled Faces in the Wild (LFW) database and a self-collected face database were selected for the experiments. The results show that the true positive rate is improved by 10.97-13.24% and the acceleration ratio (the ratio between central processing unit (CPU) operation time and GPU time) is 5-6 for the LFW face database. For the self-collected database, the true positive rate increased by 12.65-15.31%, and the acceleration ratio improved to a factor of 6-7.
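One simple way to combine heterogeneous descriptors such as RITF, SIFT, and CNN vectors for matching is to L2-normalise each descriptor before concatenation and then compare faces by cosine similarity. This is a generic sketch of multi-feature matching, not the SR-CNN fusion rule itself; the function name and equal weighting are assumptions.

```python
import numpy as np

def fused_match_score(feats_a, feats_b):
    """Cosine similarity between two faces, each described by a list
    of feature vectors (e.g. RITF, SIFT, and CNN descriptors).

    Each descriptor is L2-normalised before concatenation so that no
    single descriptor dominates the fused vector purely by scale.
    """
    def l2(v):
        v = np.asarray(v, dtype=float)
        return v / (np.linalg.norm(v) + 1e-12)

    a = np.concatenate([l2(f) for f in feats_a])
    b = np.concatenate([l2(f) for f in feats_b])
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```

A match is then declared when the score exceeds a threshold tuned on a validation set.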
The convolutional neural network (CNN) has made great strides in voiceprint recognition, but it needs a huge number of data samples to train a deep neural network. In practice, it is difficult to obtain a large number of training samples, and the network cannot reach a good convergence state on a limited dataset. To solve this problem, a new method using a deep migration hybrid model is put forward, which makes it easier to realize voiceprint recognition with small samples. First, transfer learning is used to transfer a network trained on a large voiceprint dataset to our limited voiceprint dataset for further training, with the fully connected layers of the pre-trained model replaced by restricted Boltzmann machine layers. Second, data augmentation is adopted to increase the number of voiceprint samples. Finally, we introduce a fast batch normalization algorithm to improve the speed of network convergence and shorten training time. Our new voiceprint recognition approach uses the TLCNN-RBM (convolutional neural network mixed with a restricted Boltzmann machine, based on transfer learning) model. This deep migration hybrid model achieves an average accuracy of over 97%, which is higher than that of either CNN or the TL-CNN network (convolutional neural network based on transfer learning). Thus, an effective method for voiceprint recognition with small samples is provided.
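The data augmentation step for audio can be sketched with two standard waveform-level transforms, additive noise and a random time shift. The paper's exact augmentation pipeline is not specified here; the function name and parameters are illustrative.

```python
import numpy as np

def augment_waveform(x, rng, noise_db=-20.0, max_shift=160):
    """Produce an augmented copy of a 1-D waveform.

    Adds Gaussian noise scaled relative to the signal's standard
    deviation (`noise_db` decibels below it), then applies a random
    circular time shift of up to `max_shift` samples.
    """
    gain = 10 ** (noise_db / 20.0)             # dB -> linear scale
    noise = rng.standard_normal(x.shape) * np.std(x) * gain
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(x + noise, shift)
```

Running this several times per utterance with a fresh random state multiplies the effective size of a small voiceprint dataset.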
Identifying drug-protein interactions (DPIs) is crucial in drug discovery, and a number of machine learning methods have been developed to predict DPIs. Existing methods usually use unrealistic data sets with hidden bias, which limits the accuracy of virtual screening. Meanwhile, most DPI prediction methods pay more attention to molecular representation while lacking effective research on protein representation and on high-level associations between different instances. To this end, we present a novel structure-aware multimodal deep DPI prediction model, STAMP-DPI, trained on a curated industry-scale benchmark data set. We built a high-quality benchmark data set named GalaxyDB for DPI prediction; this industry-scale data set, along with an unbiased training procedure, results in a more robust benchmark study. For informative protein representation, we constructed a structure-aware graph neural network method from the protein sequence by combining predicted contact maps with graph neural networks. Through further integration of structure-based representations and high-level pretrained embeddings for molecules and proteins, our model effectively captures the feature representation of the interactions between them. As a result, STAMP-DPI outperforms state-of-the-art DPI prediction methods, reducing mean square error (MSE) by 7.00% on the Davis data set and improving area under the curve (AUC) by 8.89% on the GalaxyDB data set. Moreover, our model is interpretable: its transformer-based interaction mechanism can accurately reveal the binding sites between molecules and proteins.
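Turning a predicted contact map into graph input for a GNN can be sketched as follows. This is a generic construction (threshold, symmetrise, add self-loops, symmetrically normalise as in standard GCNs), not necessarily STAMP-DPI's exact preprocessing; the threshold value is an assumption.

```python
import numpy as np

def contact_map_to_graph(contact_probs, threshold=0.5):
    """Build a residue graph from a predicted contact-probability matrix.

    Returns the binary adjacency A (symmetrised, with self-loops) and
    the symmetrically normalised adjacency D^{-1/2} A D^{-1/2} that a
    graph convolution layer would typically consume.
    """
    a = (np.asarray(contact_probs) >= threshold).astype(float)
    a = np.maximum(a, a.T)        # contacts are undirected
    np.fill_diagonal(a, 1.0)      # self-loops: each residue sees itself
    d = 1.0 / np.sqrt(a.sum(axis=1))
    a_hat = a * d[:, None] * d[None, :]
    return a, a_hat
```

Node features (e.g. per-residue embeddings from a pretrained protein language model) would then be propagated with `a_hat` inside the GNN.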
Small-object detection is a basic and challenging problem in computer vision. It is widely used in pedestrian detection, traffic sign detection, and other fields. This paper proposes a deep learning small-object detection method based on image super-resolution to improve the speed and accuracy of small-object detection. First, we add a feature texture transfer (FTT) module at the input end to improve the image resolution there and to remove noise from the image. Then, in the backbone network, based on the Darknet53 framework, we use dense blocks in place of residual blocks to reduce the number of network parameters and avoid unnecessary computation. Next, to make full use of the features of small targets in the image, the neck uses a combination of SPPnet and PANnet to complete the multi-scale feature fusion. Finally, the imbalance between image background and foreground is addressed by adding a foreground-background balance loss function to the YOLOv4 loss. Experiments conducted on our self-built dataset show that the proposed method achieves higher accuracy and speed than currently available small-target detection methods.

INDEX TERMS Small-object detection, image super-resolution, dense block, foreground and background, balance loss function, multi-scale feature fusion
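The foreground-background balance idea can be illustrated with a class-balanced binary cross-entropy: foreground and background terms are each averaged within their own class and then summed, so the far more numerous background pixels cannot swamp the few foreground ones. The paper's exact loss is not given here; this is one standard balanced form with an assumed function name.

```python
import numpy as np

def balanced_bce(p, y, eps=1e-7):
    """Class-balanced binary cross-entropy.

    `p` holds predicted foreground probabilities, `y` the binary
    labels (1 = foreground, 0 = background). Each class contributes
    the mean of its own per-pixel losses, regardless of how many
    pixels it contains.
    """
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)
    fg = np.asarray(y) == 1
    bg = ~fg
    loss_fg = -np.log(p[fg]).mean() if fg.any() else 0.0
    loss_bg = -np.log(1 - p[bg]).mean() if bg.any() else 0.0
    return loss_fg + loss_bg
```

In a YOLO-style detector this term would replace or reweight the objectness part of the total loss.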