In recent years, biometric recognition has attracted the attention of many researchers; among biometric traits, the human ear is a unique and stable feature with significant advantages for verifying personal identity. In the Internet era, systems with low computational cost and good real-time performance are increasingly preferred, yet most existing ear recognition methods rely on network models with large numbers of parameters, incurring a large memory footprint and high computational overhead. This paper proposes an efficient and lightweight ear recognition method (ELERNet) based on MobileNet V2. Dynamic convolution decomposition is introduced into the MobileNet V2 backbone to enhance the representation of ear features, and a coordinate attention mechanism then aggregates the spatial features of ear images to localize ear features more accurately. We conducted experiments on the AWE and EarVN1.0 ear datasets. Compared with the MobileNet V2 baseline, the recognition accuracy of our method is significantly improved: using fewer hardware resources, ELERNet achieves 83.52% and 96.10% Rank-1 (R1) recognition accuracy, respectively, outperforming other models. Finally, we provide a visual interpretation using Grad-CAM, which shows that our method learns specific, discriminative features in ear images.
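To make the attention component concrete, the following PyTorch sketch implements a generic coordinate-attention block of the kind the abstract says is combined with MobileNet V2. It is a minimal sketch of the published coordinate-attention design, not the paper's exact module; the class name, reduction ratio, and activation choices are assumptions.

```python
# Minimal sketch of a coordinate-attention block (assumed PyTorch);
# names and hyperparameters are illustrative, not the paper's values.
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Pools features along H and W separately, then re-weights the map,
    so the block keeps direction-aware positional information."""
    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Direction-aware pooling: one descriptor per row and per column.
        x_h = x.mean(dim=3, keepdim=True)                      # (b, c, h, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)  # (b, c, w, 1)
        y = self.act(self.bn(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                       # (b, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))   # (b, c, 1, w)
        return x * a_h * a_w  # broadcast the two attention maps over the input
```

In an ELERNet-style design, a block like this would sit after selected inverted-residual stages of the backbone, which is consistent with the abstract's claim of localizing ear features more accurately at small extra cost.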
Facial expression recognition has become a powerful tool for conveying human emotions and intentions and is widely used in areas such as assisted driving and intelligent medical care. Given the limited computing power of current hardware and the real-time requirements of application scenarios, this paper proposes a high-performance, lightweight framework that completes facial expression recognition tasks in real time at low hardware cost. The framework first extracts features with a dual-channel structure that combines RepVGG and MobileNetV2 branches; the fused features are then fed into a MobileViT block for global feature modelling. Finally, the position vectors of a capsule network replace the global-pooling output, preserving the spatial relationships of salient features and improving classification. Unlike mainstream facial expression recognition algorithms, which struggle to achieve good classification results under low-complexity constraints, the model delivers a significant accuracy improvement while remaining lightweight: with only 294.60M FLOPs and 0.95M parameters, it achieves 97.53% accuracy on the KDEF dataset and 85.56% on RAF-DB, demonstrating the strength of the algorithm.
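To illustrate the dual-channel feature extraction described above, here is a minimal PyTorch sketch of two parallel branches, a RepVGG-style block and a MobileNetV2-style inverted residual, fused by concatenation before any further (e.g. MobileViT) processing. All widths, class names, and the fusion rule are illustrative assumptions, not values from the paper.

```python
# Sketch of a dual-channel (RepVGG + MobileNetV2) feature extractor;
# assumed PyTorch, with illustrative names and channel widths.
import torch
import torch.nn as nn

class RepVGGStyleBranch(nn.Module):
    """Training-time RepVGG-style block: parallel 3x3, 1x1 and identity paths."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv3 = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels))
        self.conv1 = nn.Sequential(
            nn.Conv2d(channels, channels, 1, bias=False),
            nn.BatchNorm2d(channels))
        self.identity = nn.BatchNorm2d(channels)
        self.act = nn.ReLU()

    def forward(self, x):
        # At inference time RepVGG folds these three paths into one 3x3 conv.
        return self.act(self.conv3(x) + self.conv1(x) + self.identity(x))

class InvertedResidualBranch(nn.Module):
    """MobileNetV2-style branch: 1x1 expand -> 3x3 depthwise -> 1x1 project."""
    def __init__(self, channels: int, expand: int = 4):
        super().__init__()
        mid = channels * expand
        self.block = nn.Sequential(
            nn.Conv2d(channels, mid, 1, bias=False), nn.BatchNorm2d(mid), nn.ReLU6(),
            nn.Conv2d(mid, mid, 3, padding=1, groups=mid, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU6(),
            nn.Conv2d(mid, channels, 1, bias=False), nn.BatchNorm2d(channels))

    def forward(self, x):
        return x + self.block(x)

class DualChannelExtractor(nn.Module):
    """Runs both branches in parallel and fuses them by concatenation."""
    def __init__(self, channels: int):
        super().__init__()
        self.rep = RepVGGStyleBranch(channels)
        self.mbv2 = InvertedResidualBranch(channels)
        self.fuse = nn.Conv2d(2 * channels, channels, 1)  # back to `channels` maps

    def forward(self, x):
        return self.fuse(torch.cat([self.rep(x), self.mbv2(x)], dim=1))
```

The design intuition is that the RepVGG branch keeps strong local texture cues (and re-parameterizes to a single cheap conv at inference) while the inverted-residual branch stays parameter-efficient; their fused output is what a MobileViT-style block would then model globally.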
Because score distributions differ across trials, the performance of a speaker verification system is seriously degraded if raw scores are compared against a single unified threshold; the scores must therefore be normalized. To address the shortcomings of existing score normalization methods, we propose a speaker verification system based on log-likelihood normalization (LLN). Without requiring a priori knowledge, LLN increases the separation between the scores of target and non-target speaker models, reducing the overlap between "same-speaker" and "different-speaker" trials for the same test speech and enabling better discrimination and decision making. Experiments show that LLN is an effective score normalization method.
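The abstract does not reproduce the paper's exact LLN formula, so the snippet below is only a hedged sketch of the general idea: the raw log-likelihood scores a test utterance obtains against a set of speaker models are renormalized by their log-sum, so scores from different trials land on a comparable scale and a single decision threshold can be applied. The function name and the toy cohort are assumptions.

```python
# Hedged sketch of log-likelihood-style score normalization for speaker
# verification; this illustrates the general idea, not the paper's exact LLN.
import numpy as np

def lln_normalize(raw_scores: np.ndarray) -> np.ndarray:
    """raw_scores[i] = log-likelihood of one test utterance under model i.

    Each score is shifted by the log-sum-exp over all competing models,
    which makes scores from different trials comparable under one threshold.
    """
    m = raw_scores.max()  # subtract the max for numerical stability
    log_denominator = m + np.log(np.exp(raw_scores - m).sum())
    return raw_scores - log_denominator

# Toy example: one test utterance scored against four enrolled models
# (hypothetical numbers). After normalization the target model (index 0)
# separates more cleanly from the non-target models.
scores = np.array([-10.2, -14.8, -15.1, -13.9])
print(lln_normalize(scores))
```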