Urban sound event detection can automatically preload relevant information for a robot so that it is prepared for tasks across a variety of scenes. To address the limitations that timbre similarity and audio collection devices impose on scene recognition, this paper proposes a fusion model based on the self-attention mechanism. The model consists of a scattering transform and a self-attention model. The scattering transform computes modulation spectrum coefficients of multiple orders through cascades of wavelet convolutions and modulus operators. Unlike Mel-scale Frequency Cepstral Coefficients (MFCC), it is learnable, and it can better restore the semantic features of sound scenes with similar timbres. The Transformer has proven highly effective in Natural Language Processing (NLP) thanks to its self-attention mechanism. In this paper, the self-attention mechanism of its encoder is used in the model, mainly to make the feature granularity consistent and thereby refine the features. In addition, the Focal Loss function is adopted to mitigate the problem of imbalanced sample distribution. The Google-Command and ESC-50 datasets are used to supplement the scene categories of the UrbanSound8K dataset. The parameters of the learnable filters that performed well on UrbanSound8K were preserved and fine-tuned on the other two datasets, which have less data and more target categories. The effect of slice duration on the model is also explored. Experimental results show that the model achieves better performance across a wide range of scene models.
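The role of Focal Loss in countering class imbalance can be illustrated with a minimal sketch. It assumes the standard multi-class form FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t); the abstract does not state the values of `gamma` and `alpha` or the exact variant used, so the defaults and function names below are illustrative assumptions only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Plain cross-entropy for one example, for comparison."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for one example.

    gamma and alpha are assumed defaults, not values from the paper.
    The factor (1 - p_t) ** gamma shrinks the loss of examples that
    are already classified confidently, so rare or hard classes
    contribute relatively more to the total.
    """
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct example versus a misclassified one (target class 0).
easy_logits = [5.0, 0.0, 0.0]
hard_logits = [0.0, 1.0, 0.0]
for name, logits in (("easy", easy_logits), ("hard", hard_logits)):
    print(name, cross_entropy(logits, 0), focal_loss(logits, 0))
```

With `gamma=0` and `alpha=1` the function reduces to ordinary cross-entropy, which is a convenient sanity check.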
To improve the accuracy of soil heavy metal content prediction, this paper proposes a dynamic neural network optimization model (DNNOM). The model is based on a radial basis function neural network (RBFNN). The weights and bias of the output layer of the RBFNN were generated using the adaptive dynamic genetic optimization algorithm (ADGOA), and the center point of the hidden layer of the RBFNN was determined using an efficient density peak clustering algorithm (EDPC). An adaptive variance measure (AVM) was then used to generate the width vector of the RBFNN hidden layer. The model was applied to the prediction of soil heavy metal content in six new urban areas in Wuhan. In a comparison with the support vector machine (SVM), the light gradient boosting machine (LightGBM), the RBFNN, and the genetic-algorithm-optimized radial basis function neural network (GA-RBFNN), the experimental results demonstrate that DNNOM predictions are closer to the real values than those of the other four models, and that its four error indicator values are also significantly lower, indicating higher prediction accuracy. In particular, compared with the RBFNN, the MAPE and SMAPE of DNNOM dropped by 3.98% and 3.9%, respectively.
INDEX TERMS: Dynamic neural network optimization model, soil heavy metal content prediction, radial basis function neural network, adaptive dynamic genetic optimization algorithm.
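The two error indicators named above, MAPE and SMAPE, can be sketched as follows. The abstract does not give their formulas, so this uses common textbook definitions as an assumption; SMAPE in particular has several variants, and the one here divides by the mean of the absolute true and predicted values. The sample numbers are made up for illustration and are not data from the study.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Undefined when a true value is zero, so that case raises an error.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        if y == 0:
            raise ValueError("MAPE is undefined for a true value of zero")
        total += abs(y - p) / abs(y)
    return 100.0 * total / len(y_true)


def smape(y_true, y_pred):
    """Symmetric MAPE, in percent (assumed variant, range 0 to 200)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        denom = (abs(y) + abs(p)) / 2.0
        # When both values are zero the prediction is exact: add nothing.
        if denom != 0:
            total += abs(y - p) / denom
    return 100.0 * total / len(y_true)


# Hypothetical measured and predicted concentrations (illustrative only).
measured = [100.0, 200.0]
predicted = [110.0, 180.0]
print(mape(measured, predicted), smape(measured, predicted))
```

Both functions return 0 for a perfect prediction, and lower values indicate predictions closer to the measured values.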
To address the problem that existing deep learning methods are not sufficiently accurate at detecting rice pests with changeable shapes or similar appearances, a self-attention feature fusion model for rice pest detection (SAFFPest) was proposed. The model was based on VarifocalNet. First, a deformable convolution module was added to the feature extraction network to improve its ability to extract features of pests with changeable shapes. Second, balanced features were obtained from multiple feature maps, and the self-attention mechanism was introduced to refine them in order to better restore the semantic information of pests with similar appearances. Third, group normalization replaced the batch normalization of the original model to reduce the impact of batch size on model training. The IP102 rice pest dataset was used to train and verify the model. The experimental results showed that the model can accurately detect nine kinds of rice pests, such as rice leaf rollers and rice leaf caterpillars. Compared with Faster R-CNN, RetinaNet, CP-FCOS, VFNet and BiFA-YOLO, the mean average precision of the model improved by 33.7%, 6.5%, 4.5%, 2.9% and 2%, respectively.
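Why group normalization reduces the dependence on batch size can be seen from a minimal sketch: its statistics are computed within a single sample, over groups of channels, so no other sample in the batch is involved. The function below is an illustrative assumption rather than the implementation used in the paper; it omits the learnable per-channel scale and shift, and the group count is arbitrary.

```python
import math


def group_norm(sample, num_groups, eps=1e-5):
    """Normalize ONE sample given as a list of channels (lists of floats).

    Channels are split into num_groups consecutive groups, and each group
    is normalized with its own mean and variance. Learnable scale and
    shift parameters are left out of this sketch.
    """
    num_channels = len(sample)
    if num_groups <= 0 or num_channels % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")
    size = num_channels // num_groups
    out = []
    for g in range(num_groups):
        group = sample[g * size:(g + 1) * size]
        values = [v for channel in group for v in channel]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        inv_std = 1.0 / math.sqrt(var + eps)
        out.extend([[(v - mean) * inv_std for v in channel] for channel in group])
    return out


# Four channels with two values each, split into two groups.
features = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
normalized = group_norm(features, num_groups=2)
print(normalized)
```

Because the function only ever sees one sample, its output is identical whether that sample is trained in a batch of 2 or of 64, which is the property the abstract relies on.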