2022
DOI: 10.1109/access.2022.3222495

Research on Scattering Transform of Urban Sound Events Detection Based on Self-Attention Mechanism

Abstract: Urban sound event detection can automatically preload relevant information for the robot to ensure that it can be competent for various scene activity tasks. Aiming at the limitations of timbre similarity and scene recognition limited by audio collection devices, a fusion model based on self-attention mechanism is proposed in this paper. The model consists of scattering transform and self-attention model. Scattering transform computes modulation spectrum coefficients of multiple orders, through cascades of wav…

Citations: cited by 3 publications (6 citation statements)
References: 53 publications (48 reference statements)
“…Cons: Unable to handle road topology structure information. Self-Attention Mechanism (Song et al., 2022): Pros: Weights aggregation of input at different positions, extracting important features. Cons: High computational complexity when dealing with large-scale road networks.…”
Section: Introduction (mentioning)
confidence: 99%
“…STFT, on the other hand, is limited by the trade-offs involved in time and frequency resolution [22]. Also, it is not stable against time-warping deformations [23]. Among the spectral features, the Mel frequency cepstrum (MFC) and the Mel frequency cepstral coefficients (MFCCs) have been used for rare event classification (e.g.…”
Section: Introduction (mentioning)
confidence: 99%
“…timbral structures such as attacks, frequency and amplitude modulations, and interference in musical chords. The coefficients of the wavelet transform, on the other hand, are calculated over larger window sizes thus allowing these larger structures to be captured without loss of information [23]. The accuracy of the sound classification task is found to improve with WSC as compared to STFT and MFCC [25].…”
Section: Introduction (mentioning)
confidence: 99%