ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp40776.2020.9054712
Metric Learning with Background Noise Class for Few-Shot Detection of Rare Sound Events

Abstract: Few-shot learning systems for sound event recognition have gained interest since they require only a few examples to adapt to new target classes without fine-tuning. However, such systems have only been applied to chunks of sound for classification or verification. In this paper, we aim to achieve few-shot detection of rare sound events from long query sequences that contain not only the target events but also other events and background noise. Therefore, it is required to prevent false positive reactions to bo…
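
The abstract outlines metric learning with an explicit background noise class, so that frames of a long query sequence can be matched against a few support examples without reacting to noise or unrelated events. A minimal, prototypical-network-style sketch of that idea is shown below; the encoder architecture, the helper names (Encoder, class_prototypes, frame_scores), and the distance-softmax scoring are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

# Illustrative embedding network: log-mel patches -> fixed-dimensional embeddings.
# The actual encoder used in the paper may differ; this is only a sketch.
class Encoder(torch.nn.Module):
    def __init__(self, emb_dim=64):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Conv2d(1, 32, kernel_size=3, padding=1), torch.nn.ReLU(),
            torch.nn.AdaptiveAvgPool2d(1),
            torch.nn.Flatten(),
            torch.nn.Linear(32, emb_dim),
        )

    def forward(self, x):  # x: (batch, 1, n_mels, n_frames)
        return self.net(x)

def class_prototypes(encoder, support, labels, n_classes):
    """Mean embedding per class; one class index is reserved for background noise."""
    z = encoder(support)
    return torch.stack([z[labels == c].mean(dim=0) for c in range(n_classes)])

def frame_scores(encoder, query_frames, protos):
    """Distance-based posteriors over the target classes plus the background class."""
    zq = encoder(query_frames)            # (n_frames, emb_dim)
    dists = torch.cdist(zq, protos)       # (n_frames, n_classes)
    return F.softmax(-dists, dim=-1)
```

At detection time, frames whose highest score falls on the background class would be discarded, which is one way the false-positive problem described in the abstract could be handled.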

Cited by 19 publications (9 citation statements). References 28 publications.
“…Five different few-shot learning methods [21, 31-34], improved with an attentional similarity module to detect transient events, are applied to sound event recognition in [35]. The effectiveness of few-shot techniques in sound event detection has led to the development of strategies to extend their application to increasingly challenging tasks, such as multi-label classification [36], rare sound event detection [37], continual learning [38], unsupervised and semi-supervised learning approaches [39], and sound localization [40].…”
Section: Related Work
Citation type: mentioning (confidence: 99%)
“…information extraction [211], machine translation [100], charge prediction [178], sequence labeling [349] Audio&Speech audio/speech/sound classification [350], [351], [352], [353], [354], [355], text-to-speech [356], [357], [358], [359], acoustic/sound event detection [360], [361], [362], speech generation [350], [363], keyword/command recognition [364], keyword spotting [365], human-fall detection [366], speaker recognition [367],…”
Section: Applications
Citation type: mentioning (confidence: 99%)
“…Recently, [11] proposes two pretext tasks, namely, estimating the time distance between pairs of audio segments, and reconstructing a spectrogram patch from past and future patches. In the last few years, self-supervised learning methods using contrastive losses have gained increasing attention, not only for images [9,10], but also for speech [13,14,15], and sound events [16]. In the context of contrastive learning, a recent trend is to learn representations by contrasting different versions or views of the same data example, computed via data augmentation.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
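
The last citation statement above refers to contrastive learning over different augmented views of the same example. A minimal SimCLR-style NT-Xent sketch is given below purely for orientation; the temperature value and the choice of audio augmentations are assumptions, and each cited work uses its own loss variant.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.1):
    """Contrastive loss between two augmented views of the same batch of clips.

    z1, z2: (batch, dim) embeddings of view 1 and view 2, e.g. two different
    spectrogram augmentations of the same audio clip (illustrative only).
    """
    batch = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2B, D), unit norm
    sim = z @ z.t() / temperature                        # cosine-similarity logits
    sim.fill_diagonal_(float("-inf"))                    # a view is not its own positive
    # The positive for row i is the other augmented view of the same clip.
    targets = torch.cat([torch.arange(batch) + batch, torch.arange(batch)])
    return F.cross_entropy(sim, targets)
```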