2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2012.6287923

Audio event detection from acoustic unit occurrence patterns

Abstract: In most real-world audio recordings, we encounter several types of audio events. In this paper, we develop a technique for detecting signature audio events that is based on identifying patterns of occurrence of automatically learned atomic units of sound, which we call Acoustic Unit Descriptors or AUDs. Experiments show that the methodology works well for detecting individual events and for locating their boundaries in complex recordings.
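
As a rough illustration of the pipeline the abstract describes, the sketch below learns acoustic units without labels, transcribes each clip into a unit sequence, and turns the unit occurrence pattern into a fixed-length descriptor for an event detector. The paper learns AUDs as HMMs through an iterative unsupervised procedure; here k-means over MFCC-like frame features stands in for unit learning and a logistic-regression detector stands in for the event model, so all names and choices below are illustrative rather than the authors' implementation.

```python
# Minimal sketch of an AUD-style detection pipeline (illustrative only).
# Assumes each clip is already an (n_frames x n_dims) array of MFCC-like
# features; k-means replaces the paper's HMM-based unit learning.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

N_UNITS = 64  # number of acoustic units (AUDs); a free parameter

def learn_units(all_frames):
    """Learn acoustic units from unlabeled frames pooled over the training clips."""
    return KMeans(n_clusters=N_UNITS, n_init=10, random_state=0).fit(all_frames)

def transcribe(units, clip_frames):
    """Map a clip's frames to a sequence of unit indices (its AUD 'transcript')."""
    return units.predict(clip_frames)

def occurrence_pattern(transcript):
    """Normalized histogram of unit occurrences: the fixed-length clip descriptor."""
    counts = np.bincount(transcript, minlength=N_UNITS).astype(float)
    return counts / max(counts.sum(), 1.0)

# Usage sketch: `train_clips` is a list of (n_frames x n_dims) arrays and
# `labels` marks whether each clip contains the target event.
# units = learn_units(np.vstack(train_clips))
# X = np.array([occurrence_pattern(transcribe(units, c)) for c in train_clips])
# detector = LogisticRegression(max_iter=1000).fit(X, labels)
```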

Cited by 35 publications (30 citation statements) · References 10 publications

“…Number of codebooks and words per codebook: Although increasing the number of codebooks (N ∈ [1,3,9]) or the number of words extracted per codebook (K ∈ [1,3]) increases the number of combinations and should have the same effect as increasing the size of the codebooks, we observe that recall tends to increase while precision drops slightly. We believe that using several words per codebook, i.e., K = 3, drastically improves the description capabilities of audio words.…”
Section: B. Study on the Parameters (mentioning)
confidence: 87%
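
The multi-codebook setup the citing authors study (N codebooks, K words per codebook) can be sketched as follows: each frame votes for its K nearest words in every codebook, and the votes are pooled into one bag-of-audio-words histogram. This is a minimal reading of their description, assuming k-means codebooks over MFCC-like frames; the function names and subsampling details are hypothetical.

```python
# Sketch of a multi-codebook bag-of-audio-words descriptor (illustrative only).
import numpy as np
from sklearn.cluster import KMeans

def train_codebooks(frames, n_codebooks=3, codebook_size=256, seed=0):
    """Train several codebooks, each on a different random subset of the frames."""
    rng = np.random.default_rng(seed)
    books = []
    for i in range(n_codebooks):
        idx = rng.choice(len(frames), size=min(len(frames), 20_000), replace=False)
        books.append(KMeans(n_clusters=codebook_size, n_init=5, random_state=i).fit(frames[idx]))
    return books

def bag_of_audio_words(clip_frames, books, k_words=3):
    """Each frame contributes its k_words nearest words in every codebook."""
    hist = np.zeros(sum(b.n_clusters for b in books))
    offset = 0
    for book in books:
        dists = book.transform(clip_frames)                # (n_frames, codebook_size) distances
        nearest = np.argsort(dists, axis=1)[:, :k_words]   # K closest words per frame
        np.add.at(hist, offset + nearest.ravel(), 1.0)
        offset += book.n_clusters
    return hist / max(hist.sum(), 1.0)
```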
“…The approach followed in [9] is the closest to ours. In this article, the 2011 TRECVID Multimedia Event Detection (MED) data [10] is used to characterize user-generated video excerpts coming from the Internet and to detect audio events for which annotations are provided.…”
Section: Background on Audio Words (mentioning)
confidence: 99%
“…It is a general framework for obtaining fixed-length representations of audio clips and can be applied to a variety of low-level audio features such as MFCCs [24], autoencoder-based features [3] and normalized spectral features [21], to name a few. An alternate approach to obtaining bags of words is used in [18]: sound recordings are first decomposed into a sequence of basic sound units called "Acoustic Unit Descriptors" (AUDs), which are themselves learned in an unsupervised manner. Bags of words are then obtained as bags of AUDs.…”
Section: Related Work (mentioning)
confidence: 99%
“…Ensemble-based learning approaches such as random forests and density forests [6] have been successfully employed for several tasks in the audio domain, such as emotion recognition [19,18], paralinguistic event detection [1] and audio event detection [11]. In this work, a segmentation forest is utilised as a special case of the random forest approach.…”
Section: BIC-based Speaker Segmentation Using Segmentation Forest (mentioning)
confidence: 99%
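
For the forest-based detectors mentioned in this last statement, a minimal example would be to train an off-the-shelf random forest on the same occurrence-pattern descriptors sketched earlier. This is only a generic stand-in, not the segmentation-forest method of the citing paper; the function names are hypothetical.

```python
# Generic random-forest event detector over bag-of-units descriptors
# (an off-the-shelf stand-in, not the cited segmentation-forest method).
from sklearn.ensemble import RandomForestClassifier

def train_forest_detector(descriptors, labels, n_trees=200):
    """Fit a random forest on fixed-length clip descriptors (e.g. AUD histograms)."""
    return RandomForestClassifier(n_estimators=n_trees, random_state=0).fit(descriptors, labels)

def event_scores(forest, descriptors):
    """Per-clip posterior probability of the target event."""
    return forest.predict_proba(descriptors)[:, 1]
```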