2011
DOI: 10.1155/2011/485738

Acoustic Event Detection Based on Feature-Level Fusion of Audio and Video Modalities

Abstract: Acoustic event detection (AED) aims at determining the identity of sounds and their temporal position in audio signals. When applied to spontaneously generated acoustic events, AED based only on audio information shows a large number of errors, which are mostly due to temporal overlaps. Actually, temporal overlaps accounted for more than 70% of errors in the real-world interactive seminar recordings used in CLEAR 2007 evaluations. In this paper, we improve the recognition rate of acoustic events using informati…

Cited by 26 publications (22 citation statements)
References 19 publications (22 reference statements)
“…Finally, our work bears some resemblances to those exploring additional data sources (e.g. multiple channels [44], multiple modalities [45]) to augment the nonspeech audio event analysis. However, their main goal is to compensate for low signal-to-noise-ratio and overlapping signals.…”
Section: Related Work (mentioning)
confidence: 89%
“…We used audio events of three independent datasets, including UPC-TALP [45], Freiburg-106 [16], and NAR [17], as the target nonspeech signals. These datasets are recorded in different environments, and hence, differ in reverberation characteristics.…”
Section: B Datasets (mentioning)
confidence: 99%
“…Another area that has attracted research is focussed on the detection and classification of sounds in a meeting room environment [35][36][37][38][39]. It is also one of the only areas of SER research that has a standardised database for comparing the performance of competing systems.…”
Section: Applications (mentioning)
confidence: 99%
“…In the event detection stage, overlapping events can be detected with multiple iterative detection passes and by excluding already detected events from the following detection iterations until the desired amount of overlapping events have been reached [15]. In addition to these approaches, multiple audio signals and sound source localization methods along with video based methods can be used to better handle overlapping event in the detection [16].…”
Section: Introduction (mentioning)
confidence: 99%