2014
DOI: 10.1007/s00530-014-0424-7

Context-based environmental audio event recognition for scene understanding

Abstract: To the best of our knowledge, this is the first work that models event correlations as scene context for robust audio event detection from complex and noisy environments. Note that according to the recent report, the mean accuracy for the acoustic scene classification task by human listeners is only around 71% on the data collected in office environments from the DCASE dataset. None of the existing methods performs well on all scene categories and the average accuracy of the best performances of the recent 11…

Cited by 6 publications (2 citation statements)
References 37 publications
“…Automatic Audio Captioning (AAC) is an inter-modal translation task, where the objective is to generate a textual description for a corresponding input audio signal [1]. Audio captioning is a critical step towards machine intelligence and many applications in daily scenarios, such as audio retrieval [2], scene understanding [3] [4], applications for the hearing impaired patients [5], detailed audio surveillance etc. Unlike an Automatic Speech Recognition (ASR) task, the output is a description rather than a transcription of the contents within the audio sample.…”
Section: Introduction (mentioning)
confidence: 99%
“…Automatic Audio Captioning (AAC) is an inter-modal translation task, where the objective is to generate a textual description for a corresponding input audio signal [2]. Audio captioning is a critical step towards machine intelligence with multiple applications in daily scenarios, ranging from audio retrieval [3], scene understanding [4,5] to assist the hearing impaired [6] and audio surveillance. Unlike an Automatic Speech Recognition (ASR) task, the output is a description rather than a transcription of the linguistic content in the audio sample.…”
Section: Introduction (mentioning)
confidence: 99%