2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2019
DOI: 10.1109/waspaa.2019.8937271

City Classification from Multiple Real-World Sound Scenes

Abstract: The majority of sound scene analysis work focuses on one of two clearly defined tasks: acoustic scene classification or sound event detection. Whilst this separation of tasks is useful for problem definition, it inherently ignores some subtleties of the real world, in particular how humans vary in how they describe a scene. Some will describe the weather and features within it, others will use a holistic descriptor like 'park', and others still will use unique identifiers such as cities or names. In this pape…

Cited by 8 publications (18 citation statements)
References 12 publications
“…Sound scene geotagging is a relatively novel topic of research. It means to correctly identify the geographical position where a recording was captured [1] and as such it can also be described as either audio geotagging or as a more specific task such as audio-based city classification. Subject to the precision of the dataset annotations, it could be formulated as a coarse target problem (e.g.…”
Section: Introduction
mentioning
confidence: 99%
“…Sound scene geotagging is a task with many future real-world applications, such as automatically knowing a precise geo-location from the background sounds of an emergency call that could potentially decrease the response time [2] or it could position a person in a location with no cameras at a specific time for criminal investigations. However, the current state-of-the-art in audio geotagging that uses multi-task learning, achieves only 56% accuracy [1] on city classification.…”
Section: Introduction
mentioning
confidence: 99%
“…City classification accuracy by data augmentation method with different CNN models. Accuracy (%) obtained with the multi-task CNN from [1]: cyclic, drop, stretch, cyclic+stretch, all, benchmark…”
mentioning
confidence: 99%