City Classification from Multiple Real-World Sound Scenes

Bear, Helen L.; Heittola, Toni; Mesaros, Annamaria; Benetos, Emmanouil; Virtanen, Tuomas

doi:10.1109/waspaa.2019.8937271

Cited by 8 publications

(18 citation statements)

References 12 publications

“…Sound scene geotagging is a relatively novel topic of research. It means to correctly identify the geographical position where a recording was captured [1] and as such it can also be described as either audio geotagging or as a more specific task such as audio-based city classification. Subject to the precision of the dataset annotations, it could be formulated as a coarse target problem (e.g.…”

Section: Introduction (mentioning)

confidence: 99%

“…Sound scene geotagging is a task with many future real-world applications, such as automatically knowing a precise geo-location from the background sounds of an emergency call that could potentially decrease the response time [2] or it could position a person in a location with no cameras at a specific time for criminal investigations. However, the current state-of-the-art in audio geotagging that uses multi-task learning, achieves only 56% accuracy [1] on city classification.…”

Section: Introduction (mentioning)

confidence: 99%

“…Prior to the work in [1], there was little investigation of audio geotagging, and to the best of our knowledge, not as a classification task. Examples by Elizalde et al [6] and Kumar et al [7] used sound scenes generated with sound events not specific to the locations.…”

Section: Introduction (mentioning)

confidence: 99%

“…In [1], the authors use a dataset labelled for both audio geotagging (in a city level) and acoustic scene classification. As such, the annotations function as a many-to-many connection between the city and scene classes.…”

Section: Introduction (mentioning)

confidence: 99%

An evaluation of data augmentation methods for sound scene geotagging

Bear

Morfi

Benetos

2022

Preprint

Self Cite

Sound scene geotagging is a new topic of research which has evolved from acoustic scene classification. It is motivated by the idea of audio surveillance. Not content with only describing a scene in a recording, a machine that can locate where the recording was captured would be of use to many. In this paper we explore a series of common audio data augmentation methods to evaluate which best improves the accuracy of audio geotagging classifiers. Our work improves on the state-of-the-art city geotagging method by 23% in terms of classification accuracy.

An evaluation of data augmentation methods for sound scene geotagging

Bear

Morfi

Benetos

2022

Preprint

Self Cite

“…City classification accuracy by data augmentation method with different CNN models. Accuracy (%) obtained with the multi-task CNN from [1]: cyclic, drop, stretch, cyclic+stretch, all, benchmark…”

mentioning

confidence: 99%

An Evaluation of Data Augmentation Methods for Sound Scene Geotagging

Bear

Morfi

Benetos

2021

Interspeech 2021

Self Cite

Sound scene geotagging is a new topic of research which has evolved from acoustic scene classification. It is motivated by the idea of audio surveillance. Not content with only describing a scene in a recording, a machine which can locate where the recording was captured would be of use to many. In this paper we explore a series of common audio data augmentation methods to evaluate which best improves the accuracy of audio geotagging classifiers. Our work improves on the state-of-the-art city geotagging method by 23% in terms of classification accuracy.
