2019
DOI: 10.3390/acoustics1020023
An Enhanced Temporal Feature Integration Method for Environmental Sound Recognition

Abstract: Temporal feature integration refers to a set of strategies that attempt to capture the information conveyed in the temporal evolution of the signal. It has been extensively applied in the context of semantic audio, showing performance improvements over standard frame-based audio classification methods. This paper investigates the potential of an enhanced temporal feature integration method to classify environmental sounds. The proposed method utilizes newly introduced integration functions that capture the…
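The abstract's notion of temporal feature integration can be illustrated with a minimal sketch: short-term descriptors are computed per frame, then summarized over a longer "texture window" with integration functions such as the mean and standard deviation. This is a generic illustration of the technique, not the paper's proposed integration functions (which are not reproduced here); all function names and parameter values below are assumptions for the example.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def short_term_features(frames):
    """Per-frame descriptors: RMS energy and zero-crossing rate."""
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    return np.column_stack([rms, zcr])           # shape: (n_frames, 2)

def integrate(features, win, hop):
    """Temporal integration: mean and std of each feature over a texture window."""
    out = []
    for start in range(0, len(features) - win + 1, hop):
        seg = features[start : start + win]
        out.append(np.concatenate([seg.mean(axis=0), seg.std(axis=0)]))
    return np.array(out)                         # shape: (n_windows, 2 * n_features)

# Example: 1 s of noise at 16 kHz, 32 ms frames with 16 ms hop,
# integrated over a single texture window spanning the whole clip.
rng = np.random.default_rng(0)
x = rng.standard_normal(16000)
frames = frame_signal(x, 512, 256)
st = short_term_features(frames)
tf = integrate(st, win=len(st), hop=len(st))
print(tf.shape)                                  # one 4-dimensional integrated vector
```

The integrated vector (rather than the raw frame-level features) is what a classifier would consume, which is what gives integration methods their edge over frame-based classification.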


Cited by 17 publications (18 citation statements). References 16 publications.
“…Context- and location-aware services can be combined with (multichannel) semantic processing to offer spatiotemporal sound mapping and pattern-related visualizations. Such feature summarization techniques are encountered in generic audio detection and classification tasks, including environmental sound recognition [14,49-59]. In this view, crowdsourced audio data can offer soundscape enhancement with multiple augmentation layers in favor of documentation, data-driven storytelling, and management.…”
Section: Related Work
confidence: 99%
“…In this view, crowdsourced audio data can offer soundscape enhancement with multiple augmentation layers in favor of documentation, data-driven storytelling, and management. The massive research progress in the domain has established multiple pattern recognition schemes and hierarchical semantic audio taxonomies to describe the sound fields associated with different social events [13-24,52-59]. Apart from the geographical- and time-related information that a mobile terminal can easily hold, environmental sounds and soundscapes can be classified, filtered, and highlighted based on the associated pattern classification taxonomies, various low-level audio descriptors, other semantic labels concerning the transmitted or perceived emotions, etc.…”
Section: Related Work
confidence: 99%
“…In this context, new audio recognition and semantic analysis techniques are deployed for General Audio Detection and Classification (GADC) tasks, which are very useful in many multidisciplinary domains [4-16]. Typical examples include speech recognition and perceptual enhancement [5-8], speaker indexing and diarization [14-19], voice/music detection and discrimination [1-4,9-13,20-22], information retrieval and genre classification of music [23,24], audio-driven alignment of multiple recordings [25,26], sound emotion recognition [27-29], and others [10,30-32]. Concerning the media production and broadcasting domain, audio and audio-driven segmentation allow for the implementation of prope...…”
Section: Introduction
confidence: 99%