2017 25th European Signal Processing Conference (EUSIPCO) 2017
DOI: 10.23919/eusipco.2017.8081508

Convolutional recurrent neural networks for bird audio detection

Abstract: Bird sounds possess distinctive spectral structure which may exhibit small shifts in spectrum depending on the bird species and environmental conditions. In this paper, we propose using convolutional recurrent neural networks on the task of automated bird audio detection in real-life environments. In the proposed method, convolutional layers extract high dimensional, local frequency shift invariant features, while recurrent layers capture longer term dependencies between the features extracted from short time …
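The two roles described in the abstract can be illustrated with a toy sketch. This is not the paper's network: it assumes max pooling over frequency as one common way to obtain local frequency-shift invariance, and it stands in for the recurrent layers with a single hand-written leaky state. The function names, pool size, and decay value are all hypothetical and chosen only for illustration.

```python
# Toy illustration only: hand-written stand-ins for a convolutional stage
# (frequency max pooling) and a recurrent stage (one leaky state).
# None of the names, sizes, or weights come from the paper.

def max_pool_frequency(frame, pool_size):
    """Max-pool one spectral frame over non-overlapping frequency bands."""
    return [max(frame[i:i + pool_size]) for i in range(0, len(frame), pool_size)]


def recurrent_summary(frames, decay=0.5):
    """One-unit recurrence over time: state = decay * state + frame energy."""
    state = 0.0
    states = []
    for frame in frames:
        state = decay * state + sum(frame)
        states.append(state)
    return states


# Two frames whose single peak sits in neighbouring frequency bins (2 and 3).
frame_a = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
frame_b = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]

# Both peaks fall in the same pooling band, so the pooled frames are identical.
pooled_a = max_pool_frequency(frame_a, pool_size=2)
pooled_b = max_pool_frequency(frame_b, pool_size=2)

# The recurrent state carries energy from earlier frames into later ones,
# including the final silent frame.
states = recurrent_summary([pooled_a, pooled_b, [0.0, 0.0, 0.0]])
```

In this sketch the one-bin shift between `frame_a` and `frame_b` disappears after pooling, while the recurrent state still remembers the earlier frames when the input goes silent.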

Cited by 80 publications (47 citation statements)
References 13 publications

“…In recent years, the organizers of this challenge have managed to attract researchers from the machine learning and music information retrieval (MIR) communities [61]. This had led to the publication of new applications of existing machine learning methods to the domain of avian bioacoustics: these include multiple instance learning [62], convolutional recurrent neural networks [63], and densely connected convolutional networks [64]. Despite its undeniable merit of having gathered several data collection initiatives into a single cross-collection evaluation campaign, the methodology of the "bird detection in audio" challenge suffers from a lack of interpretability in the discussion of results post hoc.…”
Section: Evidence Of Technical Bias In State-of-the-art Bioacoustic D…
Citation type: mentioning
confidence: 99%
“…4. The reason is, the sounds of birds are usually contained in a small portion of the frequency range (mostly around 2-8 kHz) as stated in [10], so we only extract features from the range of (0, 10) kHz. In order to focus only sounds produced in the vocal organ of birds (i.e.…”
† http://www.xeno-canto.org
Section: Database
Citation type: mentioning
confidence: 99%
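The band restriction mentioned in the statement above (features taken only from the 0 to 10 kHz range) can be sketched as a selection of FFT bins. The sample rate of 44.1 kHz, the FFT size of 1024, and the helper name `bins_below` are assumptions made for illustration; the quoted excerpt does not give the citing paper's actual settings.

```python
# Minimal sketch, assuming a one-sided FFT of a real-valued signal.
# The sample rate and FFT size below are illustrative assumptions.

def bins_below(cutoff_hz, sample_rate_hz, n_fft):
    """Return indices of one-sided FFT bins whose centre frequency is <= cutoff_hz."""
    bin_width = sample_rate_hz / n_fft   # spacing between bin centres in Hz
    n_bins = n_fft // 2 + 1              # number of bins from 0 Hz to Nyquist
    return [k for k in range(n_bins) if k * bin_width <= cutoff_hz]


kept = bins_below(cutoff_hz=10_000, sample_rate_hz=44_100, n_fft=1024)
```

With these assumed settings, `kept` holds the bin indices that a feature extractor would retain; every bin above 10 kHz is discarded before any features are computed.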
“…Although sound is in some case complementary to visual information, such as when we listen to something out of view, vision and hearing are often informative about the same structures in the world [8]. As a consequence, numerous efforts have been devoted to recognize bird species based on auditory data [9], [10] in recent years. Adapting CNN architectures for the purpose of audio event detection has become a common practice and generating deep features based on visual representations of audio recordings has proven to be very effective [11] such as in bird sounds [10], [12].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Contrary to the previous works, we postulate using recurrent convolutional neural networks as our feature extractor. The motivation to use this architecture stems from the fact that recurrent convolutional neural networks were successfully employed in similar acoustic modeling applications [27] and show state-of-the-art performances on many other related classification tasks, such as document [28], image [17], bird songs [18] and music genre classification [29]. Furthermore, we decided to use CQT-grams instead of raw audio signal, as an input to our recurrent convolutional neural network.…”
Section: Related Work
Citation type: mentioning
confidence: 99%