Bioacoustic research spans a wide range of biological questions and applications, relying on the identification of target species or smaller acoustic units, such as distinct call types. However, manually identifying the signal of interest is time-intensive, error-prone, and becomes infeasible with large data volumes. Machine-driven algorithms are therefore increasingly applied to bioacoustic signal identification challenges. Nevertheless, biologists still face major difficulties when transferring existing animal- and/or scenario-specific machine learning approaches to their own datasets and scientific questions. This study presents ANIMAL-SPOT, an animal-independent, open-source deep learning framework, along with a detailed user guide. Three signal identification tasks commonly encountered in bioacoustic research were investigated: (1) target signal vs. background noise detection, (2) species classification, and (3) call type categorization. ANIMAL-SPOT successfully segmented human-annotated target signals in data volumes covering 10 distinct animal species and 1 additional genus, achieving a mean test accuracy of 97.9% and an average area under the ROC curve (AUC) of 95.9% when predicting on unseen recordings. Moreover, an average segmentation accuracy and F1-score of 95.4% were achieved on the publicly available BirdVox-Full-Night data corpus. In addition, multi-class species and call type classification reached 96.6% and 92.7% accuracy on unseen test data, and 95.2% and 88.4% on excerpts produced by previous animal-specific machine-based detection. Furthermore, an Unweighted Average Recall (UAR) of 89.3% outperformed the multi-species classification baseline of the ComParE 2021 Primate Sub-Challenge. Besides animal independence, ANIMAL-SPOT requires neither expert knowledge nor special computing resources, thereby making deep-learning-based bioacoustic signal identification accessible to a broad audience.
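For readers unfamiliar with the metric, the Unweighted Average Recall used in the ComParE challenge is simply the mean of the per-class recalls, so every class counts equally regardless of how many examples it has. A minimal sketch in base R (the function and toy data are illustrative, not part of ANIMAL-SPOT):

```r
# Unweighted Average Recall (UAR): the mean of per-class recalls.
# `truth` and `pred` are factors sharing the same class levels.
uar <- function(truth, pred) {
  cm <- table(truth, pred)           # confusion matrix: rows = true class
  recalls <- diag(cm) / rowSums(cm)  # per-class recall = correct / class size
  mean(recalls)                      # equal weight per class, however rare
}

# Toy example with three imbalanced classes:
truth <- factor(c("ape", "ape", "ape", "bird", "bird", "bat"))
pred  <- factor(c("ape", "ape", "bird", "bird", "bird", "bat"),
                levels = levels(truth))
uar(truth, pred)  # (2/3 + 1/1 + 2/2) / 3 ~ 0.889
```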
Geographic differences in vocal dialects have provided strong evidence for animal culture, with patterns likely arising from generations of social learning and transmission. Research on the evolution of vocal dialects has predominantly focused on fixed-repertoire, territorial song. Studying vocal dialects in open-ended, flexible learners and in contexts where vocalisations serve other functions is therefore necessary for a more comprehensive understanding of vocal dialect evolution. Parrots are open-ended vocal learners whose vocalisations are used for social contact and coordination. Here, we recorded monk parakeets (Myiopsitta monachus) across multiple spatial and social scales in their European invasive range. We then compared calls using a multi-level Bayesian model and a sensitivity analysis, a novel approach that allowed us to explicitly compare dialects at multiple spatial scales, even in unmarked populations. We found support for founder effects and/or cultural drift at the city level, consistent with passive cultural processes producing large-scale dialect differences. We did not find a strong signal of dialect differences between groups within cities, suggesting that groups did not actively converge on a group-level signal, as would be expected under the group membership hypothesis. We demonstrate how our sensitivity analysis highlights the robustness of the results and offer an explanation that unifies the findings of prior monk parakeet dialect studies.
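The multi-level structure can be made concrete with a small sketch. The model below is not the authors' exact specification; it is a minimal illustration, assuming a continuous per-call acoustic feature and varying intercepts for group nested within city (all variable names are hypothetical), written with the brms interface to Stan:

```r
library(brms)  # Bayesian multilevel models fitted via Stan

# Illustrative sketch (not the paper's exact model): a call-level acoustic
# feature with varying intercepts for group nested within city. Dialect
# differentiation at each spatial scale shows up in the relative sizes of
# the city- and group-level standard deviations.
fit <- brm(
  call_feature ~ 1 + (1 | city / group),  # group nested within city
  data   = calls,        # hypothetical data frame, one row per call
  family = gaussian(),
  chains = 4, cores = 4
)
summary(fit)  # compare sd(city) vs. sd(city:group) variance components
```

Under this structure, a large city-level standard deviation with a negligible group-level one would match the pattern the abstract reports: dialects differ between cities but not between groups within a city.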
1. To better understand how vocalisations are used during interactions of multiple individuals, studies are increasingly deploying on-board devices with a microphone on each animal. The resulting recordings are challenging to analyse, since microphone clocks drift non-linearly and each microphone records the vocalisations of non-focal individuals as well as noise. 2. Here we present `callsync`, an R package designed to align recordings, detect vocalisations and assign them to the caller, trace the fundamental frequency, filter out noise, and perform basic analysis on the resulting clips. 3. We present a case study in which the pipeline is used on a new dataset of six captive cockatiels (*Nymphicus hollandicus*) wearing backpack microphones. Recordings initially drifted by ~2 minutes but were aligned to within ~2 seconds by our package. We detected and assigned 970 calls across two 3.5-hour recording sessions. We then used a function that traces the fundamental frequency and applied spectrographic cross-correlation to show that calls from the same individual sound more similar. 4. The `callsync` package can be used to go from raw recordings to a clean dataset of features. The package is designed to be modular, allowing users to replace functions as they wish. We also discuss the challenges that might arise at each step and how the available literature can provide alternatives.
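A core step in this pipeline is caller assignment: because every individual wears its own microphone, a detected call is attributed to whichever animal's recording contains it loudest. A minimal, self-contained sketch of that idea in base R (this is an illustrative simplification, not the `callsync` API; consult the package documentation for the actual functions):

```r
# Assign a detected call to the caller: the backpack microphone on the
# vocalising bird records the call with the highest energy, so compare
# root-mean-square (RMS) amplitude of the same time window across channels.
# `channels` is a named list of numeric waveforms, one per individual,
# already aligned so index i refers to the same moment on every mic.
assign_caller <- function(channels, start, end) {
  rms <- sapply(channels, function(wave) {
    segment <- wave[start:end]
    sqrt(mean(segment^2))
  })
  names(which.max(rms))  # individual whose mic captured the call loudest
}

# Toy example: three "microphones"; the call is loudest on bird_b's channel.
set.seed(1)
noise <- function(n) rnorm(n, sd = 0.01)
channels <- list(
  bird_a = noise(1000),
  bird_b = noise(1000) +
    c(rep(0, 400), 0.5 * sin(2 * pi * 0.05 * (1:200)), rep(0, 400)),
  bird_c = noise(1000)
)
assign_caller(channels, start = 400, end = 600)  # "bird_b"
```

In practice the comparison is only meaningful after the non-linear clock drift has been corrected, which is why alignment precedes detection and assignment in the pipeline described above.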