This paper demonstrates automatic recognition of the vocalizations of four common bird species (herring gull [Larus argentatus], blue jay [Cyanocitta cristata], Canada goose [Branta canadensis], and American crow [Corvus brachyrhynchos]) using an algorithm that extracts sets of frequency tracks using the track properties of importance and harmonic correlation. The main result is that a complex harmonic vocalization is rendered into a set of related tracks that can readily be compared against statistical models of the actual bird vocalizations. For each vocalization type, a statistical model was created by transforming the frequency tracks of the training set into feature vectors. From each test recording, the extraction algorithm extracts sets of frequency tracks that closely approximate the harmonic sounds in the file being processed. Each extracted set, in its final form, is then compared with the statistical models generated during the training phase using Mahalanobis distance functions. If a set closely matches one of the models, the recognizer declares it an occurrence of the corresponding vocalization. The method was evaluated on a test set containing vocalizations of the 4 target species and of 16 additional species, as well as background noise from planes, cars, and various natural sounds.
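The comparison step can be illustrated with a minimal sketch. It assumes each model is just a mean vector and a covariance matrix estimated from training feature vectors, and that a match is declared when the smallest Mahalanobis distance falls below a threshold. The function names, the threshold, and the nearest-model rule are hypothetical choices for illustration; the abstract does not specify them.

```python
import math

def mean_vector(samples):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

def covariance_matrix(samples, mean):
    """Unbiased sample covariance (divides by n - 1)."""
    n, d = len(samples), len(mean)
    cov = [[0.0] * d for _ in range(d)]
    for x in samples:
        diff = [xi - mi for xi, mi in zip(x, mean)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j]
    return [[c / (n - 1) for c in row] for row in cov]

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def mahalanobis(x, mean, cov):
    """sqrt((x - mean)^T cov^-1 (x - mean)), without forming the inverse."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    y = solve(cov, diff)
    return math.sqrt(sum(d * yi for d, yi in zip(diff, y)))

def classify(x, models, threshold):
    """Return the label of the nearest model, or None if none is close enough.

    `models` maps a label to a (mean, covariance) pair; `threshold` is an
    assumed tuning parameter, not a value taken from the paper.
    """
    best_label, best_dist = None, float("inf")
    for label, (mean, cov) in models.items():
        dist = mahalanobis(x, mean, cov)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= threshold else None
```

With an identity covariance the distance reduces to the Euclidean distance, which is a convenient sanity check. A non-identity covariance rescales each direction by the spread of the training data, so a deviation along a highly variable feature counts for less than the same deviation along a stable one.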
The method of sound recognition relies on transforming a sound into a spectrogram and then extracting its harmonics as curves. The extracted curves are called frequency tracks. A procedure called find-feasible-sets is used to extract sets of tracks that may correspond to harmonic sounds. If the tracks in a set overlap one another sufficiently in time, the set is designated a feasible set. After the feasible sets have been extracted, the procedure find-maximal-subsets is applied to each of them. This procedure uses a function called harmonic-relate that determines whether two tracks are harmonically related. All tracks that are not harmonically related to any other track in the feasible set are discarded. The feasible set is then divided into maximal subsets. A maximal subset is a subset of the feasible set in which every track is harmonically related to one fixed track in the subset, called the reference track, and no track outside the subset is related to the reference track. Each frequency track in a track set is transformed into a feature vector whose components describe the frequency, slope, and shape of the track. The bird species analyzed are blue jay and herring gull.
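The grouping steps above can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the `Track` type, the overlap criterion (overlap with the first track of a group, measured against the shorter track), the small-integer ratio test in `harmonic_relate`, and all default parameter values are hypothetical, since the abstract names the procedures without defining their details.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Track:
    """Hypothetical summary of one frequency track."""
    start: float  # seconds
    end: float    # seconds
    freq: float   # mean frequency in Hz

def overlap_fraction(a, b):
    """Time overlap of two tracks as a fraction of the shorter track."""
    overlap = min(a.end, b.end) - max(a.start, b.start)
    shorter = min(a.end - a.start, b.end - b.start)
    return max(overlap, 0.0) / shorter if shorter > 0 else 0.0

def find_feasible_sets(tracks, min_overlap=0.5):
    """Greedily group tracks that overlap the first track of the current group.

    Assumption: "sufficient" overlap means at least `min_overlap` of the
    shorter track. Groups with a single track are dropped.
    """
    sets = []
    for t in sorted(tracks, key=lambda t: t.start):
        if sets and overlap_fraction(sets[-1][0], t) >= min_overlap:
            sets[-1].append(t)
        else:
            sets.append([t])
    return [s for s in sets if len(s) > 1]

def harmonic_relate(a, b, max_harmonic=5, tol=0.01):
    """Assumed test: the frequency ratio is close to p/q for small integers."""
    lo, hi = sorted((a.freq, b.freq))
    if lo <= 0:
        return False
    ratio = hi / lo
    for q in range(1, max_harmonic):
        for p in range(q + 1, max_harmonic + 1):
            if abs(ratio - p / q) <= tol * (p / q):
                return True
    return False

def find_maximal_subsets(feasible_set):
    """Try every track as the reference and collect the tracks related to it.

    Tracks related to no other track never enter a subset, which matches the
    discard rule; duplicate subsets from different references are merged.
    """
    subsets = []
    for ref in feasible_set:
        related = [t for t in feasible_set
                   if t is not ref and harmonic_relate(ref, t)]
        if not related:
            continue
        subset = frozenset([ref, *related])
        if subset not in subsets:
            subsets.append(subset)
    return subsets
```

For example, overlapping tracks at 500, 1000, and 1500 Hz form one maximal subset under these assumptions, while an overlapping stray track at 1100 Hz is discarded because its ratio to each of the others is not near any small-integer ratio. A looser tolerance or a larger `max_harmonic` makes nearly any pair of tracks count as related, so these two parameters would need tuning against real recordings.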