Bird and whale species identification using sound images

Nanni, Loris; Aguiar, Rafael de Lima; Costa, Yandre M. G.; Brahnam, Sheryl; Silla, Carlos N.; Brattin, Ricky L.; Zhao, Zhao

doi:10.1049/iet-cvi.2017.0075

Cited by 13 publications

(8 citation statements)

References 54 publications

“…The performance of Convolutional Neural Network is optimized for image classification, so this method is not suitable for the feature set we used in this paper. We also used Recurrent Neural Network in the pilot test, but our DNN method showed better results [10,11].…”

Section: Introduction

mentioning

confidence: 99%

Classification of Heart Sound Signal Using Multiple Features

2018

Cardiac disorders are critical and must be diagnosed at an early stage, with high precision, using routine auscultation examination. Cardiac auscultation is a technique for listening to and analyzing heart sound using an electronic stethoscope, a device that provides a digital recording of the heart sound called a phonocardiogram (PCG). The PCG signal carries useful information about the functionality and status of the heart, and hence several signal processing and machine learning techniques can be applied to study and diagnose heart disorders. Based on the PCG signal, the heart sound signal can be classified into two main categories, i.e., normal and abnormal. We have created a database of 5 categories of heart sound signal (PCG signals) from various sources, containing one normal and 4 abnormal categories. This study proposes an improved, automatic classification algorithm for cardiac disorders based on the heart sound signal. We extract features from the phonocardiogram signal and then process those features using machine learning techniques for classification. For feature extraction, we have used Mel Frequency Cepstral Coefficient (MFCCs) and Discrete Wavelets Transform (DWT) features from the heart sound signal, and for learning and classification we have used support vector machine (SVM), deep neural network (DNN) and centroid displacement based k nearest neighbor. To improve the results and classification accuracy, we have combined MFCCs and DWT features for training and classification using SVM and DWT. Our experiments show that results can be greatly improved when Mel Frequency Cepstral Coefficient and Discrete Wavelets Transform features are fused together and used for classification via support vector machine, deep neural network and k-nearest neighbor (KNN). The methodology discussed in this paper can be used to diagnose heart disorders in patients with up to 97% accuracy.
The code and dataset can be accessed at “https://github.com/yaseen21khan/Classification-of-Heart-Sound-Signal-Using-Multiple-Features-/blob/master/README.md”.
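The feature fusion described in the abstract above can be illustrated with a minimal sketch. It assumes the MFCC and DWT features have already been extracted into fixed-length vectors, treats fusion as simple concatenation, and swaps in a plain k-nearest-neighbor vote rather than the centroid displacement based variant the abstract names. The function names `fuse` and `knn_predict` and the toy vectors are hypothetical and are not taken from the linked repository.

```python
import math
from collections import Counter


def fuse(mfcc_features, dwt_features):
    """Feature-level fusion by concatenation (assumed; the paper may differ)."""
    return list(mfcc_features) + list(dwt_features)


def knn_predict(train_vectors, train_labels, query, k=3):
    """Plain k-NN majority vote with Euclidean distance.

    This is a stand-in for illustration, not the centroid displacement
    based k-NN used in the cited work.
    """
    distances = sorted(
        (math.dist(vec, query), label)
        for vec, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up feature vectors: two MFCC values and one DWT value each.
train_vectors = [
    fuse([0.0, 0.1], [0.0]),
    fuse([0.1, 0.0], [0.1]),
    fuse([1.0, 0.9], [1.0]),
    fuse([0.9, 1.0], [0.9]),
]
train_labels = ["normal", "normal", "abnormal", "abnormal"]

query = fuse([0.95, 0.95], [1.0])
predicted = knn_predict(train_vectors, train_labels, query, k=3)
```

In practice the two feature sets usually live on different scales, so some normalisation before concatenation is likely needed for a distance-based classifier to weigh them sensibly.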

“…Manual segmentation of audio recordings is time consuming and tedious task. However, some approaches [5][6][7][8] have adopted manual segmentation while others have adopted automated segmentation of birdsong [9][10][11][12]. In general, detect any acoustic activity in audio and segment it as a syllable.…”

Section: Related Work

mentioning

confidence: 99%

“…EE represents the Entropy of the energy that is used to measure the changes in the level of energy of birdsong. On the other hand, Equation (6) represents the FDFs that provide the spectral features of the birdsong, e.g., SE represents the spectral entropy, and F represents the spectral flux. The spectral shape is represented by spectral spread (S) and spectral centroid (C).…”

Section: Perceptual Descriptive and Harmonic Features (PDHFs)

mentioning

confidence: 99%

Automatic Classification of Monosyllabic and Multisyllabic Birds Using PDHF

et al. 2021

Bioacoustics plays an important role in the conservation of bird species. Bio-acoustic surveys based on autonomous audio recording are both cost-effective and time-efficient. However, there are many bird species with different patterns of vocalization, and it is a challenging task to deal with them. Previous studies have revealed that many authors focus on the segmentation of bird audio without considering specific patterns of bird vocalization. Based on the existing literature, currently there is no work on the segmentation of monosyllabic and multisyllabic birds, separately. Therefore, this research addresses the aforementioned concern and also proposes a collection of audio features named ‘Perceptual, Descriptive, and Harmonic Features (PDHFs)’ that gives promising results in the classification of bird vocalization. Moreover, the classification results improved when monosyllabic and multisyllabic birds were classified separately. To analyze the performance of PDHFs, different classifiers were used in which Artificial neural network (ANN) outperformed other classifiers and demonstrated an accuracy of 98%.
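The spectral descriptors named in the citation statement for this paper (spectral centroid C, spectral spread S, spectral entropy SE and spectral flux F) can be sketched as follows. These are common textbook definitions computed on a single magnitude spectrum; the cited paper's Equation (6) is not reproduced on this page, so its exact formulas may differ. The function names and the choice of normalisation are assumptions made for illustration.

```python
import math


def spectral_centroid(mag, freqs):
    """Magnitude-weighted mean frequency of one spectrum frame."""
    total = sum(mag)
    if total == 0:
        return 0.0
    return sum(f * m for f, m in zip(freqs, mag)) / total


def spectral_spread(mag, freqs):
    """Magnitude-weighted standard deviation around the centroid."""
    total = sum(mag)
    if total == 0:
        return 0.0
    c = spectral_centroid(mag, freqs)
    variance = sum(((f - c) ** 2) * m for f, m in zip(freqs, mag)) / total
    return math.sqrt(variance)


def spectral_entropy(mag):
    """Shannon entropy (in bits) of the normalised power spectrum."""
    power = [m * m for m in mag]
    total = sum(power)
    if total == 0:
        return 0.0
    probs = [p / total for p in power if p > 0]
    return -sum(p * math.log2(p) for p in probs)


def spectral_flux(prev_mag, mag):
    """Sum of squared differences between two sum-normalised spectra."""
    def normalise(values):
        s = sum(values)
        return [v / s for v in values] if s else list(values)

    a, b = normalise(prev_mag), normalise(mag)
    return sum((y - x) ** 2 for x, y in zip(a, b))
```

A flat spectrum gives the maximum entropy for its length and a single-peak spectrum gives zero, which is why spectral entropy is often used to separate tonal syllables from broadband noise.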

“…all their samples recorded in only two locations (see Table 1), increasing the number of developed to handle tasks such as infant cry motivation [7], music genre classification [8] and music mood classification [9]. The visual domain has also been used with animal vocalizations, in tasks as species identification and detection [4,10].…”

Section: Introduction

mentioning

confidence: 99%

“…In case of spectrograms in particular, texture is a very prominent visual property. In this vein, the textural content of spectrograms has been used in several audio classification tasks, such as music genre classification [11], voice classification [12], birds species classification and whales recognition [4]. In [13], the authors propose the Local Binary Pattern (LBP). The texture of an image is described with a histogram.…”

mentioning

confidence: 99%
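The Local Binary Pattern histogram mentioned in the citation statement above can be sketched as follows. This is the basic 8-neighbor, radius-1 variant applied to a grayscale image given as a list of rows; the citing paper's actual LBP parameters are not stated on this page, so the neighborhood, bit ordering and normalisation here are assumptions.

```python
def lbp_histogram(image):
    """Return a normalised 256-bin histogram of basic 3x3 LBP codes.

    `image` is a list of equal-length rows of grayscale values, for
    example a spectrogram converted to integer intensities. Border
    pixels are skipped because they lack a full 3x3 neighborhood.
    """
    rows, cols = len(image), len(image[0])
    # Clockwise neighbor offsets, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbor at least as bright as the center sets its bit.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

The resulting 256-value vector is what a classifier would consume as the texture descriptor of a spectrogram image.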

On the Importance of Passive Acoustic Monitoring Filters

Aguiar, Maguolo, Nanni et al. 2021

Preprint

Self Cite

Passive acoustic monitoring (PAM) is a non-invasive technique to supervise wildlife. Acoustic surveillance is preferable in some situations, such as in the case of marine mammals, when the animals spend most of their time underwater, making it hard to obtain their images. Machine learning is very useful for PAM, for example, to identify species based on audio recordings. But some care should be taken to evaluate the capability of a system. We define PAM-filters as the creation of the experimental protocols according to the dates and locations of the recordings, aiming to avoid the use of the same individuals, noise and recording devices in both training and test sets. A random division of a database presents accuracies much higher than accuracies obtained with protocols generated with PAM-filters. Although we use the animal vocalizations, in our method we convert the audio into spectrogram images and then describe the images using their texture. Those are well-known techniques for audio classification, and they have already been used for species classification. Also, we perform statistical tests to demonstrate the significant difference between accuracies generated with and without PAM-filters with several well-known classifiers. The configuration of our experimental protocols and the database were made available online.
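The idea of a PAM-filter, as the abstract above describes it, can be sketched as a split that assigns whole recording sessions to either the training or the test set. The sketch below assumes a session is identified by a (location, date) pair; the record fields `path`, `species`, `location` and `date`, the function names, and the sample values are all hypothetical and do not come from the authors' published protocols.

```python
def pam_filter_split(recordings, test_sessions):
    """Split so that no (location, date) session lands in both sets."""
    train, test = [], []
    for rec in recordings:
        session = (rec["location"], rec["date"])
        if session in test_sessions:
            test.append(rec)
        else:
            train.append(rec)
    return train, test


def sessions_of(records):
    """Set of (location, date) sessions present in a list of records."""
    return {(r["location"], r["date"]) for r in records}


# Made-up metadata for illustration only.
recordings = [
    {"path": "a.wav", "species": "sp1", "location": "site_A", "date": "2019-05-01"},
    {"path": "b.wav", "species": "sp1", "location": "site_A", "date": "2019-05-01"},
    {"path": "c.wav", "species": "sp1", "location": "site_B", "date": "2019-06-10"},
    {"path": "d.wav", "species": "sp2", "location": "site_B", "date": "2019-06-10"},
    {"path": "e.wav", "species": "sp2", "location": "site_C", "date": "2019-07-03"},
]

train, test = pam_filter_split(recordings, {("site_B", "2019-06-10")})
```

A random split would instead scatter clips from the same session across both sets, which is the leakage the abstract warns about. A stricter variant could hold out entire locations regardless of date, which may better match the stated goal of keeping recording devices separate.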
