2022
DOI: 10.3390/s22083033
Musical Instrument Identification Using Deep Learning Approach

Abstract: The work aims to propose a novel approach for automatically identifying all instruments present in an audio excerpt using sets of individual convolutional neural networks (CNNs) per tested instrument. The paper starts with a review of tasks related to musical instrument identification. It focuses on tasks performed, input type, algorithms employed, and metrics used. The paper starts with the background presentation, i.e., metadata description and a review of related works. This is followed by showing the datas…

Cited by 22 publications (10 citation statements)
References 35 publications (71 reference statements)
“…Abd-AlGalil et al [10] extract mel-frequency cepstral coefficients (MFCCs) from the signal using the auditory characteristics of the human ear and then detect the starting point based on cepstral distance, with an accuracy of 96%. Schönberger [11] records the moment of note onset based on the result of phase difference, while Blaszke and Kostek [12] first preprocess the music signal in full phase and then detect note onset using the feature of phase-difference mutation; experimental results show that this type of method is more suitable for note detection in slow-rhythm music. Alqahtani et al [13] combined wavelet-domain and time-domain features for note slicing and achieved 96% accuracy in detecting the starting point of piano music, but the number of missed notes was high, resulting in a low recall rate [14].…”
Section: Related Work
confidence: 99%
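The cepstral-distance onset detection idea cited above can be sketched roughly as follows. This is a minimal illustrative sketch, not the cited authors' method: the frame length, hop size, number of cepstral coefficients, and threshold are all assumed values.

```python
import numpy as np

def cepstral_onset_candidates(signal, frame_len=512, hop=256, threshold=2.0):
    """Flag frame indices where the cepstral distance to the previous
    frame jumps above a threshold -- a rough proxy for note onsets.
    (Illustrative only; real systems use MFCCs and adaptive thresholds.)"""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len, hop)]
    onsets = []
    prev_cep = None
    for idx, frame in enumerate(frames):
        # windowed magnitude spectrum, floored to avoid log(0)
        spectrum = np.abs(np.fft.rfft(frame * np.hanning(frame_len)))
        log_spec = np.log(spectrum + 1e-10)
        # keep only the low-quefrency cepstral coefficients
        cepstrum = np.fft.irfft(log_spec)[:20]
        if prev_cep is not None:
            dist = np.linalg.norm(cepstrum - prev_cep)
            if dist > threshold:
                onsets.append(idx)
        prev_cep = cepstrum
    return onsets
```

On a signal that switches from silence to a steady tone, the first flagged frame falls near the transition, which is the behaviour the cepstral-distance criterion relies on.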
“…DNNs have been used successfully in widespread applications, from speech-based emotion recognition [45–47] to music recognition [48] to detecting emotion in music [49]. Recent work with DNNs has shown immense potential for musical instrument classification, as reviewed by Blaszke and Kostek [50], particularly for predominant-instrument recognition in polyphonic audio. To our knowledge, no previous work with DNNs has explored their application to the percussion instruments (maracas, tambourines, castanets) or sound environment (e.g.…”
Section: Discussion
confidence: 99%
“…While the classification accuracy obtained by the methods described herein appeared to provide children with a good user experience, DNNs might be a promising direction of exploration, particularly if more instrument families are added to the system. As reviewed by Blaszke and Kostek [50], the current state of the art for multiple-instrument recognition yields F1 scores around 0.64, while their DNN approach provided a substantial increase to 0.93 [50]. DNN approaches may also offer greater flexibility, allowing for more complex models for instruments that are difficult to classify and simpler, more computationally efficient models for instruments that are easily identified [50].…”
Section: Discussion
confidence: 99%
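The F1 scores quoted above (0.64 vs. 0.93) combine precision and recall over the set of instruments predicted for an excerpt. For a multiple-instrument setting, the per-excerpt computation can be sketched as follows; the instrument names in the usage example are hypothetical.

```python
def instrument_f1(true_set, pred_set):
    """F1 over one excerpt's predicted vs. true instrument label sets.
    (Illustrative sketch of the metric, not the cited evaluation code.)"""
    if not true_set and not pred_set:
        return 1.0  # nothing to find, nothing predicted
    tp = len(true_set & pred_set)
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(true_set) if true_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical excerpt: one instrument missed, one falsely predicted.
score = instrument_f1({"piano", "guitar", "drums"}, {"piano", "drums", "bass"})
```

Here precision and recall are both 2/3, giving F1 ≈ 0.667, which illustrates how a handful of missed or spurious instruments pulls the score toward the 0.64 state-of-the-art figure quoted above.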
“…Judging from the consistent positive outcomes, it only makes sense to assume that in the future, AM-enhanced NNs will be extensively used for MIR. In [97], identification is performed for four instruments: bass, drums, piano, and guitar. The model architecture consists of four identical, independent sub-models, each catering to one instrument.…”
Section: Convolutional Neural Network (CNN)
confidence: 99%
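The per-instrument architecture described above — four identical, independent sub-models, each deciding the presence of one instrument — can be outlined structurally as follows. This is a sketch of the decomposition only: the linear-sigmoid scorer stands in for each CNN sub-model, and the feature dimension, threshold, and weights are all hypothetical.

```python
import numpy as np

INSTRUMENTS = ["bass", "drums", "piano", "guitar"]

class PerInstrumentDetector:
    """One independent binary detector per instrument, mirroring the
    four-sub-model design described above. Each sub-model here is a
    placeholder linear scorer, not an actual CNN."""

    def __init__(self, n_features, threshold=0.5):
        rng = np.random.default_rng(0)
        # one independent weight vector per instrument sub-model
        self.weights = {name: rng.normal(size=n_features)
                        for name in INSTRUMENTS}
        self.threshold = threshold

    def predict(self, features):
        """Return the set of instruments whose sub-model fires."""
        present = set()
        for name, w in self.weights.items():
            # sigmoid presence score from this instrument's sub-model
            score = 1.0 / (1.0 + np.exp(-(features @ w)))
            if score > self.threshold:
                present.add(name)
        return present
```

Because the sub-models are independent, any subset of instruments can be reported for one excerpt — the multi-label behaviour the paper's design targets — and individual sub-models can be made larger or smaller per instrument, matching the flexibility point made in the discussion above.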