In this work we propose approaches to effectively transfer knowledge from weakly labeled web audio data. We first describe a convolutional neural network (CNN) based framework for sound event detection and classification using weakly labeled audio data. Our model trains efficiently on audio recordings of variable length and is therefore well suited for transfer learning. We then propose methods to learn representations with this model that can be used effectively to solve the target task. We study both transductive and inductive transfer learning tasks, showing the effectiveness of our methods for both domain and task adaptation. The representations learned with the proposed CNN model generalize well enough to reach human-level accuracy on the ESC-50 sound event dataset and set state-of-the-art results on that dataset. We further apply them to acoustic scene classification and again show that our proposed approaches are well suited to this task. We also show that our methods help capture semantic meanings and relations. Moreover, in the process we set state-of-the-art results on the AudioSet dataset using the balanced training set.
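The inductive transfer setup described above, training only a lightweight classifier on top of frozen representations from a pretrained model, can be sketched as follows. This is an illustrative toy, not the paper's model: the "pretrained" CNN is replaced by a fixed random-projection embedding, and `embed` and `train_linear` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a network trained on weakly labeled web audio: here just a
# fixed random projection followed by a ReLU (illustrative, not the real CNN).
W_pretrained = rng.normal(size=(64, 16))

def embed(x):
    """Frozen representation from the 'pretrained' model."""
    return np.maximum(x @ W_pretrained, 0.0)

def train_linear(X, y, steps=300, lr=0.1):
    """Train a logistic-regression head on frozen embeddings (target task)."""
    Z = embed(X)
    w = np.zeros(Z.shape[1])
    for _ in range(steps):
        # clip logits for numerical safety before the sigmoid
        p = 1.0 / (1.0 + np.exp(-np.clip(Z @ w, -30.0, 30.0)))
        w -= lr * Z.T @ (p - y) / len(y)   # gradient of the log-loss
    return w
```

Only the head `w` is updated; the representation itself stays fixed, which mirrors using the learned audio representations directly for a new target task.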
This paper focuses on the automatic extraction of beat structure from a musical piece. A novel statistical approach to modeling beat sequences based on Hidden Markov Models (HMMs) is introduced. The resulting beat labels are obtained by running the Viterbi decoder and subsequent lattice rescoring. For the observation vectors, we propose a new feature set based on the impulsive and harmonic components of the reassigned spectrogram. Different components of the observation vectors have been investigated for their efficiency. The main advantage of the proposed approach is the absence of imposed deterministic rules: all the parameters are learned from the training data, and the experimental results show the efficiency of the proposed scheme.
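The decoding step in such an HMM-based beat tracker is standard Viterbi search over hidden beat states. A minimal sketch of Viterbi decoding in log space follows; the feature extraction and lattice rescoring from the abstract are omitted, and all names are illustrative.

```python
import numpy as np

def viterbi(log_init, log_trans, log_obs):
    """Most likely hidden-state path through an HMM.

    log_init:  (S,)   log initial state probabilities
    log_trans: (S, S) log transition probs, log_trans[i, j] = log P(j | i)
    log_obs:   (T, S) log observation likelihoods per frame and state
    """
    T, S = log_obs.shape
    delta = log_init + log_obs[0]           # best log-prob ending in each state
    back = np.zeros((T, S), dtype=int)      # backpointers
    for t in range(1, T):
        scores = delta[:, None] + log_trans # (S, S): previous -> current state
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_obs[t]
    # backtrace from the best final state
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```

In a beat tracker, the states would encode beat/no-beat (or tempo-phase) hypotheses and the observation likelihoods would come from the reassigned-spectrogram features, with transition probabilities learned from training data rather than hand-imposed rules.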
This paper addresses feature extraction for automatic chord recognition systems. Most chord recognition systems use chroma features as a front end together with some kind of classifier (HMM, SVM, or template matching). The vast majority of feature extraction approaches map frequency bins from the spectrum or constant-Q spectrum to chroma bins. In this work, a set of new chroma features based on the time-frequency reassignment (TFR) technique is investigated. The proposed feature set was evaluated on the commonly used Beatles dataset and proved efficient for the chord recognition task, outperforming standard chroma features.
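The baseline mapping from spectral bins to chroma bins mentioned above can be sketched in a few lines: each frequency is converted to a pitch class and its energy is folded into one of 12 chroma bins. This is the standard chroma computation, not the TFR-based variant the paper proposes, and the function name is illustrative.

```python
import numpy as np

def chroma_from_spectrum(magnitudes, freqs, a4=440.0):
    """Fold spectral energy into 12 pitch classes (chroma bins).

    magnitudes: (N,) magnitude spectrum
    freqs:      (N,) center frequency of each bin in Hz
    """
    chroma = np.zeros(12)
    valid = freqs > 0
    # MIDI note number of each bin's frequency, then modulo 12 for pitch class
    midi = 69 + 12 * np.log2(freqs[valid] / a4)
    pitch_class = np.round(midi).astype(int) % 12
    np.add.at(chroma, pitch_class, magnitudes[valid])
    return chroma
```

The TFR variant differs in where the energy is placed: reassignment sharpens each bin's time-frequency location before the fold, so energy lands in the correct pitch class more often than with the blurred short-time spectrum.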