Support vector machines (SVMs) have gained great attention and have been used extensively and successfully in the field of sounds (events) recognition. However, the extension of SVMs to real-world signal processing applications is still an ongoing research topic. Our work consists of illustrating the potential of SVMs on recognizing impulsive audio signals belonging to a complex realworld dataset. We propose to apply optimized one-class support vector machines (1-SVMs) to tackle both sound detection and classification tasks in the sound recognition process. First, we propose an efficient and accurate approach for detecting events in a continuous audio stream. The proposed unsupervised sound detection method which does not require any pretrained models is based on the use of the exponential family model and 1-SVMs to approximate the generalized likelihood ratio. Then, we apply novel discriminative algorithms based on 1-SVMs with new dissimilarity measure in order to address a supervised sound-classification task. We compare the novel sound detection and classification methods with other popular approaches. The remarkable sound recognition results achieved in our experiments illustrate the potential of these methods and indicate that 1-SVMs are well suited for event-recognition tasks.
The problem of acoustic-to-articulatory speech inversion continues to be a challenging research problem which significantly impacts automatic speech recognition robustness and accuracy. This paper presents a multi-task kernel based method aimed at learning Vocal Tract (VT) variables from the Mel-Frequency Cepstral Coefficients (MFCCs). Unlike usual speech inversion techniques based on individual estimation of each tract variable, the key idea here is to consider all the target variables simultaneously to take advantage of the relationships among them and then improve learning performance. The proposed method is evaluated using synthetic speech dataset and corresponding tract variables created by the TAsk Dynamics Application (TADA) model and compared to the hierarchical ε-SVR speech inversion technique.Index Terms-Multi-task learning, matrix-valued kernel, vocal tract variables, acoustic-to-articulatory inversion.
We consider the quantum version of the bandit problem known as best arm identification (BAI). We first propose a quantum modeling of the BAI problem, which assumes that both the learning agent and the environment are quantum; we then propose an algorithm based on quantum amplitude amplification to solve BAI. We formally analyze the behavior of the algorithm on all instances of the problem and we show, in particular, that it is able to get the optimal solution quadratically faster than what is known to hold in the classical case.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.