Juan Benjumea scite author profile

We describe a new challenge aimed at discovering subword and word units from raw speech. This challenge is the followup to the Zero Resource Speech Challenge 2015. It aims at constructing systems that generalize across languages and adapt to new speakers. The design features and evaluation metrics of the challenge are presented and the results of seventeen models are discussed.

Index Terms: zero resource speech technology, subword modeling, acoustic unit discovery, unsupervised term discovery


The Zero Resource Speech Challenge 2019: TTS Without T

Dunbar¹, Algayres², Karadayi³ et al. 2019

107

122


The Zero Resource Speech Challenge 2019: TTS without T

Dunbar¹, Algayres², Karadayi³ et al. 2019

Preprint


We present the Zero Resource Speech Challenge 2019, which proposes to build a speech synthesizer without any text or phonetic labels: hence, TTS without T (text-to-speech without text). We provide raw audio for a target voice in an unknown language (the Voice dataset), but no alignment, text or labels. Participants must discover subword units in an unsupervised way (using the Unit Discovery dataset) and align them to the voice recordings in a way that works best for the purpose of synthesizing novel utterances from novel speakers, similar to the target speaker's voice. We describe the metrics used for evaluation, a baseline system consisting of unsupervised subword unit discovery plus a standard TTS system, and a topline TTS using gold phoneme transcriptions. We present an overview of the 19 submitted systems from 10 teams and discuss the main results.


The Zero Resource Speech Challenge 2017

Dunbar¹, Cao², Benjumea³ et al. 2017

Preprint


A K-Nearest Neighbours Approach To Unsupervised Spoken Term Discovery

Alexis, Dancette, Karadayi et al. 2018


Unsupervised spoken term discovery is the task of finding recurrent acoustic patterns in speech without any annotations. Current approaches consist of two steps: (1) discovering similar patterns in speech, and (2) partitioning those pairs of acoustic tokens using graph clustering methods. We propose a new approach for the first step. Previous systems used various approximation algorithms to make the search tractable on large amounts of data. Our approach is based on an optimized k-nearest neighbours (KNN) search coupled with a fixed word embedding algorithm. The results show that the KNN algorithm is robust across languages, consistently outperforms the DTW-based baseline, and is competitive with current state-of-the-art spoken term discovery systems.
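As a rough illustration of the first step described in this abstract, the sketch below embeds variable-length speech segments into fixed-size vectors and then runs a brute-force k-nearest-neighbour search over them. It is not the authors' system. The embedding (sampling frames at evenly spaced positions), the cosine similarity, and the function names `embed`, `cosine` and `knn_pairs` are all assumptions chosen for brevity, and the paper's optimized search is replaced here by an exhaustive one.

```python
import math


def embed(frames, n_points=3):
    """Map a variable-length list of feature frames to a fixed-size vector.

    Assumed embedding for illustration only: pick n_points frames at evenly
    spaced positions and concatenate them.
    """
    last = len(frames) - 1
    vec = []
    for i in range(n_points):
        idx = round(i * last / (n_points - 1)) if n_points > 1 else 0
        vec.extend(frames[idx])
    return vec


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_pairs(segments, k=1):
    """For each segment index, return the indices of its k most similar other segments.

    Brute-force search; ties are broken by the lower segment index.
    """
    vectors = [embed(s) for s in segments]
    neighbours = {}
    for i, u in enumerate(vectors):
        scored = [(cosine(u, v), j) for j, v in enumerate(vectors) if j != i]
        scored.sort(key=lambda t: (-t[0], t[1]))
        neighbours[i] = [j for _, j in scored[:k]]
    return neighbours
```

The candidate pairs returned by `knn_pairs` would then feed the second step (graph clustering), which is not sketched here.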


bootphon/phonemizer: phonemizer-2.2

Bernard¹, hadware², Riad³ et al. 2020


Response Optimization of a Chemical Gas Sensor Array Using Temperature Modulation

Acevedo¹, Benjumea² 2018

Preprint


This paper presents the design and implementation of a simple conditioning circuit to optimize the performance of an electronic nose. A temperature modulation method was applied to the heating resistor in order to study the sensors' response and determine whether they can discriminate between different volatile organic compounds (VOCs). The study set out to determine the efficiency of the gas sensors to be used in an electronic nose, to improve the sensitivity, selectivity and repeatability of the measuring system, and to select the type of modulation (e.g. pulse width modulation) for detecting the analytes (i.e. Moscatel wine samples (2% alcohol) and ethyl alcohol (70%)). The results demonstrated that applying temperature modulation to the sensor heaters makes it possible to discriminate VOCs well, quickly and easily, using a chemical sensor array. A discrimination model based on Principal Component Analysis (PCA) was applied to each sensor, and the responses obtained gave a variance of 94.5% and 100% accuracy.
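To make the PCA step in this abstract more concrete, the sketch below computes how much of the total variance each principal component captures for a set of two-dimensional readings. It assumes that the "variance" figure quoted in the abstract refers to explained variance, which the abstract does not state outright. The function name `explained_variance_ratios` and the toy readings are hypothetical and are not data from the paper.

```python
import math


def explained_variance_ratios(points):
    """Return the fraction of total variance captured by each of the two
    principal components of a list of (x, y) readings.

    Uses the closed-form eigenvalues of the 2x2 sample covariance matrix.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two readings")
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Sample covariance matrix entries (divide by n - 1).
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    trace = sxx + syy
    if trace == 0:
        raise ValueError("readings have zero variance")
    det = sxx * syy - sxy ** 2
    # Eigenvalues of a symmetric 2x2 matrix: trace/2 +/- sqrt(trace^2/4 - det).
    disc = math.sqrt(max(trace * trace / 4 - det, 0.0))
    first = trace / 2 + disc
    second = trace / 2 - disc
    return first / trace, second / trace


# Hypothetical toy readings that lie exactly on a line, so the first
# component carries all of the variance.
toy_readings = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]
ratios = explained_variance_ratios(toy_readings)
```

For real sensor-array data with more than two features, a general eigendecomposition or a library PCA routine would be needed; the closed form above only covers the two-feature case.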


