A cloud robotics approach towards dialogue-oriented robot speech

Sugiura, Komei; Shiga, Yoshinori; Kawai, Hisashi; Misu, Teruhisa; Hori, Chiori

doi:10.1080/01691864.2015.1009164

Cited by 16 publications

(3 citation statements)

References 15 publications

“…As a result of our screening, we obtained 720 short-dialogue lines. No UUDB [22] Task-oriented Yes 2 14 Yes OGVC [23] Voice chat while playing game Yes 14 17 Yes NICT-VADC [24] Task-oriented Yes 7…”

Section: Crowdsourcing Setting and Results (mentioning)

confidence: 99%

STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent

Yuki, Nishimura, Takamichi et al. 2022

Preprint

We present STUDIES, a new speech corpus for developing a voice agent that can speak in a friendly manner. Humans naturally control their speech prosody to empathize with each other. By incorporating this "empathetic dialogue" behavior into a spoken dialogue system, we can develop a voice agent that responds to a user more naturally. We designed the STUDIES corpus to include a speaker who explicitly speaks with empathy for the interlocutor's emotion. We describe our methodology for constructing an empathetic dialogue speech corpus and report the analysis results of the STUDIES corpus. We conducted a text-to-speech experiment as an initial investigation of how to develop a more natural voice agent that can tune its speaking style according to the interlocutor's emotion. The results show that using the interlocutor's emotion label and a conversational context embedding can produce speech with the same degree of naturalness as that synthesized using the agent's emotion label. Our project page for the STUDIES corpus is http://sython.org/Corpus/STUDIES.

“…As a consequence, remote control of robot systems can be performed in a cloud platform through data connection with a robot manipulator. Important applications of cloud robotics can be found in space exploration, remote surgery, intelligent housing systems, unmanned vehicles, and so on [4][5][6][7][8][9].…”

Section: Introduction (mentioning)

confidence: 99%

Predictor-Based Motion Tracking Control for Cloud Robotic Systems with Delayed Measurements

Shen, Song 2019

Electronics

This paper addresses the problem of motion prediction and tracking control for cloud robotic systems with time-varying delays in measurements. A novel method using an observer-based structure for position and velocity prediction is developed to estimate the real-time information of the robot manipulator. The prediction error can converge to zero even if model uncertainties exist in the robot manipulator. Based on the predicted positions and velocities, sufficient conditions are derived for designing suitable tracking controllers such that semi-globally uniformly ultimately bounded tracking performance of the predictor–controller couple can be guaranteed. Finally, the effectiveness of the proposed method and its robustness to model uncertainties are verified on a two-degree-of-freedom (DOF) robot system.

“…Most available ASRs are trained with transcribed data that need to be prepared separately from the learning process (Sugiura et al, 2015; Kawaharay et al, 2000; Dahl et al, 2012). By using certain supervised learning methods and certain model architectures, an ASR can be developed with a very large amount of transcribed speech data corpus, i.e., a set of pairs of text data and acoustic data.…”

Section: Introduction (mentioning)

confidence: 99%

Unsupervised Phoneme and Word Discovery From Multiple Speakers Using Double Articulation Analyzer and Neural Network With Parametric Bias

2019

This paper describes a new unsupervised machine learning method for simultaneous phoneme and word discovery from multiple speakers. Human infants can acquire knowledge of phonemes and words from interactions with their mothers as well as with others around them. From a computational perspective, phoneme and word discovery from multiple speakers is a more challenging problem than that from one speaker because the speech signals from different speakers exhibit different acoustic features. This paper proposes an unsupervised phoneme and word discovery method that simultaneously uses a nonparametric Bayesian double articulation analyzer (NPB-DAA) and a deep sparse autoencoder with parametric bias in the hidden layer (DSAE-PBHL). We assume that an infant can recognize and distinguish speakers based on certain other features, e.g., visual face recognition. DSAE-PBHL aims to subtract speaker-dependent acoustic features and extract speaker-independent features. An experiment demonstrated that DSAE-PBHL can subtract distributed representations of acoustic signals, enabling extraction based on the types of phonemes rather than on the speakers. Another experiment demonstrated that a combination of NPB-DAA and DSAE-PBHL outperformed the available methods in phoneme and word discovery tasks involving speech signals with Japanese vowel sequences from multiple speakers.
