Ryota Nishimura scite author profile

Summary: If a dialog system can respond to the user as reasonably as a human, the interaction will become smoother. The timing of responses such as back-channels and turn-taking plays an important role in smooth dialog, as it does in human-human interaction. We developed a response timing generator for such a dialog system. This generator uses a decision tree to detect the timing based on features derived from prosodic and linguistic information. The timing generator decides the action of the system every 100 ms during the user's pause. In this paper, we describe a robust spoken dialog system using the timing generator. Subjective evaluation showed that almost all of the subjects experienced a friendly feeling from the system.
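The polling scheme described in this abstract (a decision tree consulted every 100 ms during the user's pause) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the feature names (`pause_ms`, `final_pitch_slope`, `ends_with_final_expression`), the action labels, and all thresholds are hypothetical, and the hand-written `decide` function only stands in for a tree that the authors would have learned from data.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

STEP_MS = 100  # decision interval stated in the abstract


@dataclass
class PauseFeatures:
    """Hypothetical features; the abstract only says 'prosodic and linguistic'."""
    pause_ms: int                      # time since the user stopped speaking
    final_pitch_slope: float           # pitch slope at utterance end (negative = falling)
    ends_with_final_expression: bool   # last word looks like a sentence ending


def decide(f: PauseFeatures) -> str:
    """Hand-written stand-in for a learned decision tree.

    Returns 'wait', 'backchannel', or 'take_turn'.
    All thresholds are invented for illustration.
    """
    if f.ends_with_final_expression:
        # Linguistic cue says the utterance is complete: respond after a short gap.
        return "take_turn" if f.pause_ms >= 300 else "wait"
    if f.final_pitch_slope < 0 and f.pause_ms >= 200:
        # Falling pitch without a clear ending: acknowledge, but do not take the floor.
        return "backchannel"
    if f.pause_ms >= 800:
        # A long silence is treated as yielding the turn.
        return "take_turn"
    return "wait"


def run_pause(pause_length_ms: int, pitch_slope: float,
              ends_with_final_expression: bool) -> Optional[Tuple[int, str]]:
    """Poll the tree every STEP_MS during one pause.

    Returns (time_ms, action) for the first non-'wait' decision,
    or None if the user resumes speaking before the system acts.
    """
    for t in range(STEP_MS, pause_length_ms + 1, STEP_MS):
        action = decide(PauseFeatures(t, pitch_slope, ends_with_final_expression))
        if action != "wait":
            return t, action
    return None


if __name__ == "__main__":
    print(run_pause(1000, -0.5, False))
    print(run_pause(1000, 0.2, True))
    print(run_pause(500, 0.2, False))
```

Re-evaluating the tree at a fixed interval lets the same features yield different actions as the pause grows, which is one plausible reading of how a single classifier can cover both back-channels and turn-taking.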


A Spoken Dialog System for Chat-Like Conversations Considering Response Timing

Nishimura, Kitaoka, Nakagawa


Voice interaction system with 3D-CG virtual agent for stand-alone smartphones

Yamamoto, Oura, Nishimura et al. 2014


In this paper, we propose a voice interaction system using 3D-CG virtual agents for stand-alone smartphones. Unlike existing mobile voice interaction systems, the proposed system handles speech recognition and speech synthesis on the smartphone itself, so users can talk naturally without encountering delays caused by network communications. Moreover, the proposed system can be fully customized by dialogue scripts, Java-based plugins, and Android APIs. Therefore, developers can easily build original voice interaction systems for smartphones based on the proposed system. We have made a subset of the proposed system available as open-source software. We expect that this system will contribute to studies of human-agent interaction using smartphones.


Small-Footprint Magic Word Detection Method Using Convolutional LSTM Neural Network

Yamamoto, Nishimura, Misaki et al. 2019


The number of consumer devices which can be operated by voice is increasing every year. Magic Word Detection (MWD), the detection of an activation keyword in continuous speech, has become an essential technology for the hands-free operation of such devices. Because MWD systems need to run constantly in order to detect Magic Words at any time, many studies have focused on the development of a small-footprint system. In this paper, we propose a novel, small-footprint MWD method which uses a convolutional Long Short-Term Memory (LSTM) neural network to capture frequency and time domain features over time. As a result, the proposed method outperforms the baseline method while reducing the number of parameters by more than 80%. An experiment on a small-scale device demonstrates that our model is efficient enough to function in real time.


Development and Evaluation of Spoken Dialog Systems with One or Two Agents through Two Domains

Todo, Nishimura, Yamamoto et al. 2013



Ryota Nishimura

Response Timing Detection Using Prosodic and Linguistic Information for Human-friendly Spoken Dialog Systems
