User modeling is an important component of dialog systems. Most previous approaches are rule-based; in this paper, we propose to represent user models with Bayesian networks. The Bayesian approach has several advantages over the rule-based approach. First, hand-written rules for updating user models are unnecessary, because belief updating follows probability and utility theory; this provides a more formal way of dealing with uncertainty. Second, the Bayesian network provides more detailed information about users' knowledge, because the degree of belief in each concept is expressed as a probability. We demonstrate these advantages through a preliminary experiment.
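The abstract does not specify the network structure, so as a minimal illustration of the "degree of belief per concept" idea, the sketch below models just two user-knowledge concepts, where knowing concept A makes knowing concept B more likely; all probability values are assumptions, not figures from the paper.

```python
# Hedged sketch: a two-concept Bayesian user model. Observing whether the
# user demonstrates knowledge of concept B updates the system's belief
# that the user knows concept A, via Bayes' rule -- no update rules needed.

def posterior_knows_a(p_a, p_b_given_a, p_b_given_not_a, knows_b):
    """Posterior belief that the user knows concept A, given an
    observation of whether they showed knowledge of concept B."""
    if knows_b:
        num = p_b_given_a * p_a
        den = p_b_given_a * p_a + p_b_given_not_a * (1.0 - p_a)
    else:
        num = (1.0 - p_b_given_a) * p_a
        den = (1.0 - p_b_given_a) * p_a + (1.0 - p_b_given_not_a) * (1.0 - p_a)
    return num / den

# Illustrative prior and conditional probabilities (assumed values).
belief = posterior_knows_a(p_a=0.5, p_b_given_a=0.9,
                           p_b_given_not_a=0.2, knows_b=True)
print(round(belief, 3))  # belief in A rises above the 0.5 prior
```

Because beliefs are plain probabilities, the dialog manager can read off a graded estimate of each concept rather than a binary known/unknown flag.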
SUMMARY
We propose a method for creating an N-gram language model for use in a speech-operated question-answering system. We note that input questions to such a system frequently consist of an initial section, relating to the query topic, and a formulaic sentence-final expression used in questions (a fixed phrase). While we can model the initial sections adequately using the target newspaper query corpus, we cannot model the fixed phrases adequately with this data source. In this paper we frame the problem as adapting a language model built from a generic corpus to fixed phrases, and propose an adaptation method that uses only a hand-made list of fixed phrases, rather than attempting the more difficult task of collecting an adaptation corpus. In the proposed method we identify the sections of the generic corpus that correspond to N-gram sequences on the fixed-phrase list, and perform language model adaptation by amplifying the probabilities of those N-grams; this is equivalent to maximum a posteriori (MAP) estimation that treats these partial N-gram sequences from the generic corpus itself as posterior information. We perform recognition experiments on spoken questions input to a question-answering system and confirm the effectiveness of the proposed method.
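The abstract does not give the exact weighting scheme, but the core step of amplifying N-grams that match the fixed-phrase list can be sketched as count merging over bigrams; the boost factor and the helper name `adapt_bigram_counts` are illustrative assumptions, not the paper's formulation.

```python
from collections import Counter

def adapt_bigram_counts(generic_bigrams, fixed_phrases, weight=5.0):
    """Amplify generic-corpus bigram counts for bigrams that occur in a
    hand-written list of fixed phrases (a count-merging view of MAP
    adaptation). `weight` is an assumed tuning parameter."""
    boosted = set()
    for phrase in fixed_phrases:
        toks = phrase.split()
        boosted.update(zip(toks, toks[1:]))  # bigrams inside each phrase
    adapted = Counter()
    for bigram, count in generic_bigrams.items():
        adapted[bigram] = count * weight if bigram in boosted else count
    return adapted

# Toy usage: question-final fixed phrases get their bigrams boosted.
generic = Counter({("could", "you"): 2, ("you", "are"): 3})
adapted = adapt_bigram_counts(generic, ["could you tell me"], weight=5.0)
```

Renormalizing the boosted counts into probabilities then yields the adapted N-gram model, without ever collecting a separate adaptation corpus.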
Deep neural networks (DNNs) have achieved significant success in the field of automatic speech recognition. One main advantage of DNNs is automatic feature extraction without human intervention. However, adaptation under limited available data remains a major challenge for DNN-based systems because of their enormous number of free parameters. In this paper, we propose a filterbank-incorporated DNN: a DNN-based acoustic model that incorporates a filterbank layer representing the filter shapes and center frequencies. Whereas most systems feed predefined mel-scale filterbank features to the DNN, the filterbank layer and the following networks of the proposed model are trained jointly, exploiting the advantages of hierarchical feature extraction. Filters in the filterbank layer are parameterized to represent speaker characteristics while minimizing the number of parameters. Optimizing one type of parameter corresponds to Vocal Tract Length Normalization (VTLN), and another corresponds to feature-space Maximum Likelihood Linear Regression (fMLLR) and feature-space Discriminative Linear Regression (fDLR). Since the filterbank layer consists of just a few parameters, it is advantageous for adaptation under limited available data. In experiments, filterbank-incorporated DNNs proved effective for speaker and gender adaptation with limited adaptation data. Experimental results on the CSJ task show that adapting the proposed model achieved a 5.8% relative word error reduction with 10 utterances compared with the unadapted model.
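The abstract does not detail the layer's parameterization, so the sketch below illustrates only the VTLN-like part: triangular filters whose center frequencies are scaled by a single speaker-dependent warp factor `alpha`. The filter shapes, bin counts, and helper names are assumptions for illustration, not the paper's exact design.

```python
import numpy as np

def triangular_filterbank(centers, n_fft_bins):
    """Build a filterbank matrix with one triangular filter per center
    (centers given in FFT-bin units, assumed increasing)."""
    fb = np.zeros((len(centers), n_fft_bins))
    edges = np.concatenate(([0.0], centers, [float(n_fft_bins - 1)]))
    bins = np.arange(n_fft_bins, dtype=float)
    for i in range(len(centers)):
        lo, c, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (bins - lo) / max(c - lo, 1e-8)
        falling = (hi - bins) / max(hi - c, 1e-8)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def warped_centers(base_centers, alpha):
    """VTLN-style warping: one scalar alpha shifts all center
    frequencies, so speaker adaptation touches a single parameter."""
    return np.clip(np.asarray(base_centers, dtype=float) * alpha, 1.0, None)

base = np.linspace(10, 120, 24)  # illustrative mel-like centers (bins)
fb = triangular_filterbank(warped_centers(base, alpha=0.95), n_fft_bins=128)
feats = np.log(fb @ np.ones(128) + 1e-6)  # features for a flat toy spectrum
```

In the proposed model these parameters would sit in the first layer and be updated jointly with the network by backpropagation, which is what keeps adaptation cheap when only a few utterances are available.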