2007
DOI: 10.1093/ietisy/e90-d.2.533

Average-Voice-Based Speech Synthesis Using HSMM-Based Speaker Adaptation and Adaptive Training

Cited by 142 publications
(119 citation statements)
“…This section aims to explain the predominant statistical modeling approach applied in speech synthesis, i.e., context-dependent multi-space probability distribution left-to-right without skip transitions HSMM [3,14] (simply called HSMM in the remainder of this paper). The discussion presented in this section provides a preliminary framework which will be used as a basis to introduce the proposed HMEM technique in Section 3.…”
Section: HSMM-based Speech Synthesis (mentioning)
confidence: 99%
“…Also, α_t(i) and β_t(i) are partial forward and backward probability variables that are calculated successively from their previous or next values as follows [3,14]:…”
Section: HSMM Likelihood (mentioning)
confidence: 99%
“…The mean vectors and covariance matrices of state output distributions of the target speakers model are obtained by linearly transforming the mean vectors and covariance matrices of state output distributions of the source speaker's model [16]. The same idea lies for CMLLR.…”
Section: Conception of the Speech Synthesizers (mentioning)
confidence: 99%
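
The last citation statement above describes adapting a source speaker's state output distributions by linearly transforming their mean vectors and covariance matrices. As a rough illustration only, the sketch below applies one affine transform to a single Gaussian, assuming the constrained form in which the same matrix acts on both the mean and the covariance (mean' = A·mean + b, cov' = A·cov·Aᵀ). The function name `adapt_gaussian`, the helpers, and the example values are hypothetical and are not taken from the cited papers; estimating A and b from adaptation data is not shown.

```python
# Minimal sketch (assumption: one shared matrix A transforms both the
# mean and the covariance of a Gaussian state output distribution).
# All names and numbers here are illustrative, not from the cited work.

def mat_vec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def mat_mul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    """Return the transpose of matrix A."""
    return [list(col) for col in zip(*A)]

def adapt_gaussian(mean, cov, A, b):
    """Apply the affine map x -> A x + b to a Gaussian N(mean, cov).

    Returns the transformed mean A*mean + b and covariance A*cov*A^T.
    """
    new_mean = [m + bi for m, bi in zip(mat_vec(A, mean), b)]
    new_cov = mat_mul(mat_mul(A, cov), transpose(A))
    return new_mean, new_cov

# Hypothetical source-speaker Gaussian and transform.
source_mean = [1.0, 2.0]
source_cov = [[1.0, 0.0], [0.0, 4.0]]
A = [[2.0, 0.0], [0.0, 0.5]]
b = [1.0, -1.0]

target_mean, target_cov = adapt_gaussian(source_mean, source_cov, A, b)
print(target_mean)  # [3.0, 0.0]
print(target_cov)   # [[4.0, 0.0], [0.0, 1.0]]
```

In practice such transforms are typically shared across many states and estimated from a small amount of target-speaker data; this sketch only shows how a given transform would be applied.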