2010
DOI: 10.1007/s00354-009-0091-y

Speech Structure and Its Application to Robust Speech Processing

Abstract: Speech communication consists of three steps: production, transmission, and hearing. Every step inevitably involves acoustic distortions due to gender differences, age, microphone- and room-related factors, and so on. In spite of these variations, listeners can extract linguistic information from speech as easily as if the communications had not been affected by variations at all. One may hypothesize that listeners modify their internal acoustic models whenever extralinguistic factors change. Another possibilit…

Cited by 21 publications (36 citation statements)
References 26 publications (43 reference statements)
“…Minematsu et al proposed a new method of representing speech, called speech structure, and proved that the acoustic variations, corresponding to any linear transformation in the cepstrum domain, can be effectively unseen in the representation [9]. This invariance is due to the invariance of the Bhattacharyya distance (BD), which is calculated using equation 2 and is proved to be invariant with any linear transform.…”
Section: Invariant Pronunciation Structure (mentioning)
confidence: 99%
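
The invariance claim in the statement above can be checked numerically. The sketch below is illustrative only and is not taken from the cited papers: it assumes the distributions are two-dimensional Gaussians, uses the standard closed form of the Bhattacharyya distance for Gaussians (which may or may not match the "equation 2" the statement refers to), and applies an arbitrary invertible affine map `x -> A x + b`. All function and variable names are hypothetical.

```python
import math

# Small 2x2 helpers so the sketch needs only the standard library.
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

def matvec(a, v):
    return [a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1]]

def bhattacharyya(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two 2-D Gaussians (assumed model)."""
    avg = [[(cov1[i][j] + cov2[i][j]) / 2 for j in range(2)] for i in range(2)]
    diff = [mu1[0] - mu2[0], mu1[1] - mu2[1]]
    tmp = matvec(inv2(avg), diff)
    maha = diff[0] * tmp[0] + diff[1] * tmp[1]
    log_term = math.log(det2(avg) / math.sqrt(det2(cov1) * det2(cov2)))
    return maha / 8 + 0.5 * log_term

def transform(mu, cov, A, b):
    """Gaussian parameters after the affine map x -> A x + b."""
    new_mu = [x + y for x, y in zip(matvec(A, mu), b)]
    new_cov = matmul(matmul(A, cov), transpose(A))
    return new_mu, new_cov

# Arbitrary example values (not data from the paper).
mu1, cov1 = [0.0, 0.0], [[1.0, 0.2], [0.2, 2.0]]
mu2, cov2 = [1.0, -1.0], [[1.5, -0.3], [-0.3, 0.8]]
A, b = [[2.0, 1.0], [0.5, 3.0]], [4.0, -7.0]  # A is invertible (det = 5.5)

before = bhattacharyya(mu1, cov1, mu2, cov2)
after = bhattacharyya(*transform(mu1, cov1, A, b), *transform(mu2, cov2, A, b))
print(before, after)  # the two values should agree up to rounding error
```

Both distributions must go through the same map for the distance to be preserved; transforming only one of them changes the result.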
“…The BD is calculated from any pair of distributions and the resulting full set of the BDs forms an invariant distance matrix. This matrix-based representation of an utterance is called pronunciation structure [9]. The structure only represents the local and global contrastive aspects of a given utterance, which is theoretically similar to Jakobson's structural phonology [10].…”
[Fig. 6: Speaker-independent pronunciation structure. Fig. 7: Inter-speaker structure difference [12].]
Section: Invariant Pronunciation Structure (mentioning)
confidence: 99%
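
As a rough illustration of the distance-matrix idea in the statement above, the following sketch builds a matrix of pairwise Bhattacharyya distances and checks that it is unchanged when every distribution goes through the same affine map. It is a simplification under stated assumptions, not the cited authors' implementation: each distribution is taken to be a one-dimensional Gaussian given as a `(mean, variance)` pair, and the names `bd_1d`, `structure`, and `affine` are hypothetical.

```python
import math

def bd_1d(m1, v1, m2, v2):
    """Bhattacharyya distance between two 1-D Gaussians (assumed model)."""
    return (0.25 * (m1 - m2) ** 2 / (v1 + v2)
            + 0.5 * math.log((v1 + v2) / (2 * math.sqrt(v1 * v2))))

def structure(gaussians):
    """Full matrix of pairwise distances over a list of (mean, variance)."""
    return [[bd_1d(*p, *q) for q in gaussians] for p in gaussians]

def affine(gaussians, a, b):
    """Parameters after x -> a*x + b: the mean is mapped, the variance scales by a**2."""
    return [(a * m + b, a * a * v) for m, v in gaussians]

# Arbitrary example values standing in for the distributions of one utterance.
utterance = [(0.0, 1.0), (2.0, 0.5), (-1.5, 2.0)]

original = structure(utterance)
warped = structure(affine(utterance, a=-3.0, b=10.0))

# Largest entry-wise difference between the two matrices.
max_gap = max(abs(x - y)
              for row_o, row_w in zip(original, warped)
              for x, y in zip(row_o, row_w))
print(max_gap)  # expected to be at rounding-error level
```

The matrix is symmetric with a zero diagonal, so only the entries above the diagonal carry information.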
“…Recently, a novel structural model of pronunciation was proposed [6], which works effectively to remove the nonlinguistic aspects of speech from speech acoustics and keep the linguistic aspects well at the same time. Since the nonlinguistic change of speech features is often modeled as feature transformation, the novel model is based on completely transform-invariant features, which is f-divergence [7].…”
Section: Introduction (mentioning)
confidence: 99%