2010 IEEE International Conference on Acoustics, Speech and Signal Processing
DOI: 10.1109/icassp.2010.5495697

Adaptive kernel canonical correlation analysis for estimation of task dynamics from acoustics

Abstract: We present a method for acoustic-articulatory inversion whose targets are the abstract tract variables from task dynamic theory. Towards this end we construct a non-linear Hammerstein system whose parameters are updated with adaptive kernel canonical correlation analysis. This approach is notably semi-analytical and applicable to large sets of data. Training behaviour is compared across four kernel functions and prediction of tract variables is shown to be significantly more accurate than state-of-the-art mixt…

Cited by 9 publications (6 citation statements)
References 13 publications
“…A common approach to extracting this space is through canonical correlation analysis (CCA), which finds pairs of maximally correlated projections of the data in the two views. CCA has been successfully applied to various tasks in speech [21,22,23,24,25,26], natural language processing [27], and computer vision [28,29]. A closely related technique is Partial Least Squares (PLS), which finds pairs of maximally covarying projections of the data in the two views.…”
Section: Stochastic Optimization For PLS
Citation type: mentioning
confidence: 99%
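The distinction drawn in the statement above, with CCA maximizing correlation and PLS maximizing covariance, can be illustrated with a minimal NumPy sketch. This is not the adaptive kernel method of the paper; it is plain linear CCA and PLS for the first pair of projections only. The function names `first_cca_pair` and `first_pls_pair` and the small ridge term `reg` are assumptions made for illustration.

```python
import numpy as np


def _inv_sqrt(S):
    """Inverse square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * w ** -0.5) @ V.T


def first_cca_pair(X, Y, reg=1e-8):
    """First pair of maximally correlated projections of two views.

    X is (n, p) and Y is (n, q), with rows as paired observations.
    `reg` is a hypothetical ridge term that keeps the covariances invertible.
    Returns (a, b, rho), where rho is the first canonical correlation.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    Wx, Wy = _inv_sqrt(Sxx), _inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the
    # canonical correlations.
    U, s, Vt = np.linalg.svd(Wx @ Sxy @ Wy)
    return Wx @ U[:, 0], Wy @ Vt[0], s[0]


def first_pls_pair(X, Y):
    """First pair of maximally covarying unit-norm projections.

    Same inputs as first_cca_pair, but without whitening: the SVD is
    taken of the raw cross-covariance matrix.
    Returns (u, v, cov), where cov is the covariance attained.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    U, s, Vt = np.linalg.svd(X.T @ Y / (n - 1))
    return U[:, 0], Vt[0], s[0]
```

The only difference between the two functions is the whitening step: CCA normalizes each view by its own covariance before the SVD, so the result is insensitive to the scale of each view, whereas PLS is not.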
“…In this work, we applied one exemplar-based speaker independent acoustic-to-articulatory inversion methods based on Ghosh and Narayanan (2011) and one deep neural network (DNN) based approach based on Uria et al (2011) to generate the estimated articulatory signals. It is worth noting that other types of acoustic-to-articulatory mapping, such as CCA (Bharadwaj et al, 2012; Arora and Livescu, 2013), Kernel CCA (Rudzicz, 2010; Arora and Livescu, 2013), Gaussian Mixture Model (GMM) (Ghosh and Narayanan, 2013; Ozbek et al, 2011; Özbek et al, 2012), attributes classification (Leung et al, 2004; Zhang et al, 2007; Siniscalchi et al, 2013, 2012) and articulatory phonological code (Zhuang et al, 2009), etc., could also be applied here. The reason to choose the exemplar-based speaker independent acoustic-to-articulatory inversion methods is that we can directly compare the performance against the real articulatory trajectories measurement to find out the gap which shows the potential for better speaker aware inversion techniques.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…The intuition is that articulatory measurements provide information about the linguistic content, and that much of the non-discriminative information in the two views is largely uncorrelated and therefore filtered out. CCA/KCCA have also been used with audio and video for speaker clustering [22] and identification [23]; for speaker normalization [24], where the views are the speakers; for articulatory inversion [25]; and to study critical articulators [26].…”
Section: Introduction
Citation type: mentioning
confidence: 99%