2013
DOI: 10.1121/1.4812765

Spatio-temporal articulatory movement primitives during speech production: Extraction, interpretation, and validation

Abstract: This paper presents a computational approach to derive interpretable movement primitives from speech articulation data. It puts forth a convolutive Nonnegative Matrix Factorization algorithm with sparseness constraints (cNMFsc) to decompose a given data matrix into a set of spatiotemporal basis sequences and an activation matrix. The algorithm optimizes a cost function that trades off the mismatch between the proposed model and the input data against the number of primitives that are active at any given instan…
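The decomposition described in the abstract can be illustrated with a minimal sketch of one common convolutive NMF formulation, in which the data matrix is approximated as a sum of basis slices applied to time-shifted copies of the activation matrix. This is an assumption for illustration only: the sparseness constraints of cNMFsc are omitted, and the names `shift_right`, `cnmf_reconstruct`, `W`, and `H` are hypothetical rather than taken from the paper.

```python
import numpy as np


def shift_right(H, t):
    """Shift the columns of H right by t steps, zero-filling on the left."""
    if t == 0:
        return H.copy()
    out = np.zeros_like(H)
    out[:, t:] = H[:, :-t]
    return out


def cnmf_reconstruct(W, H):
    """Reconstruct data from convolutive NMF factors (illustrative sketch).

    W: array of shape (M, K, T), K basis sequences, each M features by T frames.
    H: array of shape (K, N), activation of each basis over N frames.
    Returns V_hat of shape (M, N) = sum over t of W[:, :, t] @ shift_right(H, t).
    """
    M, K, T = W.shape
    V_hat = np.zeros((M, H.shape[1]))
    for t in range(T):
        V_hat += W[:, :, t] @ shift_right(H, t)
    return V_hat
```

Under this formulation, a single unit activation at frame n places a full copy of the corresponding basis sequence at frames n through n + T - 1, which is what makes each basis readable as a spatiotemporal primitive.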

Citations: cited by 32 publications (43 citation statements)
References: 39 publications
“…They derive these multiplicative updates by choosing an adaptive learning rate that makes additive terms cancel from standard gradient descent on the cost function. We will reproduce their derivation here, and detail how to extend it to the convolutional case [41] and apply several forms of regularization [30,36,7]. See Table 2 for a compilation of cost functions, derivatives and multiplicative updates for NMF and CNMF under several different regularization conditions.…”
Section: Deriving Multiplicative Update Rules
Citation type: mentioning
confidence: 99%
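The adaptive-learning-rate argument mentioned in the statement above can be sketched for plain (non-convolutive) NMF with a squared-error cost, 0.5 * ||V - WH||^2. The gradient with respect to H is W^T W H - W^T V; choosing the elementwise step size H / (W^T W H) makes the additive terms cancel and leaves a purely multiplicative update. The function names and the small constant `eps` (added to avoid division by zero) are assumptions for illustration, and no regularization or convolutive extension is included.

```python
import numpy as np


def multiplicative_step_H(V, W, H, eps=1e-12):
    """Multiplicative update: H <- H * (W^T V) / (W^T W H)."""
    return H * (W.T @ V) / (W.T @ W @ H + eps)


def gradient_step_H(V, W, H, eps=1e-12):
    """Gradient descent on 0.5 * ||V - W H||^2 with an adaptive step size.

    With eta = H / (W^T W H), the additive terms cancel and the step
    matches the multiplicative update (up to the tiny eps term).
    """
    grad = W.T @ W @ H - W.T @ V
    eta = H / (W.T @ W @ H + eps)
    return H - eta * grad


def nmf_multiplicative(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Alternate multiplicative updates of H and W; return factors and errors."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    errors = []
    for _ in range(n_iter):
        H = multiplicative_step_H(V, W, H, eps)
        # The W update is the same rule applied to the transposed problem.
        W = W * (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(0.5 * np.linalg.norm(V - W @ H) ** 2)
    return W, H, errors
```

Because every factor in the update is nonnegative, W and H stay nonnegative without any explicit projection step, which is the practical appeal of the multiplicative form.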
“…Techniques such as Electromagnetic Articulography, offering excellent temporal resolution, and Magnetic Resonance Imaging, offering superior spatial resolution, have enabled a much better analysis of the speech production process (Ramanarayanan et al, 2013). Many insights have thus been gained regarding the spatio-temporal details of speech generation.…”
Section: Robust Feature Extraction
Citation type: mentioning
confidence: 99%
“…These models are usually obtained by Magnetic Resonance Imaging (Badin et al 2002; Engwall 2000; Badin et al 2008) and animated by electromagnetic articulography data (Engwall 2003; Gibert et al 2012; Steiner et al 2013) or ultrasound images (Fabre et al 2014). Computational approaches such as convolutive Nonnegative Matrix Factorization could be used to derive interpretable movement primitives from speech production data (Ramanarayanan et al 2013). These speech movement primitives can be used to animate virtual agents’ speech articulators for a given set of activation data.…”
Section: Introduction
Citation type: mentioning
confidence: 99%