Toward invariant functional representations of variable surface fundamental frequency contours: Synthesizing speech melody via model-based stochastic learning

Xu, Yi; Prom-on, Santitham

doi:10.1016/j.specom.2013.09.013

Cited by 46 publications

(61 citation statements)

References 85 publications

(135 reference statements)

“…For the purpose of this analysis, only tokens evaluated on a perceptual basis as being uttered with a "normal" rate were selected, and no normalization was applied. The PRAAT scripts ProsodyPro (Xu & Prom-on 2014), and PENTAtrainer 1 (Prom-on, Xu & Thipakorn 2009) were then used to obtain the measurements of the prosodic correlates that include duration (in milliseconds), intensity (in decibels), mean F0 (average of 10 measurements over the syllable, in Hz), and excursion size (F0 maxima - minima, in semitones).…”

Section: Methods (mentioning)

confidence: 99%
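
The two F0 measures named in the statement above can be illustrated with a minimal sketch. It is not the ProsodyPro code and makes no claim about how that script samples F0. It assumes the conventional semitone conversion, 12 * log2(max / min), for excursion size; the function names and the sample values are hypothetical.

```python
import math


def mean_f0_hz(f0_samples_hz):
    """Arithmetic mean of F0 samples (Hz) taken over one syllable."""
    return sum(f0_samples_hz) / len(f0_samples_hz)


def excursion_size_st(f0_samples_hz):
    """F0 excursion in semitones: distance between the maximum and minimum F0.

    Assumption: the standard conversion 12 * log2(f_max / f_min) is used.
    """
    f0_max = max(f0_samples_hz)
    f0_min = min(f0_samples_hz)
    return 12.0 * math.log2(f0_max / f0_min)


# Hypothetical syllable with 10 F0 measurements in Hz (illustrative values only).
samples = [180.0, 185.0, 192.0, 200.0, 210.0, 215.0, 212.0, 205.0, 198.0, 190.0]
print(mean_f0_hz(samples), excursion_size_st(samples))
```

Expressing the excursion on a logarithmic scale makes it comparable across speakers with different pitch ranges, which a difference in Hz would not be.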

Fronted NPs in a verb-initial language – clause-internal or external? Prosodic cues to the rescue!

Simard

Wegener

2017

Glossa: A Journal of General Linguistics

This paper investigates prosodic features of fronted constituents in the verb-initial Oceanic language Gela (spoken by about 16,000 people in Solomon Islands). Although Gela's basic constituent order is verb-(object-)subject/predicate-subject, constituents can appear in front of the verbal predicate. Fronted constituents in Gela can be interpreted as pre-clausal (i.e. external to the following clause, immediately preceding it) or clause-initial (i.e. clause-internal, at the very beginning of the clause), each of which can be associated with certain information structure categories of topics and focus. This paper discusses how prosody provides clues towards the interpretation of fronted constituents as pre-clausal or clause-initial, based on a quantitative study of their prosodic correlates. We argue for using prosodic criteria established on clear examples to help analyse ambiguous cases. The results are compatible with an approach that recognises the importance of prosody in syntactic analysis and contribute data from a little-known language to the discussion of the degree to which prosodic and syntactic phrasing are aligned.

“…Although the majority of modern Chinese words are disyllabic and the uncertainty of disyllabic tonal realizations has partly been explained by individual backgrounds, how disyllabic JM words are realized in connected speech needs further investigation. Researchers have done fruitful studies on contextual tonal realizations and their interactions with sentential prosodies (Chen, 2010; Chen and Gussenhoven, 2008; Xu, 1997, 1999; Xu and Prom-on, 2014; Xu and Wang, 2001). However, how to transfer this knowledge from SC to the other Chinese dialects and how the predictors we investigated work in context are still open questions.…”

Section: Limitations (mentioning)

confidence: 99%

Predicting tonal realizations in one Chinese dialect from another

Chen

Heuven

et al. 2016

Speech Communication

Pronunciation dictionaries are usually expensive and time-consuming to prepare for the computational modeling of human languages, especially when the target language is under-resourced. Northern Chinese dialects are often under-resourced but used by a significant number of speakers. They share the basic sound inventories with Standard Chinese (SC). Also, their words usually share the segmental realizations and logographic written forms with the SC translation equivalents. Hence the pronunciation dictionaries of northern Chinese dialects could be easily available if we were able to predict the tonal realizations of the dialect words from the tonal information of their SC counterparts. This paper applies statistical modeling to investigate the tonal aspect of the related words between a northern dialect, i.e. Jinan Mandarin (JM), and Standard Chinese (SC). Multi-linear regression models were built with between-word pitch distance of JM words as the dependent variable and the following were included as the predictors: SC tonal relations, between-dialect tonal identity, and individual backgrounds. The results showed that tonal relations in SC and between-dialect identity, as predictors featuring the relation between the JM and SC tonal systems, are significant and robust predictors of JM tonal realizations. The speakers' sociolinguistic and cognitive backgrounds, together with the tonal merge and neutral tone information within JM, are important for the prediction of JM tonal realizations and affect the way that between-language predictors take effect.

“…PENTA has been implemented to perform both local and global optimization methods [2,9]. The detailed implementation of PENTA with global optimization is given in [9]. Target approximation (TA) in PENTA is mathematically realized as a third-order critically damped linear system driven by pitch targets, as shown in:…”

Section: Target Approximation (TA) in PENTA Model (mentioning)

confidence: 99%
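
The statement above is truncated before its equation, and that equation is not reproduced here. As an illustration only, the sketch below shows one common way a third-order critically damped response to a linear pitch target is written: a target line m*t + b plus a decaying transient whose coefficients are fixed by the initial F0, velocity, and acceleration. The function name, parameter names, and values are hypothetical, and the form is an assumption rather than the citing paper's own formula.

```python
import math


def target_approximation(t, m, b, lam, f0_init, vel_init=0.0, acc_init=0.0):
    """F0 at time t (s) after syllable onset, approaching the target m*t + b.

    Assumed form: f0(t) = (m*t + b) + (c1 + c2*t + c3*t**2) * exp(-lam*t),
    where lam is the rate of target approximation and c1..c3 are chosen so
    that f0, its first derivative, and its second derivative match the
    initial state at t = 0.
    """
    c1 = f0_init - b
    c2 = vel_init + c1 * lam - m
    c3 = (acc_init + 2.0 * c2 * lam - c1 * lam ** 2) / 2.0
    transient = (c1 + c2 * t + c3 * t * t) * math.exp(-lam * t)
    return (m * t + b) + transient


# Hypothetical static target at 90 (e.g. semitones), starting from 85.
contour = [target_approximation(k * 0.01, m=0.0, b=90.0, lam=40.0, f0_init=85.0)
           for k in range(21)]
print(contour[0], contour[-1])
```

In a multi-syllable contour, the final F0, velocity, and acceleration of one syllable would normally be passed on as the initial state of the next, so that the surface contour stays continuous while the underlying targets change.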

“…To test the CPP program, we took subsets of two corpora used previously in the development of PENTAtrainer2 [9]. The first corpus was collected for a study of tone, focus and sentence modality in Mandarin Chinese and the second one was collected for a study of stress, focus and sentence modality in American English [12].…”

Section: Test Dataset (mentioning)

confidence: 99%

“…In terms of learning algorithms, at the present stage, all four models have been implemented with local curve fitting capabilities, while only PENTA has been implemented with full-fledged global optimization algorithms in PENTAtrainer2 [9]. So this paper will focus mainly on the local fitting capabilities of all models, the software package that provides a means of comparing all models simultaneously and the results of testing them on Mandarin and English test corpora.…”

Section: Introduction (mentioning)

confidence: 99%

The Common Prosody Platform (CPP): Where theories of prosody can be directly compared

Prom-on

et al. 2016

Speech Prosody 2016

Self Cite

This paper introduces the Common Prosody Platform (CPP), a computational platform that implements major theories and models of prosody. CPP aims at a) adapting theory-specific assumptions into computational algorithms that can generate surface prosodic forms, and b) making all the models trainable through global optimization based on automatic analysis-by-synthesis learning. CPP allows examination of prosody in much finer detail than has been previously done and provides a means for speech scientists to directly compare theories and their models. So far, four theories have been included in the platform, the Command-Response model, the Autosegmental-Metrical theory, the Task Dynamic model, and the Parallel Encoding and Target Approximation model. Preliminary tests show that all the implemented models can achieve good local contour fitting with low errors and high correlations.
