Dimensions of Segmental Variability: Interaction of Prosody and Surprisal in Six Languages

Malisz, Zofia; Brandt, Erika; Möbius, Bernd; Oh, Yoon Mi; Andreeva, Bistra

doi:10.3389/fcomm.2018.00025

Cited by 14 publications

(20 citation statements)

References 61 publications

“…The RMarkdown script containing all the R code needed to reproduce the results and plots reported here. References (50)(51)(52)(53)(54)(55)(56)(57)(58)(59) program (ANR-11-IDEX-0007) operated by the National Research Agency (ANR). Author contributions: C.C., Y.O., and F.P.…”

Section: Supplementary Materials (mentioning)

confidence: 99%

Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche

et al. 2019

Self Cite

Language is universal, but it has few indisputably universal characteristics, with cross-linguistic variation being the norm. For example, languages differ greatly in the number of syllables they allow, resulting in large variation in the Shannon information per syllable. Nevertheless, all natural languages allow their speakers to efficiently encode and transmit information. We show here, using quantitative methods on a large cross-linguistic corpus of 17 languages, that the coupling between language-level (information per syllable) and speaker-level (speech rate) properties results in languages encoding similar information rates (~39 bits/s) despite wide differences in each property individually: Languages are more similar in information rates than in Shannon information or speech rate. These findings highlight the intimate feedback loops between languages’ structural properties and their speakers’ neurocognition and biology under communicative pressures. Thus, language is the product of a multiscale communicative niche construction process at the intersection of biology, environment, and culture.

“…The tendency to reduce predictable expressions has been observed on different levels of linguistic analysis, ranging from phonetics [1, 38–47] and morphology [48] to the omission of predictable words [4, 6–9, 49]. If this principle applies to fragments as well, we expect that fragments are more strongly preferred over the corresponding full sentence if the omission of words that are predictable in a specific context results in a well-formed fragment.…”

Section: Predictability Effects On Omissions In Fragments (mentioning)

confidence: 99%

Modeling the predictive potential of extralinguistic context with script knowledge: The case of fragments

2021

We describe a novel approach to estimating the predictability of utterances given extralinguistic context in psycholinguistic research. Predictability effects on language production and comprehension are widely attested, but so far predictability has mostly been manipulated through local linguistic context, which is captured with n-gram language models. However, this method does not allow investigating predictability effects driven by extralinguistic context. Modeling effects of extralinguistic context is particularly relevant to discourse-initial expressions, which can be predictable even if they lack linguistic context altogether. We propose to use script knowledge as an approximation to extralinguistic context. Since the application of script knowledge involves the generation of predictions about upcoming events, we expect that scripts can be used to manipulate the likelihood of linguistic expressions referring to these events. Previous research has shown that script-based discourse expectations modulate the likelihood of linguistic expressions, but script knowledge has often been operationalized with stimuli which were based on researchers’ intuitions and/or expensive production and norming studies. We propose to quantify the likelihood of an utterance based on the probability of the event to which it refers. This probability is calculated with event language models trained on a script knowledge corpus and modulated with probabilistic event chains extracted from the corpus. We use the DeScript corpus of script knowledge to obtain empirically founded estimates of the likelihood of an event to occur in context without having to resort to expensive pre-tests of the stimuli. We exemplify our method with a case study on the usage of nonsentential expressions (fragments), which shows that utterances that are predictable given script-based extralinguistic context are more likely to be reduced.

“…We contend that just as human speech production is highly variable and comes in many different "styles", which are continuously adapted by speakers given dynamically changing social (tutoring, chatting, arguing, counseling...), individual (hearing problems, attitude, level of distraction, motivation, familiarity), linguistic (frequency, predictability, surprisal, importance) or environmental settings (external noise, mutual visibility, ...) [11,12,13,14,15,16,17,18]. Due to this inherent contextual embedding, human speech production can never be "neutral" or "perfectly natural", and no speaking style therefore qualifies as a reference signal that a speech event of inherently less quality, e.g.…”

Section: Contextual Appropriateness As Metric Of Speech Quality? (mentioning)

confidence: 99%

Speech Synthesis Evaluation — State-of-the-Art Assessment and Suggestion for a Novel Research Program

Wagner, Beskow, Betz et al. 2019

10th ISCA Workshop on Speech Synthesis (SSW 10)

Self Cite

Speech synthesis applications have become ubiquitous, in navigation systems, digital assistants or as screen or audio book readers. Despite their impact on the acceptability of the systems in which they are embedded, and despite the fact that different applications probably need different types of TTS voices, TTS evaluation is still largely treated as an isolated problem. Even though there is strong agreement among researchers that the mainstream approaches to Text-to-Speech (TTS) evaluation are often insufficient and may even be misleading, there exist few clear-cut suggestions as to (1) how TTS evaluations may be realistically improved on a large scale, and (2) how such improvements may lead to an informed feedback for system developers and, ultimately, better systems relying on TTS. This paper reviews the current state-of-the-art in TTS evaluation, and suggests a novel user-centered research program for this area.
