Interspeech 2016
DOI: 10.21437/interspeech.2016-926

Analyzing the Contribution of Top-Down Lexical and Bottom-Up Acoustic Cues in the Detection of Sentence Prominence

Abstract: Recent work has suggested that prominence perception could be driven by the predictability of the acoustic prosodic features of speech. On the other hand, lexical predictability and part of speech information are also known to correlate with prominence. In this paper, we investigate how the bottom-up acoustic and top-down lexical cues contribute to sentence prominence by using both types of features in unsupervised and supervised systems for automatic prominence detection. The study is conducted using a corpus…

Cited by 7 publications (5 citation statements)
References 34 publications (25 reference statements)

“…Furthermore, the results from the binary prominence classification provided additional evidence of the importance of the three acoustic correlates of prominence for Dutch. These results are also close to those of an earlier study on the same data [13].…”
Section: Discussion (supporting)
confidence: 92%
“…Prominence is a prosodic phenomenon that conveys the subjective impression of emphasis and is defined as the perception of a linguistic unit standing out from its environment (see [4,5,6] for related definitions). Earlier, many studies focused on determining the acoustic correlates of prominence [7,8,9], and, more recently, on methods for its automatic detection [10,11,12,13,14]. One interesting aspect on the study of prominence that has been enabled by the success of deep neural networks (DNNs), and that has not been widely explored, is whether DNNs are capable of learning prominence-like representations of speech.…”
Section: Introduction (mentioning)
confidence: 99%
“…Specifically, it was observed (i) that the tilt measures behave differently for the distinct statistical descriptors, (ii) there are differences in the performance (class separability) for the distinct tilt measures in clean speech, and (iii) noise degradation greatly impacts prominence class separation from around 10 dB SNR, largely diminishing the differences between the estimators. The results also support the finding that tilt is an important correlate for prominence, at least for Dutch (see, e.g., [10,33], see also [34]). Fig.…”
Section: Discussion (supporting)
confidence: 87%
“…The results showed that the best overall feature combination was that of energy, F0, and duration in both corpora whereas combinations of different tilt measures alone and together with energy, F0, and duration, did not seem to improve class separation. Although surprising, earlier studies have also observed little improvements with the addition of tilt measures in supervised classification tasks (see, e.g., Kakouros, Pelemans, Verwimp, Wambacq, & Räsänen, 2016;Streefkerk, Pols, & ten Bosch, 1999). To further examine the effect of tilt in prominence class separation performance, a combination of the best performing basic features (energy, F0, duration) together with the single best performing tilt measures from the source (DNNC6D) and surface (SLF6D) groups were tested and showed no improvements beyond what was achieved with a combination of all basic features.…”
Section: Discussion (mentioning)
confidence: 91%