2011
DOI: 10.1121/1.3592233

Automatic estimation of voice onset time for word-initial stops by applying random forest to onset detection

Abstract: The voice onset time (VOT) of a stop consonant is the interval between its burst onset and voicing onset. Among a variety of research topics on VOT, one that has been studied for years is how VOTs are efficiently measured. Manual annotation is a feasible way, but it becomes a time-consuming task when the corpus size is large. This paper proposes an automatic VOT estimation method based on an onset detection algorithm. At first, a forced alignment is applied to identify the locations of stop consonants. Then a …

Cited by 21 publications (19 citation statements)
References 35 publications

“…The algorithm's performance can then be reported either as the full (empirical) CDF of automatic/manual Lin and Wang, 2011). Reporting statistics about the CDF of automatic/manual differences is a standard evaluation method in ASR tasks, such as forced alignment of phoneme sequences, where the goal is to predict the location of boundaries in a speech segment (e.g., Brugnara et al., 1993; Keshet et al., 2007).…”
Section: Distribution Of Automatic/manual Difference (mentioning)
confidence: 99%
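
The evaluation described in this statement can be illustrated with a minimal sketch, assuming VOT is taken as voicing onset minus burst onset (as in the abstract) and that the CDF is summarized as the fraction of tokens whose automatic/manual difference falls within a tolerance. The function names, tolerances, and all onset times below are hypothetical illustration values, not data or code from any of the cited papers.

```python
from typing import Sequence


def vot_ms(burst_onset_ms: float, voicing_onset_ms: float) -> float:
    """VOT as the interval from burst onset to voicing onset (milliseconds)."""
    return voicing_onset_ms - burst_onset_ms


def fraction_within(auto_ms: Sequence[float],
                    manual_ms: Sequence[float],
                    tolerance_ms: float) -> float:
    """Empirical CDF of |automatic - manual| evaluated at one tolerance."""
    if len(auto_ms) != len(manual_ms) or len(auto_ms) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [abs(a - m) for a, m in zip(auto_ms, manual_ms)]
    return sum(d <= tolerance_ms for d in diffs) / len(diffs)


# Hypothetical manual annotations: (burst onset, voicing onset) in ms.
manual = [vot_ms(100, 160), vot_ms(200, 215), vot_ms(50, 130), vot_ms(10, 35)]
# Hypothetical automatic VOT estimates for the same four tokens, in ms.
auto = [62, 30, 75, 25]

# Report the CDF at a few tolerances (values chosen for illustration only).
for tol in (5, 10, 20):
    print(f"within {tol} ms: {fraction_within(auto, manual, tol):.2f}")
```

Evaluating the function over a grid of tolerances gives the full empirical CDF; reporting it at a few fixed tolerances gives the summary statistics the statement refers to.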
“…Previous work has used automatic measurements for speech recognition tasks Ramesh, 1998, 2003; Ali, 1999; Stouten and van Hamme, 2009), phonetic measurement (Fowler et al., 2008; Tauberer, 2010), and accented speech detection (Kazemzadeh et al., 2006; Hansen et al., 2010). Some studies, like ours, focus largely on the problem of VOT measurement itself, and evaluate the proposed algorithm by comparing automatic and manual measurements (Stouten and van Hamme, 2009; Yao, 2009; Hansen et al., 2010; Lin and Wang, 2011). Our approach differs from all previous studies except one (Lin and Wang, 2011) in an important aspect.…”
Section: Introduction (mentioning)
confidence: 98%
“…and Van Hamme (RS) [5], the random-forest-based method by Lin and Wang (RF) [9] and structured-prediction-based method by Sonderegger and Keshet (SP) [8]. All of these studies report results on the TIMIT database using the same validation criterion.…”
Section: Results (mentioning)
confidence: 99%
“…Methods for the measurement of VOT fall into two categories: (a) those which explicitly identify the locations of the burst and voicing onsets through a set of customized acoustic-phonetic rules (knowledge-based) [4,6], and (b) those which train a learning machine (such as random forest, support vector machine) to estimate the VOT using some acoustic features corresponding to the stop-to-voiced-phone transition event [8,9]. Many of the high performing methods require phonetic transcription either to identify the segment of the speech signal containing the stop consonant through forced-alignment [4,9] or to focus the analysis on segments of the signal containing only one stop consonant [8]. Such methods are difficult to employ in a scenario where there is no transcription available.…”
Section: Motivation (mentioning)
confidence: 99%