7th European Conference on Speech Communication and Technology (Eurospeech 2001) 2001
DOI: 10.21437/eurospeech.2001-484

The IFA corpus: a phonemically segmented Dutch "open source" speech database

Abstract: An open source database of hand-segmented Dutch speech was constructed with off-the-shelf software using speech from 8 speakers in a variety of speaking styles. For a total of 50,000 words, speech acquisition and preparation took around 3 person-weeks per speaker. Hand segmentation took 1,000 hours of labeling altogether. The asymptotic segmentation speed was about one word, or four boundaries, per minute. An evaluation showed that the Median Absolute Difference of the segment boundaries was 6 ms between label…
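
As a rough sanity check on the figures in the abstract, the sketch below compares the reported 1,000 labeling hours with the time 50,000 words would take at the asymptotic speed of one word per minute, and shows one common way to compute a median absolute difference between two sets of boundary times. The function names and the sample boundary values are illustrative assumptions, not taken from the paper, and the abstract is truncated before it states exactly which boundary sets were compared.

```python
from statistics import median

# Figures quoted in the abstract.
TOTAL_WORDS = 50_000
LABELING_HOURS = 1_000
ASYMPTOTIC_WORDS_PER_MINUTE = 1.0


def hours_at_asymptotic_speed(words, words_per_minute):
    """Hours needed if every word were segmented at the asymptotic speed."""
    return words / words_per_minute / 60.0


def average_minutes_per_word(words, hours):
    """Average labeling time per word implied by the total effort."""
    return hours * 60.0 / words


def median_absolute_difference(boundaries_a, boundaries_b):
    """Median of |a - b| over paired boundary times (same unit for both)."""
    return median(abs(a - b) for a, b in zip(boundaries_a, boundaries_b))


if __name__ == "__main__":
    print(hours_at_asymptotic_speed(TOTAL_WORDS, ASYMPTOTIC_WORDS_PER_MINUTE))
    print(average_minutes_per_word(TOTAL_WORDS, LABELING_HOURS))
    # Hypothetical boundary times in milliseconds from two labeling passes.
    print(median_absolute_difference([100, 250, 400], [106, 245, 410]))
```

Under these figures, labeling at the asymptotic speed alone would take roughly 833 hours, so the reported 1,000 hours corresponds to an average of about 1.2 minutes per word.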

Citations: cited by 32 publications (7 citation statements)
References: references 3 publications (4 reference statements)
“…Each run consisted of 100 Dutch words, which were displayed for two seconds followed by a fixation cross for one second. Words were drawn randomly from a phonetically balanced list of 250 words 19 .…”
mentioning
confidence: 99%
“…Stimuli were Pseudosentences from the IFA corpus read by 10 Dutch speakers (5F) (Van Son et al, 2001). In Experiment 1, the parameters of the male-like Long VTL target are $\phi = 510$ Hz, $F_0 = 120$ Hz, rate = 3.8 syll/s.…”
Section: Methods
mentioning
confidence: 99%
“…To pseudonymize the formants, the target frequencies, represented in terms of VTLs $\phi_i = F_i/(2i-1)$, can be randomly chosen in the range $\phi_i \pm 40$ and $\phi_i \pm 75$ Hz for $F_{0\text{–}1}$ and $F_{2\text{–}5}$, respectively, and the intensities can be randomly chosen in the range 64 ± 4.5, 67 ± 2.5, 58 ± 4.5, 50 ± 8, 47 ± 10, 45 ± 9 dB ($I_{0\text{–}5} \pm 2\,\mathrm{SD}$), where SD denotes standard deviation. These values were chosen based on ranges found in the speakers in the IFA corpus (Van Son et al, 2001) (5M/5F, see Experiment 1, Section 3.1.1). In an alternative setting where a given speaker is to be pseudonymized to a specific target speaker, the parameters $\phi$, $\phi_i$ ($i = 1 \ldots 5$), $I_i$ ($i = 0 \ldots 5$) and speaking rate can be pre-computed across several of the target speaker's utterances (preferably over 300 s spoken in a comparable style) and used.…”
Section: Target Parameters For Pseudonymization
mentioning
confidence: 99%
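
The mapping in the statement above between a formant frequency and its VTL-style parameter, phi_i = F_i / (2i - 1), can be illustrated with a minimal sketch. The function names are hypothetical, the sketch covers only formant indices i >= 1, and drawing the target uniformly within the quoted half-range is an assumption: the citing paper says only that targets "can be randomly chosen" in the range.

```python
import random


def vtl_parameter(formant_hz, i):
    """phi_i = F_i / (2i - 1) for formant index i >= 1."""
    if i < 1:
        raise ValueError("formant index must be >= 1")
    return formant_hz / (2 * i - 1)


def formant_from_vtl(phi_hz, i):
    """Inverse mapping: F_i = phi_i * (2i - 1)."""
    if i < 1:
        raise ValueError("formant index must be >= 1")
    return phi_hz * (2 * i - 1)


def sample_target(phi_hz, half_range_hz, rng):
    """Draw a target in [phi - half_range, phi + half_range].

    Uniform sampling is an assumption made for this sketch.
    """
    return rng.uniform(phi_hz - half_range_hz, phi_hz + half_range_hz)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    # Illustrative formants of a uniform tube: 500, 1500, 2500 Hz.
    for i, f in enumerate([500.0, 1500.0, 2500.0], start=1):
        phi = vtl_parameter(f, i)
        target = sample_target(phi, 40.0 if i == 1 else 75.0, rng)
        print(i, phi, formant_from_vtl(target, i))
```

For the illustrative uniform-tube formants, every phi_i equals 500 Hz, which is why the citing text can describe formant targets with a single VTL-like quantity per formant.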
“…As two representative examples let me shortly describe the 10-millions words Spoken Dutch Corpus (CGN) [17] and the much smaller but fully phonemically segmented IFA-corpus [25].…”
Section: Speech Databases
mentioning
confidence: 99%
“…All compiled data tables are fed into a PostgresSQL database for subsequent data mining. Some initial results about speaking rates and phoneme durations are already available [25], but more intricate questions can hopefully be solved using the powerful query language SQL. The corpus is freely available and accessible on line under the GNU General Public License (http://www.fon.hum.uva.nl/IFAcorpus/).…”
Section: Speech Databases
mentioning
confidence: 99%