Proceedings of the Workshop on Speech and Natural Language - HLT '89 1989
DOI: 10.3115/1075434.1075456

The collection and preliminary analysis of a spontaneous speech database

Abstract: As part of our effort in developing a spoken language system for interactive problem solving, we recently collected a sizeable amount of speech data. This database is composed of spontaneous sentences which were collected during a simulated human/machine dialogue. Since a computer log of the spoken dialogue was maintained, we were able to ask the subjects to provide read versions of the sentences as well. This paper documents the data collection process, and provides some preliminary analyses of the collected …

Cited by 19 publications (13 citation statements)
References 6 publications
“…Finally, the appropriateness of the overall system response is assessed by a panel of naive subjects. Unless otherwise specified, all evaluations were done on the designated test set [3], consisting of 485 and 501 spontaneous and read sentences, respectively, spoken by 5 male and 5 female subjects. The average number of words per sentence is 7.7 and 7.6 for the spontaneous and read speech test sets, respectively.…”
Section: Performance Evaluation (mentioning)
confidence: 99%
“…We first converted the system so that it could generate responses in Japanese. This enabled us to collect data from native speakers of Japanese in a wizard mode whereby an experimenter would translate the subjects' spoken input and type the resulting English queries to the system [3,9]. Once data were available we were able to port the speech recognition and language understanding components.…”
Section: Japanese Implementation (mentioning)
confidence: 99%
“…We have incorporated into SUMMIT a simple algorithm which uses scores for speech and silence based on the distributions of eight principal components of the mean rate response outputs of an auditory model [11]. We trained the system by collecting histograms of parameter distributions for phonetically transcribed utterances from a spontaneous-speech database [13]. The probability of speech is computed on a frame by frame basis, after some temporal smoothing.…”
Section: Boundary Modifications (mentioning)
confidence: 99%