Interspeech 2008
DOI: 10.21437/interspeech.2008-415

Development of the SRI/Nightingale Arabic ASR system

Abstract: We describe the large vocabulary automatic speech recognition system developed for Modern Standard Arabic by the SRI/Nightingale team, and used for the 2007 GALE evaluation as part of the speech translation system. We show how system performance is affected by different development choices, ranging from text processing and lexicon to decoding system architecture design. Word error rate results are reported on broadcast news and conversational data from the GALE development and evaluation test sets.

Cited by 20 publications
(3 citation statements)
References 15 publications
“…The basic acoustic models are trained based on Maximum Likelihood (ML) method. Then, a discriminative training based on Minimum Phone Error (MPE) criterion is performed to enhance the models [15,16].…”
Section: Methods (mentioning)
confidence: 99%
“…There has been a lot of process on this task over the last couple of years, see e.g. [3,4,5,6,7]. This paper describes the progress of work at CMU since our initial efforts in 2006 [8], using the JRTk/Ibis toolkit [9].…”
Section: The GALE Speech-to-text Task (mentioning)
confidence: 99%
“…Arabic spoken corpora have been primarily gathered from radio and television news broadcasts and phone calls [12]. Because of the limitations of the available spoken corpora, Arabic ASR research and applications have been limited to particular domains, such as Arabic digits [15] [16], broadcast news [19], command and control [15], The Holy Qur'an [15] [24], and Arabic proverbs [20]. Limited text and speech Arabic corpora are also a major problem for Arabic ASR researchers who are seeking to apply Arabic ASR to a broader range of applications.…”
Section: Introduction (mentioning)
confidence: 99%