2008
DOI: 10.1109/icassp.2008.4517924

Phonetic pronunciations for Arabic speech-to-text systems

Abstract: In this paper two aspects of generating and using phonetic Arabic dictionaries are described. First, the use of single pronunciation acoustic models in the context of Arabic large vocabulary Automatic Speech Recognition (ASR) is investigated. These have been found to be useful for English ASR systems, when combined with standard multiple pronunciation systems. The second area examined is automatically deriving phonetic "pronunciations" for words that standard approaches, such as the Buckwalter Morphological An…

Cited by 9 publications
(6 citation statements)
References 7 publications
(17 reference statements)
“…Hard segmentation is obtained by assigning each voxel to the tissue class with the highest probability. In order to recover from singular covariance matrices, a regularization constant of 1e-3 was added to the diagonal elements of the covariance matrices (Diehl et al, 2011). …”
Section: Methods (mentioning)
confidence: 99%
“…Pass three performs acoustic lattice rescoring, applying constrained Maximum Likelihood Linear Regression (MLLR) followed by lattice-MLLR and Confusion Network (CN) decoding. Further details on the training and decoding configuration can be found in [9].…”
Section: System Description (mentioning)
confidence: 99%
“…Out-of-vocabulary rates for the 260k and 350k wordlists is the subset of the 350K word-list for which phonetic pronunciations could be obtained using Buckwalter. The 90K missing pronunciations were found using automatically derived rules described in [11]. The out-of-vocabulary (OOV) rates for the three test sets used are shown in table 2.…”
Section: System Description (mentioning)
confidence: 99%
“…The second, a phonetic system (V1), was based on 39 phones, the graphemic ones plus the three short vowels. For further details of the two systems see [11]. Both models used a 39-dimensional PLP-based front-end which used 13 PLP cepstra, including the zeroth cepstral coefficient with first, second and third delta parameters appended followed by an HLDA projection from 52-dimensions down to 39.…”
Section: System Description (mentioning)
confidence: 99%