Advances in Arabic broadcast news transcription at RWTH

Rybach, David; Hahn, Stefan; Gollan, Christian; Schlüter, Ralf; Ney, Hermann

doi:10.1109/asru.2007.4430154

Cited by 14 publications

(14 citation statements)

References 11 publications

“…Many developments have been done in the field of voice recognition [9] and mostly all the developed proposed methods are based on: Voice analysis, feature extraction, modeling and matching [10]. These methods [11], [12], [13] and [14] are mostly suffering from the efforts needed to perform the process of voice recognition and the lowest recognition ratio. A method proposed in [15] is a good example of one of the current development which is based on extracting voice features based on voice analysis to calculate some parameters such as estimated population (mu), dynamic range, peak factor, Power spectral density, and zero crossing rate.…”

Section: Fig 2: Histogram Of the Birdwav (mentioning)

confidence: 99%

A Novel Methodology to Extract Voice Signal Features

Khawatreh, Ayyoub, Abu-Ein et al. 2018

IJCA

A novel methodology is introduced for processing a wave file and creating a feature array for it; this array can later be used to recognize the voice file. A set of experiments is performed to demonstrate that the calculated feature array is unique, i.e., that the feature array created for a given wave file does not match the feature array of any other wave file. The proposed methodology reduces the effort of voice recognition by minimizing both the time needed to create the feature array and the size of the calculated array. General Terms: Voice recognition, artificial intelligence. Keywords: Wave file, feature array, histogram.

“…The first two bnad06 and bcad06 were defined by BBN technology and consist of about 3 hours of BN and BC data respectively (collected during Dec05-Jan06). These test sets comprise com- 3 For the SPron system this only distinguished between a silence model being at the end of a word or not. …”

Section: Multi-pass Combination Framework (mentioning)

confidence: 99%

“…As part of the training process it is necessary to obtain pronunciations for words that can not be handled by Buckwalter [4,3]. A series of rules were automatically generated from a 250K Buckwalter derived phonetic dictionary.…”

Section: Automatic Pronunciation Generation (mentioning)

confidence: 99%

“…Minimum Phone Error (MPE) discriminative training was used to train all the acoustic models. Pronunciation probabilities were used for the phonetic systems 3 . The N-gram language models were trained on data from 22 Arabic sources, including the acoustic transcriptions.…”

Section: Acoustic and Language Models (mentioning)

confidence: 99%

“…Recently there has been much interest in the problems associated with transcribing Arabic audio [1,2,3]. There are a number of issues to be addressed for success due to the nature of the Arabic language.…”

Section: Introduction (mentioning)

confidence: 99%

Phonetic pronunciations for Arabic speech-to-text systems

Diehl, Gales, Tomalin et al. 2008

2008 IEEE International Conference on Acoustics, Speech and Signal Processing

In this paper two aspects of generating and using phonetic Arabic dictionaries are described. First, the use of single pronunciation acoustic models in the context of Arabic large vocabulary Automatic Speech Recognition (ASR) is investigated. These have been found to be useful for English ASR systems, when combined with standard multiple pronunciation systems. The second area examined is automatically deriving phonetic "pronunciations" for words that standard approaches, such as the Buckwalter Morphological Analyzer, cannot handle. Without pronunciations for these words the OOV rates for various Arabic tasks significantly increase. Here, pronunciations are automatically found by first deriving grapheme-to-phone rules, and associated rule probabilities. These are then used to produce the most likely pronunciation, or pronunciations, for any word. These approaches are evaluated on a large vocabulary Arabic Broadcast News and Broadcast Conversation transcription task. Both schemes are found to yield gains with a multi-pass/combination framework.
