2007
DOI: 10.3844/ajassp.2007.23.32

CASRA+: A Colloquial Arabic Speech Recognition Application

Abstract: This research proposes an Arabic speech recognition application that concentrates on the Lebanese dialect. The system starts by sampling the speech, that is, converting the sound from analog to digital, and then extracts features using Mel-Frequency Cepstral Coefficients (MFCC). The extracted features are then compared with the system's stored model, which in this case is a phoneme-based model. This reference model differs from the direct word template ma…

Cited by 18 publications
(3 citation statements)
References 19 publications
“…The rate at which zero crossings occur is a simple measure of the frequency content of a signal. Zero crossing rate is therefore a measure of number of times in a given time interval that the amplitude of the signals passes through a value of zero (Haraty and Ariss, 2007).…”
Section: Long Term Spectrum (mentioning)
confidence: 99%
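
The zero crossing rate described in the statement above can be illustrated with a minimal sketch. This is not the implementation used by the cited authors; the function name `zero_crossing_rate`, the choice to skip samples that are exactly zero, and the test tone are all assumptions made for illustration.

```python
import math


def zero_crossing_rate(samples, sample_rate):
    """Return the number of sign changes per second in `samples`.

    Assumption: samples that are exactly zero are skipped, so a crossing
    is counted only when the sign flips between two non-zero samples.
    """
    if len(samples) < 2 or sample_rate <= 0:
        return 0.0
    crossings = 0
    prev_sign = 0
    for x in samples:
        sign = (x > 0) - (x < 0)  # +1, -1, or 0
        if sign == 0:
            continue
        if prev_sign != 0 and sign != prev_sign:
            crossings += 1
        prev_sign = sign
    duration = len(samples) / sample_rate  # seconds covered by the samples
    return crossings / duration


# Hypothetical example: one second of a 100 Hz sine sampled at 8000 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 100 * n / sample_rate + 0.1)
        for n in range(sample_rate)]
rate = zero_crossing_rate(tone, sample_rate)
```

For a pure tone the rate comes out at roughly twice the tone's frequency, since a sinusoid crosses zero twice per cycle. This is why the zero crossing rate serves as a rough indicator of frequency content.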
“…By converting the acoustic signal obtained from a microphone or a telephone the speech recognition process generates a set of words (Singh et al 2010; Othman and Riadh 2008). In order to extract and determine the linguistic information conveyed by a speech wave we have to employ computers or electronic circuits (Haraty and El Ariss 2007). This process is utilized for several applications like security device, household appliances, cellular phones, automated teller machines (ATM) and computers (Patel and Rao 2010). Gender classification is applied in many fields.…”
Section: Introduction (mentioning)
confidence: 99%
“…Perceptual-based evaluation of human raters is not only to simply value non-native utterances as accepted/rejected but also to analyze and locate specific errors on segmental aspects. Further, the acoustic model adaptation is combined with three speaker adaptation techniques Maximum Likelihood Linear Regression (MLLR) as proposed in (Goronzy et al, 2004; Giuliani et al, 2006; Haraty and El Ariss, 2007), Constrained MLLR (CMLLR) and Vocal Track Length Normalization (VTLN) as proposed in (Hariharan et al, 2002; Sundermann et al, 2003; Legetter and Woodland, 1995; Shen and Reynolds, 2008; Al-Haddad et al, 2009; Gales and Young, 2008) in order to eliminate interspeaker variability. Performance of the proposed acoustic model adaptation is evaluated in five measures of alignment analysis between recognition results and perceptual based evaluation: Hit, False Alarm (FA), Miss, Rejection and Hit + Rejection.…”
Section: Introduction (mentioning)
confidence: 99%