Interacting with computers by voice: automatic speech recognition and synthesis

O’Shaughnessy, D.

doi:10.1109/jproc.2003.817117

Cited by 95 publications

(55 citation statements)

References 231 publications

“…The cepstral based features, MFCC and PLP, are expectedly better due to the better following of auditory scale. Similar results are reported for other languages as well [4]. According to the slightly better achievement of the MFCC over PLP features for acoustic modeling in Croatian LVASR the use of MFCC speech feature vectors is proposed.…”

Section: Speech Feature Vectors (supporting)

confidence: 72%

“…The statistical approach uses hidden Markov models (HMM) as state of the art formalism for speech recognition. Many large vocabulary automatic speech recognition (LVASR) systems use mel-cepstral speech analysis, hidden Markov modeling of acoustic subword units, n-gram language models (LM) and n-best search of word hypothesis [1,3,4,5]. Automatic speech recognition research in languages like English, German and Japanese [6] puts its focus on recognition of spontaneous and broadcast speech.…”

Section: Introduction and Related Work (mentioning)

confidence: 99%

Croatian Large Vocabulary Automatic Speech Recognition

2011

Original scientific paper. This paper presents procedures used for the development of a Croatian large vocabulary automatic speech recognition (LVASR) system. The proposed acoustic model is based on context-dependent triphone hidden Markov models and Croatian phonetic rules. Different acoustic and language models, developed using a large collection of Croatian speech, are discussed and compared. The paper proposes the feature vectors and acoustic modeling procedures that achieve the lowest word error rates for Croatian speech. In addition, Croatian language modeling procedures are evaluated and adopted for speaker-independent spontaneous speech recognition. The experiments and results presented show that the proposed approach, which uses context-dependent acoustic modeling based on Croatian phonetic rules together with a parameter tying procedure, can be used for efficient Croatian large vocabulary speech recognition with word error rates below 5%.

Key words: Acoustic modeling, Automatic speech recognition, Context-dependent acoustic units, Language modeling

“…In HMM we mixture multi vibrate Gaussian distribution, probabilistic mean, variance and mixture weight for speech [19]. Each phoneme has different output distribution.…”

Section: Hidden Markov Model (mentioning)

confidence: 99%
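
The statement above refers to HMM states whose output distribution is a mixture of multivariate Gaussians, parameterized by means, variances, and mixture weights. A minimal sketch of how such an output log-likelihood could be evaluated for one feature vector is given below. It assumes diagonal covariances (one variance per dimension), which the quoted text does not specify, and the function names are hypothetical, not taken from the cited paper.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance (assumed here)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, weights, means, variances):
    """Log of sum_k w_k * N(x; mean_k, var_k), computed with log-sum-exp."""
    component_logs = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(component_logs)
    # Subtracting the largest term keeps the exponentials from underflowing.
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


# Illustrative two-component mixture over a 2-dimensional feature vector.
score = gmm_log_likelihood(
    x=[0.0, 0.0],
    weights=[0.5, 0.5],
    means=[[0.0, 0.0], [0.0, 0.0]],
    variances=[[1.0, 1.0], [1.0, 1.0]],
)
```

In an HMM recognizer each state would hold its own weights, means, and variances, so different phonemes end up with different output distributions.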

Speech Recognition System – A Review

S., Deshmukh

2016

IOSR

“…It takes time P to process an input of duration I. It is defined by the formula [1] as given below: RTF = P / I…”

Section: Performance Measurement Of Speech Recognition Approaches (mentioning)

confidence: 99%
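
The real-time factor quoted above is the ratio of processing time P to input duration I. A small sketch of the calculation follows; the function name and the example durations are illustrative assumptions, not values from the cited paper.

```python
def real_time_factor(processing_seconds: float, input_seconds: float) -> float:
    """Return RTF = P / I for processing time P and input duration I."""
    if input_seconds <= 0:
        raise ValueError("input duration must be positive")
    if processing_seconds < 0:
        raise ValueError("processing time must not be negative")
    return processing_seconds / input_seconds


# Hypothetical example: 3 s of processing for a 6 s utterance.
rtf = real_time_factor(3.0, 6.0)  # 0.5
```

An RTF below 1 means the recognizer runs faster than real time; an RTF above 1 means it falls behind the incoming audio.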

Speech Feature Extraction and Classification: A Comparative Review

Madan, Gupta

2014

IJCA

This paper gives a brief survey of speech recognition and presents an overview of the techniques used at the various stages of speech recognition systems. Researchers have been working in this area for many years; however, speech recognition accuracy still needs attention because of variation in context, speaker variability, and environmental conditions. Developing a speech recognition system requires several concepts to be covered: defining the different classes of speech, techniques for speech feature extraction, speech classification modeling, and measuring system performance. The main aim of this paper is to discuss and compare the different approaches used for the feature extraction and classification stages of a speech recognition system.
