2020
DOI: 10.14569/ijacsa.2020.0110455

Acoustic Modeling in Speech Recognition: A Systematic Review

Abstract: The paper presents a systematic review of acoustic modeling (AM) techniques in speech recognition (SR). Acoustic modeling establishes a relationship between acoustic information and language constructs in SR. Over the past decades, researchers have presented studies addressing specific concerns in AM. However, previous research works lack a systematic and comprehensive review of acoustic modeling issues. A systematic review is introduced to understand the acoustic modeling issues in speech recognition. This paper…

Cited by 8 publications (3 citation statements)
References 94 publications
“…In simple terms, lack of data hampers the direct application of "classical" approaches to automatic speech recognition and translation, which usually imply the use of acoustic, lexical, and grammatical (language) models. Usually, an automatic speech recognition (ASR) system (the "standard" approach) consists of an acoustic model (AM) that establishes the relationship between acoustic information and allophones of a language at issue [21], a language model (LM) necessary for building hypotheses of a recognized utterance, and a vocabulary of lexical units with phonetic transcriptions. The training of acoustic models involves utilizing a speech corpus, while the development of the language model draws upon probabilistic modeling using available target language texts (as illustrated in Figure 1).…”
Section: Low-resource Languages: Data Scarcity Challenge 21 Low-resou... (mentioning)
confidence: 99%
“…The combination of the vocal strings with verbalization can create an assortment of discourse. The discourse quality, counting unforgiving, tense, breathy, or whispery voice, can be influenced by emotion and temperament [2]. In the last decade, there has been an automated method of identifying words spoken by the human voice and converting them into readable text.…”
Section: Introduction (mentioning)
confidence: 99%
“…It comes, however, at the cost of performing inferences with huge models that require on the order of billions of arithmetic operations per second of speech and expensive searches in large graphs, such as lexicon and language models graphs. Another challenge for on-edge ASR comes from the wide diversity of ASR systems [13]. ASR systems come in one of two major flavors: Hybrid DNN-HMM systems [14], [15], [16] and End-to-End systems [11], [17], [18], [19], [20].…”
Section: Introduction (mentioning)
confidence: 99%