2019
DOI: 10.3390/sym11081018
An Overview of End-to-End Automatic Speech Recognition

Abstract: Automatic speech recognition, especially large-vocabulary continuous speech recognition, is an important problem in the field of machine learning. For a long time, the hidden Markov model (HMM)-Gaussian mixture model (GMM) framework was the mainstream approach to speech recognition. Recently, however, the HMM-deep neural network (DNN) model and end-to-end models using deep learning have achieved performance beyond HMM-GMM. Both using deep learning techniques,

Cited by 174 publications (100 citation statements)
References 50 publications
“…The second approach is end-to-end speech recognition. It differs from sequential hierarchical analysis in that it analyzes the original signal and moves directly to higher levels of analysis (for example, the word level), bypassing the lower levels [17,18].…”
Section: Methods of Syllable Recognition
confidence: 99%
“…Additionally, end-to-end models have shown recent success in applications such as speech recognition and natural language processing [125], [126], [127], since they can bypass intermediate data processing steps that are typically present in traditional ML pipelines. In the context of clinical outcome prediction models, this requires major improvements in the collection and curation of EHR data across several dimensions, especially completeness, complexity, and accuracy.…”
Section: General Learning Models
confidence: 99%
“…The attention-based model is an end-to-end encoder-decoder model. The attention mechanism eliminates the need for pre-segmented alignment of the data and can implicitly learn the soft alignment between input and output sequences, avoiding the conditional-independence assumption made by CTC [48]. The encoder in the attention-based model converts the entire speech input sequence…”
Section: Sequence-to-Sequence Models
confidence: 99%
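The soft alignment the statement above describes can be illustrated with a minimal sketch. This is not the survey's own model: it uses simple dot-product scoring (one common choice; attention-based ASR models often use additive, Bahdanau-style scoring instead), NumPy in place of a deep-learning framework, and hypothetical names (`attention_step`, toy shapes) chosen for illustration. It shows one decoder step attending over a sequence of encoder states: every input frame receives a weight, so no pre-segmented alignment of the audio is needed.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D array of scores
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_step(encoder_states, decoder_state):
    """One decoder step of dot-product attention.

    encoder_states: (T, d) hidden states, one per input frame
    decoder_state:  (d,)  current decoder hidden state
    Returns the soft-alignment weights over the T frames and
    the context vector (weighted summary of the encoder states).
    """
    scores = encoder_states @ decoder_state   # (T,) alignment energies
    weights = softmax(scores)                 # soft alignment, sums to 1
    context = weights @ encoder_states        # (d,) context vector
    return weights, context

# toy example: 4 encoder frames, hidden size 3
rng = np.random.default_rng(0)
enc = rng.standard_normal((4, 3))
dec = rng.standard_normal(3)
w, c = attention_step(enc, dec)
print(w)        # 4 non-negative weights summing to 1
print(c.shape)  # context has the encoder hidden size, (3,)
```

Because the weights are a distribution over all input frames rather than a hard segmentation, the alignment is learned jointly with the rest of the network, which is what lets attention models sidestep CTC's frame-independence assumption.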