2019
DOI: 10.3390/sym11081018
An Overview of End-to-End Automatic Speech Recognition

Abstract: Automatic speech recognition, especially large-vocabulary continuous speech recognition, is an important problem in the field of machine learning. For a long time, the hidden Markov model (HMM)-Gaussian mixture model (GMM) framework was the mainstream approach to speech recognition. Recently, however, the HMM-deep neural network (DNN) model and end-to-end models using deep learning have achieved performance beyond HMM-GMM. Both using deep learning techniques,

Cited by 174 publications (100 citation statements)
References 50 publications
“…The second approach is end-to-end speech recognition. It differs from sequential hierarchical analysis in that it analyzes the original signal and moves directly to higher levels of analysis (for example, the word level), bypassing the lower levels [17,18].…”
Section: Methods of Syllable Recognition
confidence: 99%
“…Additionally, end-to-end models have shown recent success in applications such as speech recognition and natural language processing [125], [126], [127], since they can bypass intermediate data processing steps that are typically present in traditional ML pipelines. In the context of clinical outcome prediction models, this requires major improvements in the collection and curation of EHR data across several dimensions, especially completeness, complexity, and accuracy.…”
Section: General Learning Models
confidence: 99%
“…The attention-based model is an end-to-end encoder-decoder model. The attention mechanism eliminates the need for pre-segmented alignment of the data and can implicitly learn the soft alignment between input and output sequences, avoiding the conditional-independence assumption made by CTC [48]. The encoder in the attention-based model converts the entire speech input sequence…”
Section: Sequence-to-Sequence Models
confidence: 99%
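The soft alignment the statement above describes can be illustrated with a minimal sketch. This is not the survey's own model: it uses simple dot-product scoring (one common choice; attention-based ASR models often use additive, Bahdanau-style scoring instead), NumPy in place of a deep-learning framework, and hypothetical names (`attention_step`, toy shapes) chosen for illustration. It shows one decoder step attending over a sequence of encoder states: every input frame receives a weight, so no pre-segmented alignment of the audio is needed.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D array of scores
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_step(encoder_states, decoder_state):
    """One decoder step of dot-product attention.

    encoder_states: (T, d) hidden states, one per input frame
    decoder_state:  (d,)  current decoder hidden state
    Returns the soft-alignment weights over the T frames and
    the context vector (weighted summary of the encoder states).
    """
    scores = encoder_states @ decoder_state   # (T,) alignment energies
    weights = softmax(scores)                 # soft alignment, sums to 1
    context = weights @ encoder_states        # (d,) context vector
    return weights, context

# toy example: 4 encoder frames, hidden size 3
rng = np.random.default_rng(0)
enc = rng.standard_normal((4, 3))
dec = rng.standard_normal(3)
w, c = attention_step(enc, dec)
print(w)        # 4 non-negative weights summing to 1
print(c.shape)  # context has the encoder hidden size, (3,)
```

Because the weights are a distribution over all input frames rather than a hard segmentation, the alignment is learned jointly with the rest of the network, which is what lets attention models sidestep CTC's frame-independence assumption.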