2021
DOI: 10.1016/j.asoc.2021.107847

ATCSpeechNet: A multilingual end-to-end speech recognition framework for air traffic control systems

Citations: cited by 17 publications (5 citation statements)
References: 46 publications (59 reference statements)
“…In this way, the complicated architecture is supposed to impose training burdens and lead to the gradient vanishing problem. In this work, the residual mechanism is applied to the LSTM layers to improve its trainability, and further to obtain better model convergence and final performance. In addition, since the ASR is a sequential classification task, the bidirectional mechanism is also performed on the LSTM layers to formulate the BLSTM layer, which benefits to improve the modelling accuracy from the past and future direction simultaneously.…”
Section: Residual LSTMs
Citation type: mentioning
confidence: 99%
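
The statement above combines two ideas: a skip (residual) connection around a recurrent layer, and reading the sequence in both directions. A minimal sketch of that combination follows. It is an illustration only, not the cited paper's implementation: the toy elementwise tanh recurrence stands in for an LSTM cell, the weights `w_x` and `w_h` are hypothetical scalars shared by both directions for brevity, and summing the two directions before adding the input is an assumption made so that the output keeps the input's dimension.

```python
import math


def run_rnn(xs, w_x, w_h):
    """Toy elementwise tanh recurrence (a simplified stand-in for an LSTM cell)."""
    h = [0.0] * len(xs[0])
    outputs = []
    for x in xs:
        # Each feature keeps its own hidden state: h_t = tanh(w_x * x_t + w_h * h_{t-1}).
        h = [math.tanh(w_x * xi + w_h * hi) for xi, hi in zip(x, h)]
        outputs.append(h)
    return outputs


def residual_bidirectional_layer(xs, w_x=0.5, w_h=0.3):
    """Return x_t + forward_t + backward_t for every time step t."""
    if not xs:
        return []
    forward = run_rnn(xs, w_x, w_h)
    # Run over the reversed sequence, then flip back so index t lines up with xs[t].
    backward = run_rnn(xs[::-1], w_x, w_h)[::-1]
    return [
        [xi + fi + bi for xi, fi, bi in zip(x, f, b)]
        for x, f, b in zip(xs, forward, backward)
    ]
```

With both weights set to zero the recurrent branches contribute nothing and the layer reduces to the identity, which is the property that lets gradients bypass the recurrent path in deep stacks.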
“…In the current ATC management system, the ATC is a non-automatic procedure (human-in-the-loop) and is always regarded as a potential risk for the air traffic operation [1]. Numerous studies have demonstrated that monitoring the control conversation is a promising way to obtain real-time traffic dynamics [2,3,4], which benefits to formulate a closed-loop ATC management. To this end, the automatic speech recognition (ASR) technique, with the purpose of building the bridge between the human (ATCO and pilot) and machine (ATC systems), has attracted significant attention worldwide in the ATC domain.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Considering that the end-to-end ASR systems are often the most efficient method and deliver competitive quality in recent years [12,22,24,34], a connectionist temporal classification (CTC) based model referring to Deepspeech 2 [35] is introduced to serve as the AM in this work. In general, the AM model consists of convolutional neural networks (CNN), recurrent neural network (RNN), and fully connected (FC) layers.…”
Section: The Acoustic Model
Citation type: mentioning
confidence: 99%
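
A CTC-based acoustic model such as the one described above emits one label distribution per audio frame, including a special blank label. The simplest way to turn those frames into text is greedy decoding: take the best label per frame, merge consecutive repeats, then drop blanks. The sketch below shows only that decoding step, not the CNN, RNN, or FC layers. The function name, the `BLANK = 0` index, and the three-symbol vocabulary are assumptions for illustration and do not come from the cited work.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank label


def ctc_greedy_decode(frame_scores, vocabulary):
    """Greedy CTC decoding: argmax per frame, merge repeats, remove blanks."""
    # Pick the highest-scoring label index for every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    # Collapse runs of the same label into a single occurrence.
    collapsed = [label for label, _ in groupby(best_path)]
    # Blanks only separate repeated symbols; they never reach the output.
    return "".join(vocabulary[label] for label in collapsed if label != BLANK)
```

For example, with `vocabulary = ["_", "a", "b"]`, a best path of `a a _ a b b _` decodes to `"aab"`: the blank between the second and third frames is what keeps the two `a` symbols distinct.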
“…An exploratory benchmark of several advanced ASR models trained on ATC corpus was presented in [10]. Semi-supervised Learning [11] and representation learning [12,13] approaches were also introduced to leverage abundant untranscribed speech data to improve ASR performance in the ATC domain. Furthermore, an ASR and callsign detection challenge of the ATC was held by the Airbus company in 2018 [14].…”
Section: Introduction
Citation type: mentioning
confidence: 99%