An Automatic Speaker Clustering Pipeline for the Air Traffic Communication Domain

Khalil, Driss; Prasad, Amrutha; Motlicek, Petr; Zuluaga-Gomez, Juan; Nigmatulina, Iuliia; Madikeri, Srikanth; Schuepbach, Christof

doi:10.3390/aerospace10100876

Cited by 3 publications

(1 citation statement)

References 45 publications

“…The ATCO2 platform [9] aims at collecting, pre-processing, and pseudoanonymizing ATC communications' audio databases of more than 5000 h of audio data with the objective of increasing robustness of speech recognition in the air traffic management domain. The ATCO2 corpus has also been used to detect speaker roles in voice communication, i.e., pilot or ATCO, and clustering speakers [10]. Given enough training data, automatic speech recognition and understanding systems also build the base to train ATCOs [11].…”

Section: Introduction (mentioning)

confidence: 99%

Ensuring Safety for Artificial-Intelligence-Based Automatic Speech Recognition in Air Traffic Control Environment

Pinska-Chauvin, Helmke, Dokic et al. 2023

Aerospace

This paper describes the safety assessment conducted in SESAR2020 project PJ.10-W2-96 ASR on automatic speech recognition (ASR) technology implemented for air traffic control (ATC) centers. ASR already enables the automatic recognition of aircraft callsigns and various ATC commands including command types based on controller–pilot voice communications for presentation at the controller working position. The presented safety assessment process consists of defining design requirements for ASR technology application in normal, abnormal, and degraded modes of ATC operations. A total of eight functional hazards were identified based on the analysis of four use cases. The safety assessment was supported by top-down and bottom-up modelling and analysis of the causes of hazards to derive system design requirements for the purposes of mitigating the hazards. Assessment of achieving the specified design requirements was supported by evidence generated from two real-time simulations with pre-industrial ASR prototypes in approach and en-route operational environments. The simulations, focusing especially on the safety aspects of ASR application, also validated the hypotheses that ASR reduces controllers’ workload and increases situational awareness. The missing validation element, i.e., an analysis of the safety effects of ASR in ATC, is the focus of this paper. As a result of the safety assessment activities, mitigations were derived for each hazard, demonstrating that the use of ASR does not increase safety risks and is, therefore, ready for industrialization.

Robust speech command recognition in challenging industrial environments

Bini, Carletti, Saggese et al. 2024

Computer Communications

Safety and Workload Benefits of Automatic Speech Understanding for Radar Label Updates

Helmke, Kleinert, Ohneiser et al. 2024

Journal of Air Transportation

Air traffic controllers (ATCos) quantified the benefits of automatic speech recognition and understanding (ASRU) on workload and flight safety. As a baseline procedure, ATCos manually enter all verbal clearances into the aircraft radar labels by mouse. In our proposed solution, ATCos are supported by ASRU, which is capable of delivering the required radar label updates automatically. ATCos need to visually review the ASRU-based label updates and only have to make corrections in case of misinterpretations. Overall, the amount of time required for manually inserting clearances, i.e., by selecting the correct input in the radar labels, was reduced from 12,700 s during 14 hours of simulation time down to 405 s when ATCos were supported by ASRU. Considering the additional time of mental workload for verifying ASRU output, there is still a saving of more than one-third of the time for radar label updates. This paper also considers safety aspects, i.e., how often incorrect inputs into aircraft radar labels occur with ASRU. The number of wrong or missing inputs is less than without ASRU support. This paper advances the use case that ASRU could potentially improve safety and efficiency for ATCo operations for arrivals.
