Much of aviation safety reporting data consists of structured data, e.g., digital flight data or radar data. However, safety report narratives, which come in the form of unstructured text, are indispensable for safety reporting. Structured data alone is inadequate to capture all of the details of an incident, while narratives can and do represent a myriad of details in a form that is natural for analysts to work with. However, large-scale analysis of narratives comes with many challenges: (1) it is difficult to employ enough human experts to digest the continuous flow of new incident reports; (2) authors of incident reports use many different terms to refer to the same semantic concept, which makes it more difficult to determine whether a specific concept occurs in a text; (3) authors often make spelling mistakes; and (4) authors use a wide variety of abbreviations for terms, some of which are nonstandard. These challenges can be mitigated by the intelligent use of Natural Language Processing (NLP) and deep learning techniques to automate parts of narrative processing. Specifically, we show how to use ensembles of word2vec models to automatically find semantically similar terms within safety report corpora, and how to use a combination of human expertise and these ensemble models to identify sets of similar terms with greater recall than either method alone. We also show an unsupervised method for comparing several word2vec models trained on the same data in order to estimate reasonable ranges of vector sizes for inducing individual word2vec models. This method is based on measuring inter-model agreement on common word2vec similar terms.
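The abstract does not specify the exact agreement metric, but one plausible way to measure inter-model agreement on similar terms is the mean pairwise Jaccard overlap between the top-k nearest-neighbor sets that each word2vec model returns for the same query term. The sketch below assumes each model has already produced a ranked neighbor list (e.g., via a `most_similar`-style query); the function name and the example neighbor lists are hypothetical.

```python
from itertools import combinations

def topk_agreement(neighbor_lists):
    """Mean pairwise Jaccard overlap between the top-k neighbor sets
    that several word2vec models return for one query term.

    neighbor_lists: one list of neighbor terms per model.
    Returns a value in [0, 1]; higher means the models agree more.
    """
    sets = [set(ns) for ns in neighbor_lists]
    pairs = list(combinations(sets, 2))
    if not pairs:  # fewer than two models: trivially in agreement
        return 1.0
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)

# Hypothetical top-3 neighbors of "runway" from three models trained
# with different vector sizes; each pair shares 2 of 4 distinct terms.
agreement = topk_agreement([
    ["rwy", "taxiway", "strip"],
    ["rwy", "taxiway", "apron"],
    ["rwy", "strip", "apron"],
])
print(agreement)  # → 0.5
```

Sweeping this score over models trained with a range of vector sizes would indicate where the induced neighborhoods stabilize, which is the spirit of the vector-size estimation the abstract describes.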
Controller-pilot voice communications are a critical component of the Air Traffic Control (ATC) system, but outside of the human listening and responding that occurs with each transmission, they are an underutilized source of information for automation systems in the ATC domain. Automatic speech recognition is a continuously improving technology that can be used to tap into this information source for potential system benefits in a variety of ATC applications, such as monitoring live operations for safety benefit, conducting analysis on large quantities of recorded controller-pilot speech, or enabling automated simulation pilots to facilitate training and Human-in-the-Loop (HITL) simulation experiments. This paper describes how automatic speech recognition can be used in the ATC domain, the characteristics of the automatic speech recognition process and the ATC domain that make the problem unique, and the engineering process for effectively applying automatic speech recognition to ATC systems. Some of the information transmitted by voice communications is manually entered into automation systems in particular situations (e.g., a pilot entering a clearance into a flight management system or an en route controller entering an interim altitude in the Host Computer System), but for the most part, voice communication information is not captured in a way that is immediately text searchable or analyzable without intermediary human interpretation and processing. Controller-pilot voice communication, therefore, is an underutilized source of information within the ATC system, and automatic speech recognition can help unlock this potentially valuable information source.