The 2019 FEARLESS STEPS (FS-1) Challenge is an initial step to motivate a streamlined and collaborative effort from the speech and language community towards addressing massive naturalistic audio, the first of its kind. The Fearless Steps Corpus is a collection of 19,000 hours of multi-channel recordings of spontaneous speech from over 450 speakers under multiple noise conditions. A majority of the Apollo Missions original analog data is unlabeled and has thus far motivated the development of both unsupervised and semi-supervised strategies. This edition of the challenge encourages the development of core speech and language technology systems for data with limited groundtruth/low resource availability and is intended to serve as the "First Step" towards extracting high-level information from such massive unlabeled corpora. In conjunction with the Challenge, 11,000 hours of synchronized 30-channel Apollo-11 audio data has also been released to the public by CRSS-UTDallas. We describe in this paper the Fearless Steps Corpus, Challenge Tasks, their associated baseline systems, and results. In conclusion, we also provide insights gained by the CRSS-UTDallas team during the inaugural Fearless Steps Challenge.
The Fearless Steps Initiative by UTDallas-CRSS led to the digitization, recovery, and diarization of 19,000 hours of original analog audio data, as well as the development of algorithms to extract meaningful information from this multi-channel naturalistic data resource. The 2020 FEARLESS STEPS (FS-2) Challenge is the second annual challenge held for the Speech and Language Technology community to motivate supervised learning algorithm development for multi-party and multi-stream naturalistic audio. In this paper, we present an overview of the challenge sub-tasks, data, performance metrics, and lessons learned from Phase-2 of the Fearless Steps Challenge (FS-2). We present advancements made in FS-2 through extensive community outreach and feedback. We describe innovations in the challenge corpus development, and present revised baseline results. We finally discuss the challenge outcome and general trends in system development across both phases (Phase FS-1 Unsupervised, and Phase FS-2 Supervised) of the challenge, and its continuation into multi-channel challenge tasks for the upcoming Fearless Steps Challenge Phase-3.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.