IberSPEECH 2022
DOI: 10.21437/iberspeech.2022-56

BCN2BRNO: ASR System Fusion for Albayzin 2022 Speech to Text Challenge

Cited by 2 publications (2 citation statements)
References 10 publications
“…This submission leveraged the output of some of the ASR systems developed by the BCN2BRNO team for the Speech to Text Challenge [21]. It consisted of a primary system based on a fusion of three ASR systems (two of them based on an encoder-decoder transformer architecture: XLS-R Conformer and Whisper large model, and the third one based on an RNN transducer architecture) and a contrastive system based on the best single ASR system (XLS-R Conformer).…”
Section: Alignment and Validation Of Speech Signals With Partial And ...
mentioning
confidence: 99%
“…Speech to Text Challenge. A total of 13 different systems from four participating teams were submitted. The most relevant characteristics of each system are presented in terms of the recognition engine, and audio and text data used for training acoustic and language models. • BCN2BRNO [21]. BUT Speech@FIT research group (Brno University of Technology, Czech Republic) and Telefónica Research (Spain). BCN2BRNO submitted a primary system based on a word-level ROVER fusion of five individual models.…”
mentioning
confidence: 99%