2020
DOI: 10.48550/arxiv.2002.00122
Preprint

Multi-channel Acoustic Modeling using Mixed Bitrate OPUS Compression

Abstract: Recent literature has shown that a learned front end with multi-channel audio input can outperform traditional beamforming algorithms for automatic speech recognition (ASR). In this paper, we present our study on multi-channel acoustic modeling using OPUS compression with different bitrates for the different channels. We analyze the degradation in word error rate (WER) as a function of the audio encoding bitrate and show that the WER degrades by 12.6% relative at 16 kbps compared to uncompressed audio. We …
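The abstract reports its result as a relative WER degradation. As a minimal sketch of that arithmetic, the snippet below computes the relative increase of one WER over a baseline. The function name `relative_wer_degradation` and both WER values are hypothetical, chosen only to illustrate the calculation; the absolute WERs from the paper are not given in the text above.

```python
def relative_wer_degradation(baseline_wer: float, degraded_wer: float) -> float:
    """Return the relative WER increase over the baseline, in percent."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (degraded_wer - baseline_wer) / baseline_wer


# Hypothetical WERs (NOT taken from the paper), used only to show the arithmetic.
baseline = 10.0     # assumed % WER on uncompressed audio
compressed = 11.26  # assumed % WER on audio encoded at a lower bitrate

degradation = relative_wer_degradation(baseline, compressed)
print(f"{degradation:.1f}% relative WER degradation")
```

A relative figure like this depends on the baseline: the same absolute WER gap yields a larger relative degradation when the baseline WER is lower.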

Cited by 1 publication (3 citation statements)
References 13 publications

“…While Opus has been found to be effective for single- and multi-channel ASR applications [11], it has been optimized for subjective quality, not for ASR accuracy. Moreover, its channel coupling has been optimized for human spatial perception, where the inter-aural phase difference (IPD) [20] is only noticeable at lower frequencies.…”
Section: Opus Compression and Proposed Optimization
type: mentioning
confidence: 99%
“…We here focus on Opus compression [9] due to its widespread use in voice over IP (VoIP) and ASR applications [10]. While Opus has been found to be effective for single- and multi-channel ASR applications [11], it is in some sense mismatched to the task at hand. First, like all lossy speech and audio compression methods, Opus has been optimized for subjective quality, not for ASR or subsequent spatial filtering.…”
Section: Introduction
type: mentioning
confidence: 99%