2021
DOI: 10.1109/access.2021.3112165

Marginal Effects of Language and Individual Raters on Speech Quality Models

Abstract: Speech quality is often measured via subjective testing, or with objective estimators of mean opinion score (MOS) such as ViSQOL or POLQA. Typical MOS-estimation frameworks use signal-level features but do not use language features that have been shown to have an effect on opinion scores. If there is a conditional dependence between score and language given these signal features, introducing language and rater predictors should provide a marginal improvement in predictions. The proposed method uses Bayesian mo…
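The abstract's reasoning (if opinion score depends on language and rater given the signal features, then language and rater predictors carry marginal predictive value) can be illustrated with a hierarchical regression. Below is a minimal sketch in PyMC with partially pooled language and rater offsets added on top of a single signal feature. The synthetic data, feature names, priors, and dimensions are illustrative assumptions, not the paper's model or code.

```python
# Sketch: a signal-feature MOS regressor augmented with hierarchical
# language and per-rater effects. All names and priors are assumptions.
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)

# Toy data: one signal-level feature per clip (e.g., an objective score
# such as ViSQOL), plus language and rater indices for each rating.
n_obs, n_langs, n_raters = 500, 4, 50
visqol = rng.normal(3.5, 0.7, n_obs)      # stand-in signal feature
lang = rng.integers(0, n_langs, n_obs)    # language of each clip
rater = rng.integers(0, n_raters, n_obs)  # listener who gave the rating
mos = rng.normal(visqol, 0.5)             # placeholder opinion scores

with pm.Model() as model:
    # Signal-feature regression: the "typical" MOS-estimation part.
    intercept = pm.Normal("intercept", mu=0.0, sigma=2.0)
    slope = pm.Normal("slope", mu=1.0, sigma=1.0)

    # Marginal predictors: partially pooled language and rater offsets.
    sigma_lang = pm.HalfNormal("sigma_lang", sigma=0.5)
    sigma_rater = pm.HalfNormal("sigma_rater", sigma=0.5)
    lang_offset = pm.Normal("lang_offset", mu=0.0,
                            sigma=sigma_lang, shape=n_langs)
    rater_offset = pm.Normal("rater_offset", mu=0.0,
                             sigma=sigma_rater, shape=n_raters)

    # Predicted MOS combines the signal model with both offset terms.
    mu = intercept + slope * visqol + lang_offset[lang] + rater_offset[rater]
    sigma = pm.HalfNormal("sigma", sigma=1.0)
    pm.Normal("obs", mu=mu, sigma=sigma, observed=mos)

    trace = pm.sample(1000, tune=1000, target_accept=0.9)
```

Comparing this model's held-out predictive accuracy against a signal-only baseline (the same model with the two offset terms removed) mirrors the marginal-improvement question the abstract poses.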

Cited by 4 publications (3 citation statements)
References 19 publications
“…The tool extends possible ways to validate data consistency of responses obtained during a MQA subjective experiment [8]. Importantly, our work was noticed by practitioners in the MQA field and referred to in [9], [10], and [11].…”
Section: Introduction (mentioning; confidence: 75%)
“…In parallel to the deep learning improvements that are mostly driven by extracting more useful information from the waveform, researchers have made progress in obtaining a better understanding of the biases and factors of listening tests that are independent of the speech signal being rated [15]. For example, research has found that a significant amount of bias may be attributable to properties of the listeners, including their language and culture, as well as their individual tendencies to rate high or low [16,17]. LDNet [18], a baseline model for this challenge, considers rater metadata.…”
Section: Introduction (mentioning; confidence: 99%)
“…Despite their promising results, data-driven models are exposed to bias depending on the type of data used to train them. Collecting data that does not bias the model is a challenge (specifically neutral to: speaker voice/accent, language, degradation) [9,10,11].…”
Section: Introduction (mentioning; confidence: 99%)