ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020
DOI: 10.1109/icassp40776.2020.9053366

An Attention Enhanced Multi-Task Model for Objective Speech Assessment in Real-World Environments

Cited by 34 publications (29 citation statements)
References 25 publications
“…However, it was developed for narrow-band applications and works well on limited impairment types. Recently, Deep Neural Networks (DNN) based approaches have been proposed to estimate the speech quality scores [6,7,8]. Some of these learning-based approaches use other objective metrics as the ground truth to train their speech quality predictor.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…Note that the comparison approaches: DNN, Quality-Net, NISQA and pBi-LSTM+Att are all trained and evaluated for a single target each time, while our approach assesses the speech from different perspectives at the same time. Therefore, we further compare our approach with our prior work (i.e., AMSA [23]) that is capable of estimating multiple objective targets. Results still demonstrate the superiority of our system in all these objective targets, where joint subjective and objective assessment improves performance.…”
Section: Experimental Results and Analysis; citation type: mentioning
confidence: 99%
“…The models are trained with 100 epochs using Adam optimizer and all models are trained and evaluated separately on COSINE and VOiCES datasets. We include 5 non-intrusive data-driven models as comparison approaches, including a multi-task model for objective score estimation (AMSA) [23], a deep neural network (DNN) model [18], Quality-Net [14], NISQA [13] and pBi-LSTM+Att [22]. Note that all these data-driven models except AMSA are separately trained for each target since they are not designed for multi-task estimation.…”
Section: Network Architecture; citation type: mentioning
confidence: 99%
“…Multi-task learning (MTL) [20] is an approach in deep learning when the model performs at least two tasks. MTL has been successfully applied in various fields [20] including speech (e.g., speech recognition [21], speech enhancement [22], or objective speech assessment in real-world environments by generating several objective intelligibility and quality scores [23]).…”
Section: Non-intrusive Multi-task Transfer Learning-based Speech Intelligibility Model; citation type: mentioning
confidence: 99%