2019
DOI: 10.1016/j.specom.2019.10.006

Deep-learning-based audio-visual speech enhancement in presence of Lombard effect

Keywords: Lombard effect, audio-visual speech enhancement, deep learning, speech quality, speech intelligibility

Abstract: When speaking in presence of background noise, humans reflexively change their way of speaking in order to improve the intelligibility of their speech. This reflex is known as Lombard effect. Collecting speech in Lombard conditions is usually hard and costly. For this reason, speech enhancement systems are generally trained and evaluated on speech recorded in quiet to which noise is artif…

Citations: cited by 32 publications (55 citation statements)
References: 250 publications
“…As seen from this discussion, not only the experimental setup was different compared to our approach, but the analysis also differs from that performed by us; thus a direct comparison is not possible. Even the observation of what SNR value the model does not work at seems to be uncommon; in the case of our research, the models stop working at a threshold of -5 dB, in the work of Michelsanti et al [57], it refers to 5 dB.…”
Section: B Results Analysis (citation type: mentioning)
confidence: 51%
“…The ESTOI values changed dramatically from 0.442 for -20 dB to -5 dB SNR, up to 0.927 for a SNR range between 10 and 30 dB. So, the relative performance of the systems at SNR ≤ 5 dB is similar to that observed for the systems trained on a narrow SNR range [57].…”
Section: B Results Analysis (citation type: mentioning)
confidence: 51%