ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
DOI: 10.1109/icassp.2019.8682713

Effects of Lombard Reflex on the Performance of Deep-learning-based Audio-visual Speech Enhancement Systems

Abstract: Humans tend to change their way of speaking when they are immersed in a noisy environment, a reflex known as the Lombard effect. Current speech enhancement systems based on deep learning do not usually take this change in speaking style into account, because they are trained with neutral (non-Lombard) speech utterances recorded under quiet conditions to which noise is artificially added. In this paper, we investigate the effects that the Lombard reflex has on the performance of audio-visual speech enhancement …

Citations: cited by 6 publications (6 citation statements)
References: 31 publications
“…With the objective of providing a more extensive analysis of the impact of Lombard effect on deep-learning-based SE systems, the present work extends a preliminary study (Michelsanti et al, 2019a), providing the following novel contributions. First, new experiments are conducted, where deep-learning-based SE systems trained with Lombard or non-Lombard speech are evaluated on Lombard speech using a cross-validation setting to avoid that a potential intraspeaker variability of the adopted dataset leads to biased conclusions.…”
Section: Introduction (mentioning)
confidence: 86%
“…In this study, we train and evaluate systems that perform spectral SE using deep learning, as illustrated in Figure 1. The processing pipeline is inspired by Gabbay et al (2018) and the same as the one used in (Michelsanti et al, 2019a).…”
Section: Methods (mentioning)
confidence: 99%
“…The same conclusion was also reached when an audio-visual speech recognition system was used. Finally, it has recently been shown that the mismatch between plain and Lombard speech can also affect the performance of audio-visual speech enhancement models [17].…”
Section: Introduction (mentioning)
confidence: 99%