ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020
DOI: 10.1109/icassp40776.2020.9054572

Deep CASA for Talker-Independent Monaural Speech Separation

Cited by 17 publications (15 citation statements)
References 24 publications
“…In addition, a speech enhancement network is used on top of the separation model to further reduce WER. Our focus in this study is on the separation model, and we can expect further improvement by introducing speech enhancement in future work (see [25]).…”
Section: Evaluation Results (mentioning)
confidence: 99%
“…Although there are considerably low overall performances throughout the evaluation, it is important to remember that these techniques were evaluated when running in an online manner. Many of the state-of-the-art source separation techniques (mainly based on deep learning) do not run in such a way, opting to be fed full audio recordings [33, 35]. This lends to higher performances since information in the future of the current window can also be utilized to obtain a good separation performance.…”
Section: Discussion (mentioning)
confidence: 99%
“…This required a long STFT time window. This requirement increased the minimum delay of the system, which limited its applicability in real-time and low-latency applications, therefore, more and more research has begun to turn to time-domain methods [12, 23, 24, 26, 27, 29].…”
Section: Methods (mentioning)
confidence: 99%
“…With the development of big data and the improvement of computing power, deep learning achieves great success in time series signal processing such as speech recognition, speech separation [12, 15, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38], and communication signal modulation recognition [39]. These tasks demonstrate the powerful feature extraction and timing signal processing capabilities of deep learning.…”
Section: Related Work (mentioning)
confidence: 99%