Cătălin Zorilă scite author profile

Cătălin Zorilă

5 Publications

51 Citation Statements Received

99 Citation Statements Given

Affiliations

Toshiba (United Kingdom), University of Sheffield

Publications

On End-to-end Multi-channel Time Domain Speech Separation in Reverberant Environments

Zhang, Zorilă, Doddipatla, et al. 2020

This paper introduces a new method for multi-channel time domain speech separation in reverberant environments. A fully-convolutional neural network structure has been used to directly separate speech from multiple microphone recordings, with no need for conventional spatial feature extraction. To reduce the influence of reverberation on spatial feature extraction, a dereverberation pre-processing method has been applied to further improve the separation performance. A spatialized version of the wsj0-2mix dataset has been simulated to evaluate the proposed system. Both source separation and speech recognition performance of the separated signals have been evaluated objectively. Experiments show that the proposed fully-convolutional network improves the source separation metric and the word error rate (WER) by more than 13% and 50% relative, respectively, over a reference system with conventional features. Applying dereverberation as pre-processing to the proposed system can further reduce the WER by 29% relative using an acoustic model trained on clean and reverberated data.

An Investigation into the Effectiveness of Enhancement in ASR Training and Test for CHiME-5 Dinner Party Transcription

Zorilă, Boeddeker, Doddipatla, et al. 2019

Despite the strong modeling power of neural network acoustic models, speech enhancement has been shown to deliver additional word error rate improvements if multi-channel data is available. However, there has been a longstanding debate over whether enhancement should also be carried out on the ASR training data. In an extensive experimental evaluation on the acoustically very challenging CHiME-5 dinner party data, we show that: (i) cleaning up the training data can lead to substantial error rate reductions, and (ii) enhancement in training is advisable as long as enhancement in test is at least as strong as in training. This approach stands in contrast to, and delivers larger gains than, the common strategy reported in the literature of augmenting the training database with additional artificially degraded speech. Together with an acoustic model topology consisting of initial CNN layers followed by factorized TDNN layers, we achieve a new single-system state-of-the-art result on the CHiME-5 data, with 41.6% and 43.2% WER on the DEV and EVAL test sets, respectively. This is an 8% relative improvement compared to the best word error rate published so far for a speech recognizer without system combination.

An Investigation into the Effectiveness of Enhancement in ASR Training and Test for CHiME-5 Dinner Party Transcription

Zorilă, Boeddeker, Doddipatla, et al. 2019

Preprint

Transformer-based Streaming ASR with Cumulative Attention

Li, Zhang, Zorilă, et al. 2022

Preprint

Monaural Source Separation: From Anechoic To Reverberant Environments

Cord-Landwehr, Böddeker, Neumann, et al. 2022
