Using visual speech information and perceptually motivated loss functions for binary mask estimation

Websdale, Danny; Milner, Ben

doi:10.21437/avsp.2017-9

Cited by 2 publications

(2 citation statements)

References 26 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Like-wise, an alternative CASA method obtains speaker consistent T-F sections and engages speaker prototypes along omitted records methods to group them into communication streams [14]. Websdale and Milner [15] employed unverified huddling to assemble speech constituencies into dual voice assemblies through the extension of the percentage of mid and interior collection gaps. Lekshmi and Sathidevi [2] postulated non-learning-centered speech isolation methods for a specific-channel speech estrangement exploiting Short-Time Fourier Transform (STFT) [3].…”

Section: Related Workmentioning

confidence: 99%

See 1 more Smart Citation

Speech Segregation in Background Noise Based on Deep Learning

et al. 2020

View full text Add to dashboard Cite

Section: Related Workmentioning

confidence: 99%

“…This method was referred to as a deep Boltzmann machine (DBM). Websdale and Milner [15] suggested a technique centered on Recurrent Neural Network (RNN). Using the noisy acoustic sample, RNN can be employed for speech separation.…”

Section: Related Workmentioning

confidence: 99%

Speech Segregation in Background Noise Based on Deep Learning

et al. 2020

View full text Add to dashboard Cite

A hybrid technique for speech segregation and classification using a sophisticated deep neural network

et al. 2018

View full text Add to dashboard Cite

Recent research on speech segregation and music fingerprinting has led to improvements in speech segregation and music identification algorithms. Speech and music segregation generally involves the identification of music followed by speech segregation. However, music segregation becomes a challenging task in the presence of noise. This paper proposes a novel method of speech segregation for unlabelled stationary noisy audio signals using the deep belief network (DBN) model. The proposed method successfully segregates a music signal from noisy audio streams. A recurrent neural network (RNN)-based hidden layer segregation model is applied to remove stationary noise. Dictionary-based fisher algorithms are employed for speech classification. The proposed method is tested on three datasets (TIMIT, MIR-1K, and MusicBrainz), and the results indicate the robustness of proposed method for speech segregation. The qualitative and quantitative analysis carried out on three datasets demonstrate the efficiency of the proposed method compared to the state-of-the-art speech segregation and classification-based methods.

show abstract

Using visual speech information and perceptually motivated loss functions for binary mask estimation

Cited by 2 publications

References 26 publications

Speech Segregation in Background Noise Based on Deep Learning

Speech Segregation in Background Noise Based on Deep Learning

A hybrid technique for speech segregation and classification using a sophisticated deep neural network

Contact Info

Product

Resources

About