Dual-channel speech enhancement based on traditional beamforming struggles to suppress noise effectively. In recent years, replacing beamforming with a neural network that learns spectral characteristics has shown promise. This paper proposes a neural-network adaptive-beamforming, end-to-end dual-channel model for the speech enhancement task. First, an LSTM layer processes the raw speech waveforms directly to estimate time-domain beamforming filter coefficients for each channel, which are convolved with the input speech and summed. Second, we modify the fully-convolutional time-domain audio separation network (Conv-TasNet) into a network suited to speech enhancement, called Denoising-TasNet, to further enhance the beamformer output. Experimental results show that the proposed method outperforms the convolutional recurrent network (CRN) model and several popular noise-reduction methods.
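The filter-and-sum operation described above (convolving each channel with its estimated time-domain filter and summing) can be sketched as follows. This is a minimal NumPy illustration with hypothetical shapes, not the paper's implementation; in the proposed model the filter taps would be produced by the LSTM from the input waveforms rather than fixed as here.

```python
import numpy as np

def filter_and_sum(signals, filters):
    """Time-domain filter-and-sum beamforming.

    signals: (C, T) array of C channel waveforms
    filters: (C, K) array of per-channel filter taps
    returns: (T,) beamformed waveform ('same'-length convolution)
    """
    return sum(np.convolve(x, h, mode="same") for x, h in zip(signals, filters))

# Toy usage: two channels, 16 taps per filter (hypothetical sizes).
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 1000))   # stand-in for the dual-channel input
h = rng.standard_normal((2, 16)) / 16  # stand-in for LSTM-estimated taps
y = filter_and_sum(x, h)
print(y.shape)  # (1000,)
```

In the full model, this single-channel output would then be passed to the Denoising-TasNet stage for further enhancement.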