2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2015.7178801

A deep neural network approach to speech bandwidth expansion

Abstract: We propose a deep neural network (DNN) approach to speech bandwidth expansion (BWE) by estimating the spectral mapping function from narrowband (4 kHz in bandwidth) to wideband (8 kHz in bandwidth). Log-spectrum power is used as the input and output features to perform the required nonlinear transformation, and DNNs are trained to realize this high-dimensional mapping function. When evaluating the proposed approach on a large-scale 10-hour test set, we found that the DNN-expanded speech signals give excellent …
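Read literally, the mapping described in the abstract is a frame-wise regression from narrowband log-power spectra to wideband log-power spectra. The sketch below illustrates that setup in PyTorch; the layer widths, spectrum dimensions, activation choices, and training details are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of the narrowband-to-wideband log-power-spectrum mapping
# described in the abstract. Layer sizes, frame dimensions, and the training
# loop are illustrative assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn

NB_DIM = 129   # log-power spectrum bins of the 4 kHz narrowband frame (assumed)
WB_DIM = 257   # log-power spectrum bins of the 8 kHz wideband frame (assumed)

class BWEMapper(nn.Module):
    """Feed-forward DNN mapping narrowband log-spectra to wideband log-spectra."""
    def __init__(self, hidden=2048):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NB_DIM, hidden), nn.Sigmoid(),
            nn.Linear(hidden, hidden), nn.Sigmoid(),
            nn.Linear(hidden, WB_DIM),          # linear output for regression
        )

    def forward(self, x):
        return self.net(x)

model = BWEMapper()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

# One illustrative training step on random stand-in features.
nb_frames = torch.randn(32, NB_DIM)   # narrowband log-power spectra (batch of 32)
wb_frames = torch.randn(32, WB_DIM)   # parallel wideband targets
optimizer.zero_grad()
loss = loss_fn(model(nb_frames), wb_frames)
loss.backward()
optimizer.step()
```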

Cited by 108 publications (74 citation statements); References 23 publications.

Citation statements (ordered by relevance):
“…Motivated by the success of GSNs and SPNs on the task of image completion [9], [21], we use both models to complete the high frequency parts of log-spectrograms, lost due to the telephone bandpass filter. In a recent work, deep neural networks (DNNs) have been proposed for ABE [36]. First, unsupervised pre-training of RBMs is performed using the log-spectrum of the narrow-band signal as input.…”
Section: B. Artificial Bandwidth Extension
confidence: 99%
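The pre-training step mentioned in this excerpt can be sketched as a Gaussian-Bernoulli RBM trained with one-step contrastive divergence (CD-1) on standardized narrowband log-spectrum frames. The layer sizes, learning rate, and unit-variance visible units below are assumptions for illustration, not the cited paper's settings.

```python
# Sketch of unsupervised Gaussian-Bernoulli RBM pre-training with CD-1 on
# standardized narrowband log-spectrum frames. Dimensions, learning rate, and
# the unit-variance assumption are illustrative, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_visible, n_hidden, lr = 129, 512, 1e-3          # assumed sizes
W  = 0.01 * rng.standard_normal((n_visible, n_hidden))
bv = np.zeros(n_visible)
bh = np.zeros(n_hidden)

def cd1_update(v0):
    """One CD-1 update on a batch of standardized log-spectrum frames."""
    global W, bv, bh
    h0_prob = sigmoid(v0 @ W + bh)
    h0_samp = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    v1 = h0_samp @ W.T + bv                        # Gaussian visibles: use the mean
    h1_prob = sigmoid(v1 @ W + bh)
    n = v0.shape[0]
    W  += lr * (v0.T @ h0_prob - v1.T @ h1_prob) / n
    bv += lr * (v0 - v1).mean(axis=0)
    bh += lr * (h0_prob - h1_prob).mean(axis=0)

# Illustrative call on random stand-in data (zero-mean, unit-variance frames).
cd1_update(rng.standard_normal((64, n_visible)))
```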
“…We use popular generative models from representation learning including Gauss Bernoulli restricted Boltzmann machines (GBRBMs) [13], conditional Gauss Bernoulli restricted Boltzmann machines (CGBRBMs) [32], higher order contractive autoencoders (HCAEs) [16], SPNs, and GSNs. Furthermore, a rectifier MLP is applied to both tasks to facilitate a comparison of the results in this work to [33], [34], [35], [36].…”
Section: Introduction
confidence: 99%
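The rectifier MLP named in this excerpt as a point of comparison can be written down in a few lines; the layer widths and the output dimension (the missing high-band bins) below are assumed for illustration only.

```python
# Minimal definition of a rectifier (ReLU) MLP for the same regression task:
# predicting the missing high-band log-spectrum bins from the narrowband bins.
# Layer widths and dimensions are assumptions, not the cited configuration.
import torch.nn as nn

rectifier_mlp = nn.Sequential(
    nn.Linear(129, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
    nn.Linear(1024, 128),        # assumed number of missing high-band bins
)
```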
“…It also avoids a brute-force approach that always selects the same number of frames for each microphone, which may eventually lead to an excessively large DNN input size as the number of microphones increases and thus cause dramatic performance deterioration [28]. The proposed approach has a low computational complexity, since only a single DNN is used.…”
Section: DNN Spatial
confidence: 99%
“…In order to be compatible with existing narrow-band speech signals and recover the lost high-frequency content, high-frequency bandwidth extension (BWE) is applied to extend narrow-band speech signals to wide band. The method uses features of the low-frequency signal to predict those of the high-frequency signal, based on the statistical correlation between the high- and low-frequency bands, so as to reconstruct the lost high-frequency signals [2][3][4][5][6]. Because it needs no side information, it is called ‘blind’ bandwidth extension.…”
Section: Introduction
confidence: 99%
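As a rough illustration of the "blind" reconstruction described in this excerpt, the sketch below splices a predicted high-band magnitude spectrum onto the original low-band STFT and inverts back to a waveform. The predicted magnitudes are random stand-ins for a trained DNN's output, and the frame size, band split, and mirrored-phase choice for the high band are assumptions, not details taken from the cited works.

```python
# Sketch of blind BWE reconstruction: keep the original low-band STFT, fill
# the high band with predicted magnitudes, and invert to a waveform.
# Predicted magnitudes below are random stand-ins for a DNN's output.
import numpy as np
from scipy.signal import stft, istft

fs, nper = 16000, 512
x_upsampled = np.random.randn(fs)       # stand-in: narrowband speech upsampled to 16 kHz

f, t, Z = stft(x_upsampled, fs=fs, nperseg=nper)
n_low = Z.shape[0] // 2                 # bins up to ~4 kHz (assumed band split)

# Stand-in for the DNN-predicted high-band magnitude spectrum.
predicted_mag = np.abs(np.random.randn(Z.shape[0] - n_low, Z.shape[1]))
# Reuse a mirrored copy of the low-band phase for the high band (assumption).
high_phase = np.flipud(np.angle(Z[:Z.shape[0] - n_low]))

Z_wb = Z.copy()
Z_wb[n_low:] = predicted_mag * np.exp(1j * high_phase)

_, x_wideband = istft(Z_wb, fs=fs, nperseg=nper)   # reconstructed wideband waveform
```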