A Simple Cepstral Domain DNN Approach to Artificial Speech Bandwidth Extension

Abel, Johannes; Strake, Maximilian; Fingscheidt, Tim

doi:10.1109/icassp.2018.8462362

Cited by 11 publications

(7 citation statements)

References 15 publications

“…In this work, a segmental signal‐to‐noise ratio (segSNR) [57], log spectral distance (LSD) [44], NB mean opinion score listening quality objective (MOS‐LQO) [57, 58], and WB MOS‐LQO [59, 60] as the objective measures are taken for examining the quality of artificially extended speech signals. Next, we convert the IIR filter $K_{opt}$ into an approximate FIR filter by using the Taylor series truncation method.…”

Section: Experiments Analysis and Results (mentioning)

confidence: 99%
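
Two of the objective measures named in the statement above, segSNR and LSD, can be illustrated with a minimal sketch. The definitions used here (non-overlapping frames, dB-scaled log power spectra) are common textbook forms and are assumptions, not the exact configurations of [44] or [57]; the function names, the default frame length, and the `eps` regulariser are hypothetical.

```python
import cmath
import math


def dft_magnitudes(frame):
    """One-sided DFT magnitudes of a real frame (naive O(n^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def log_spectral_distance(ref_frame, est_frame, eps=1e-12):
    """RMS difference, in dB, between the log power spectra of two frames."""
    ref = dft_magnitudes(ref_frame)
    est = dft_magnitudes(est_frame)
    diffs = [10 * math.log10((r * r + eps) / (e * e + eps)) for r, e in zip(ref, est)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def segmental_snr(ref, est, frame_len=4, eps=1e-12):
    """Mean per-frame SNR in dB over non-overlapping frames (assumed framing)."""
    snrs = []
    for start in range(0, len(ref) - frame_len + 1, frame_len):
        r = ref[start:start + frame_len]
        e = est[start:start + frame_len]
        signal = sum(x * x for x in r)
        noise = sum((x - y) ** 2 for x, y in zip(r, e))
        snrs.append(10 * math.log10((signal + eps) / (noise + eps)))
    if not snrs:
        raise ValueError("signal is shorter than one frame")
    return sum(snrs) / len(snrs)
```

In practice the LSD is usually averaged over all frames of an utterance and, for bandwidth extension, is often restricted to the extended band; both choices are left out of this sketch.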

“…Here, the proposed approach is compared with other existing approaches by maintaining the same experimental conditions such as LPF, HPF, the dimension of HB feature, DNN architecture (seven hidden layers and 256 neurons in each hidden layer), data set, and NB signal processing. Two recently reported current works such as the modulation technique [14] with a slight modification and a cepstral domain approach [44] are included for comparison. The gain for the modulation technique is calculated by following [15] and the cepstrum feature is used for representing the NB information as well as the HB spectral envelope information.…”

Section: Experiments Analysis and Results (mentioning)

confidence: 99%

“…The gain for the modulation technique is calculated by following [15] and the cepstrum feature is used for representing the NB information as well as the HB spectral envelope information. The NB feature and HB feature for the cepstral domain approach contain the NB magnitude spectrum and cepstral coefficients representing the HB magnitude spectrum [44], respectively. Objective measures are arranged in Table 4 for the proposed approach and the existing methods using the same DNN model.…”

Section: Experiments Analysis and Results (mentioning)

confidence: 99%
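
The statement above describes cepstral coefficients as a compact representation of a magnitude spectrum. A minimal sketch of that idea is the real cepstrum, the inverse DFT of the log magnitude spectrum, truncated to its first few coefficients. This is a generic definition given as an assumption; it is not claimed to match the feature extraction of [44], and `real_cepstrum`, `n_coeffs`, and `eps` are hypothetical names.

```python
import cmath
import math


def real_cepstrum(frame, n_coeffs, eps=1e-12):
    """First n_coeffs real-cepstrum coefficients of a real frame.

    Computed as the inverse DFT of the log magnitude spectrum,
    using a naive O(n^2) DFT to stay dependency-free.
    """
    n = len(frame)
    log_mag = []
    for k in range(n):
        bin_k = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        log_mag.append(math.log(abs(bin_k) + eps))
    ceps = []
    for q in range(n_coeffs):
        c = sum(lm * cmath.exp(2j * math.pi * k * q / n) for k, lm in enumerate(log_mag)) / n
        ceps.append(c.real)
    return ceps
```

Keeping only the low-order coefficients retains the smooth spectral envelope and discards fine structure, which is why a short cepstral vector can serve as a regression target for an envelope estimator.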

“…The current approach directly obtains an IIR synthesis filter, which contains needed HB signal information. Moreover, the NB signal and the estimated HB signal are added using the discrete Fourier transform (DFT) addition and gain adjustment techniques with slight modification in the current approach [14, 15, 27, 44]. The benefit of utilising the DFT addition is in removal of the leaked information in these signals from the synthesis filter and LPF.…”

Section: Introduction (mentioning)

confidence: 99%

High‐band feature extraction for artificial bandwidth extension using deep neural network and H∞ optimisation

Gupta

Shekhawat

2020

IET signal process.

This work aims to enhance the quality of narrowband (0–4 kHz) voice signal in terms of frequency components, i.e. missing high‐frequency components in a range of 4–8 kHz. The proposed artificial bandwidth extension framework uses the H∞ optimisation. In this context, a signal model is used to get a better representation of wideband (0–8 kHz) information of a signal. The H∞ optimisation is used to obtain the synthesis filter for a given signal model, which is used to synthesise the high‐band (4–8 kHz) signal. The discrete Fourier transform addition is performed to add the narrowband signal and estimated high‐band signal for removing the leaked information from the synthesis filter and non‐ideal low pass filter. Gain adjustment is performed on the estimated high‐band signal to make its energy equal to the true high‐band signal. Non‐stationary characteristics of speech signals generate an assorted variety in synthesis filters and corresponding gain. For this, a deep neural network (DNN) is used to estimate the synthesis filter and gain by using the given narrowband information. The authors analyse the performances of the DNN model on two data sets. Objective and subjective analyses are carried out on these data sets.
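
The gain adjustment and DFT addition steps described in this abstract can be sketched as follows. This is one plausible reading under stated assumptions: the gain is taken as the square root of the energy ratio, and the DFT addition is taken to keep narrowband bins below a cutoff bin and scaled high-band bins from the cutoff upward, which discards content leaked across the band edge. The function names and the `cutoff_bin` parameter are hypothetical, and the cited paper's exact procedure may differ.

```python
import math


def energy_matching_gain(hb_est, hb_true, eps=1e-12):
    """Gain that scales the estimated HB frame to the energy of the true HB frame."""
    e_est = sum(x * x for x in hb_est)
    e_true = sum(x * x for x in hb_true)
    return math.sqrt(e_true / (e_est + eps))


def dft_addition(nb_bins, hb_bins, cutoff_bin, gain=1.0):
    """Combine two one-sided spectra of equal length.

    Bins below cutoff_bin come from the NB spectrum; bins from cutoff_bin
    upward come from the gain-scaled HB spectrum (assumed band split).
    """
    if len(nb_bins) != len(hb_bins):
        raise ValueError("spectra must have the same number of bins")
    return [
        nb_bins[k] if k < cutoff_bin else gain * hb_bins[k]
        for k in range(len(nb_bins))
    ]
```

The true high-band signal is only available when preparing training targets; at run time, per the abstract, the gain is estimated by the DNN from the narrowband information.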

“…Furthermore, DNNs have been employed for estimating the UB spectral envelope directly (regression) [25]-[28]. Opposed to the source-filter model, UB spectral magnitudes and UB phases can be estimated right away using sum-product networks (SPMs) [29], DNNs [30], [31], or recurrent neural networks (RNNs) [32], which can then be transformed back to the time domain by an overlap-add (OLA) structure. In several studies, an increased speech quality when using ABE solutions was shown [18], [33].…”

Section: Introduction (mentioning)

confidence: 99%

Sinusoidal-Based Lowband Synthesis for Artificial Speech Bandwidth Extension

Abel

Fingscheidt

2019

IEEE/ACM Trans. Audio Speech Lang. Process.

Self Cite

Conventional narrowband (NB) telephony suffers from limited acoustic bandwidth at the receiver side, leading to degraded speech quality and intelligibility. In this paper, artificial speech bandwidth extension (ABE) of NB speech toward missing frequencies below about 300 Hz (low-frequency band, LB) is proposed to enhance the speech quality. The LB-ABE in this paper is employed together with a preexisting ABE toward high-frequency components to obtain spectrally balanced speech signals. In an instrumental quality assessment, the spectral distance in the LB was improved by more than 5 dB compared to NB speech. In a subjective listening test, the gap of speech quality between wideband and NB speech was significantly reduced when employing the proposed ABE toward low frequencies. The LB extension was found to further improve the preexisting ABE toward higher frequencies by a significant 0.26 CMOS points.

A New Framework for Artificial Bandwidth Extension Using H∞ Filtering

Gupta

Shekhawat

Sinha

2022

Circuits Syst Signal Process
