A subjective listening test of six different artificial bandwidth extension approaches in English, Chinese, German, and Korean

Abel, Johannes; Kaniewska, Magdalena; Guillaume, Cyril; Tirry, Wouter; Pulakka, Hannu; Myllylä, Ville; Sjöberg, Jari; Alku, Paavo; Katsir, Itai; Malah, D.; Cohen, Israel; Turan, M. A. Tugtekin; Erzin, Engin; Schlien, Thomas; Vary, Peter; Nour-Eldin, Amr H.; Kabal, P.; Fingscheidt, Tim

doi:10.1109/icassp.2016.7472812

Cited by 11 publications

(4 citation statements)

References 10 publications

“…Most of the formant frequencies are still present in the AMR condition, however, with a missing LB, a spectral imbalance towards high frequencies results, which consequently affects female speakers stronger than male speakers. Of course, UB-ABE improves speech quality [28], [33], however, it does not sufficiently restore spectral balance over sounds, especially for female speakers. In [12] it was already stated, that only the simultaneous extension towards high and low frequencies leads to the maximum improvement possible, rather than the exclusive use of only one of the techniques.…”

Section: Subjective Assessment (mentioning)

confidence: 95%

“…Opposed to the source-filter model, UB spectral magnitudes and UB phases can be estimated right away using sum-product networks (SPNs) [29], DNNs [30], [31], or recurrent neural networks (RNNs) [32], which can then be transformed back to the time domain by an overlap-add (OLA) structure. In several studies, an increased speech quality when using ABE solutions was shown [18], [33].…”

Section: Introduction (mentioning)

confidence: 99%

Sinusoidal-Based Lowband Synthesis for Artificial Speech Bandwidth Extension

Abel; Fingscheidt. 2019. IEEE/ACM Trans. Audio Speech Lang. Process. (Self Cite)

Conventional narrowband (NB) telephony suffers from limited acoustic bandwidth at the receiver side, leading to degraded speech quality and intelligibility. In this paper, artificial speech bandwidth extension (ABE) of NB speech toward missing frequencies below about 300 Hz (low-frequency band, LB) is proposed to enhance the speech quality. The LB-ABE in this paper is employed together with a preexisting ABE toward high-frequency components to obtain spectrally balanced speech signals. In an instrumental quality assessment, the spectral distance in the LB was improved by more than 5 dB compared to NB speech. In a subjective listening test, the gap of speech quality between wideband and NB speech was significantly reduced when employing the proposed ABE toward low frequencies. The LB extension was found to further improve the preexisting ABE toward higher frequencies by a significant 0.26 CMOS points.

“…Early works used signal processing methods such as a source-filter model [23], [24], nonlinear devices [19], or spectral band replication [25]. Other approaches were based on data-driven techniques, such as Gaussian mixture models [26], [27], hidden Markov models [28], or shallow neural networks [29], [30].…”

Section: A Audio Bandwidth Extension (mentioning)

confidence: 99%

“…Early works in audio bandwidth extension focused on speech signals and employed diverse signal processing methods, including source-filter models [13], [14], and codebook mapping [15]. The first attempts at music audio bandwidth extension used nonlinear devices [16] and spectral band replication [17].…”

Section: A Audio Bandwidth Extension and Super-resolution (mentioning)

confidence: 99%

BEHM-GAN: Bandwidth Extension of Historical Music Using Generative Adversarial Networks

Moliner; Välimäki. 2023. IEEE/ACM Trans. Audio Speech Lang. Process.

Audio bandwidth extension aims to expand the spectrum of bandlimited audio signals. Although this topic has been broadly studied during recent years, the particular problem of extending the bandwidth of historical music recordings remains an open challenge. This paper proposes a method for the bandwidth extension of historical music using generative adversarial networks (BEHM-GAN) as a practical solution to this problem. The proposed method works with the complex spectrogram representation of audio and, thanks to a dedicated regularization strategy, can effectively extend the bandwidth of out-of-distribution real historical recordings. The BEHM-GAN is designed to be applied as a second step after denoising the recording to suppress any additive disturbances, such as clicks and background noise. We train and evaluate the method using solo piano classical music. The proposed method outperforms the compared baselines in both objective and subjective experiments. The results of a formal blind listening test show that BEHM-GAN significantly increases the perceptual sound quality in early-20th-century gramophone recordings. For several items, there is a substantial improvement in the mean opinion score after enhancing historical recordings with the proposed bandwidth extension algorithm. This study represents a relevant step toward data-driven music restoration in real-world scenarios.
