Srikanth Korse scite author profile

Srikanth Korse

5 Publications

38 Citation Statements Received

41 Citation Statements Given


Affiliations

Fraunhofer Institute for Integrated Circuits

Publications


Enhancement of Coded Speech Using a Mask-Based Post-Filter

Korse, Gupta, Fuchs (2020)


The quality of speech codecs deteriorates at low bitrates due to high quantization noise. A post-filter is generally employed to enhance the quality of the coded speech. In this paper, a data-driven post-filter relying on masking in the time-frequency domain is proposed. A fully connected neural network (FCNN), a convolutional encoder-decoder (CED) network and a long short-term memory (LSTM) network are implemented to estimate a real-valued mask per time-frequency bin. The proposed models were tested on the five lowest operating modes (6.65 kbps-15.85 kbps) of the Adaptive Multi-Rate Wideband codec (AMR-WB). Both objective and subjective evaluations confirm the enhancement of the coded speech and also show the superiority of the mask-based neural network system over a conventional heuristic post-filter used in standards such as ITU-T G.718.
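The masking idea in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the spectra are random placeholders rather than real coded speech, the function names `oracle_mask` and `apply_mask` are hypothetical, and the mask is computed from a known clean signal (an "oracle" target) where the paper uses a neural network to estimate it.

```python
import numpy as np

def oracle_mask(clean_mag, coded_mag, max_gain=10.0, eps=1e-12):
    """Hypothetical training target: per-bin ratio of clean to coded
    magnitude, clipped to [0, max_gain]. The clipping value is an assumption."""
    return np.clip(clean_mag / (coded_mag + eps), 0.0, max_gain)

def apply_mask(coded_spec, mask):
    """Scale each complex time-frequency bin by a real-valued gain.
    A real, non-negative gain changes the magnitude and keeps the phase."""
    return mask * coded_spec

# Placeholder spectra (frames x frequency bins); not real speech data.
rng = np.random.default_rng(0)
shape = (4, 8)
clean = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
noise = 0.3 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
coded = clean + noise  # stand-in for a spectrum with quantization noise

mask = oracle_mask(np.abs(clean), np.abs(coded))
enhanced = apply_mask(coded, mask)
```

In a real post-filter the mask would come from the trained network given only the coded signal, and the masked spectrum would be transformed back to the time domain.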


Entropy Coding of Spectral Envelopes for Speech and Audio Coding Using Distribution Quantization

Korse, Jahnel, Bäckström (2016)


Speech and audio codecs model the overall shape of the signal spectrum using envelope models. In speech coding the predominant approach is linear predictive coding, which offers high coding efficiency at the cost of computational complexity and a rigid systems design. Audio codecs are usually based on scale factor bands, whose calculation and coding are simple, but whose coding efficiency is lower than that of linear prediction. In the current work we propose an entropy coding approach for scale factor bands, with the objective of reaching the same coding efficiency as linear prediction, but simultaneously retaining a low computational complexity. The proposed method is based on quantizing the distribution of spectral mass using beta distributions. Our experiments show that the perceptual quality achieved with the proposed method is similar to that of linear predictive models with the same bit rate, while the design simultaneously allows variable bit-rate coding and can easily be scaled to different sampling rates. The algorithmic complexity of the proposed method is less than one third of traditional multi-stage vector quantization of linear predictive envelopes.


GMM-Based Iterative Entropy Coding for Spectral Envelopes of Speech and Audio

Korse, Fuchs, Bäckström (2018)


Spectral envelope modelling is a central part of speech and audio codecs and is traditionally based on either vector quantization or scalar quantization followed by entropy coding. To bridge the coding performance of vector quantization with the low complexity of the scalar case, we propose an iterative approach for entropy coding the spectral envelope parameters. For each parameter, a univariate probability distribution is derived from a Gaussian mixture model of the joint distribution, with the previously quantized parameters used as a-priori information. Parameters are then iteratively and individually scalar quantized and entropy coded. Unlike vector quantization, the complexity of the proposed method does not increase exponentially with dimension and bitrate. Moreover, the coding resolution and dimension can be adaptively modified without retraining the model. Experimental results show that these important advantages do not impair coding efficiency compared to a state-of-the-art vector quantization scheme.
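The iterative scheme described in this abstract can be illustrated with a small sketch. It is a minimal illustration under stated assumptions, not the paper's implementation: the GMM parameters are made-up toy values, the function names are hypothetical, a uniform quantizer with a fixed step is assumed, and the "bits" are the ideal code length (negative log2 of the interval probability) rather than the output of an actual arithmetic coder.

```python
import math
import numpy as np

def conditional_gmm(weights, means, covs, x_prev):
    """Univariate mixture p(x_i | x_0..x_{i-1}) from a joint GMM, i = len(x_prev).
    Returns (mixture weights, component means, component variances)."""
    i = len(x_prev)
    if i == 0:  # first parameter: plain marginal of each component
        return (np.asarray(weights, dtype=float),
                np.array([m[0] for m in means]),
                np.array([c[0, 0] for c in covs]))
    log_w, c_mean, c_var = [], [], []
    for w, mu, S in zip(weights, means, covs):
        S_pp, S_ip = S[:i, :i], S[i, :i]
        d = x_prev - mu[:i]
        sol = np.linalg.solve(S_pp, d)
        c_mean.append(mu[i] + S_ip @ sol)  # Gaussian conditional mean
        c_var.append(S[i, i] - S_ip @ np.linalg.solve(S_pp, S_ip))
        _, logdet = np.linalg.slogdet(S_pp)
        # component responsibility given the already-quantized parameters
        log_w.append(math.log(w)
                     - 0.5 * (d @ sol + logdet + i * math.log(2 * math.pi)))
    log_w = np.array(log_w)
    w_post = np.exp(log_w - log_w.max())
    return w_post / w_post.sum(), np.array(c_mean), np.array(c_var)

def interval_bits(weights, means, variances, lo, hi):
    """Ideal code length of the quantization cell [lo, hi] under the mixture."""
    def cdf(x, m, v):
        return 0.5 * (1.0 + math.erf((x - m) / math.sqrt(2.0 * v)))
    p = sum(w * (cdf(hi, m, v) - cdf(lo, m, v))
            for w, m, v in zip(weights, means, variances))
    return -math.log2(max(p, 1e-300))

def encode_cost(x, weights, means, covs, step=0.5):
    """Scalar-quantize parameters one at a time, conditioning each on the
    previously quantized ones; return quantized values and total ideal bits."""
    x_hat, bits = [], 0.0
    for xi in x:
        w, m, v = conditional_gmm(weights, means, covs, np.array(x_hat))
        q = step * float(np.round(xi / step))
        bits += interval_bits(w, m, v, q - step / 2, q + step / 2)
        x_hat.append(q)
    return np.array(x_hat), bits

# Toy two-component, two-dimensional GMM (assumed values for illustration).
weights = [0.6, 0.4]
means = [np.array([0.0, 0.0]), np.array([2.0, 2.0])]
covs = [np.array([[1.0, 0.8], [0.8, 1.0]]),
        np.array([[1.0, -0.3], [-0.3, 1.0]])]
x_hat, bits = encode_cost(np.array([0.3, 0.4]), weights, means, covs)
```

Because each step only needs a one-dimensional distribution, changing `step` or coding fewer parameters requires no retraining of the mixture, which mirrors the flexibility claimed in the abstract.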


PostGAN: A GAN-Based Post-Processor to Enhance the Quality of Coded Speech

Korse, Pia, Gupta, et al. (2022)


A DNN Based Post-Filter to Enhance the Quality of Coded Speech in MDCT Domain

Gupta, Korse, Edler, et al. (2022)


