Ralf Geiger scite author profile

Traditionally, speech coding and audio coding were separate worlds. Based on different technical approaches and different assumptions about the source signal, neither of the two coding schemes could efficiently represent both speech and music at low bitrates. This paper presents a unified speech and audio codec, which efficiently combines techniques from both worlds. This results in a codec that exhibits consistently high quality for speech, music and mixed audio content. The paper gives an overview of the codec architecture and presents results of formal listening tests comparing this new codec with HE-AAC(v2) and AMR-WB+. This new codec forms the basis of the reference model in the ongoing MPEG standardization activity for Unified Speech and Audio Coding.

Audio Data Hiding with High Data Rates Based on IntMDCT

Yokotani

2006

Low Delay Filterbanks for Enhanced Low Delay Audio Coding

Schnell

Schmidt

et al. 2007

Improved integer transforms for lossless audio coding

Yokotani

Lifting scheme based integer transforms are very powerful tools to construct lossless coding schemes. These transforms, such as the Integer Fast Fourier Transform (IntFFT) and the Integer Modified Discrete Cosine Transform (IntMDCT), are integer approximations of the original floating-point transforms, and hence there is an approximation error in the transform domain. This paper will propose structures for improved integer transforms in terms of improved approximation accuracy and computational efficiency. Experimental results will show that clear improvements in these two points are achieved in lossless audio coding.

IntMDCT - A link between perceptual and lossless audio coding

Herre

Koller

et al. 2002

The Modified Discrete Cosine Transform (MDCT) is widely used in modern perceptual audio coding schemes. In this paper we present an integer approximation of this lapped transform, called IntMDCT, which is derived from the MDCT using the lifting scheme. This reversible integer transform inherits most of the attractive properties of the MDCT, exhibiting a good spectral representation of the audio signal, critical sampling and overlapping of blocks. This makes the IntMDCT well suited for both lossless audio coding and combined perceptual and lossless audio coding. A scalable system is presented providing a lossless enhancement of perceptual audio coding schemes, such as MPEG-2 AAC.
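The lifting idea mentioned in this abstract can be illustrated with a minimal sketch of a single reversible integer rotation. It is not taken from the paper: the function names `int_rotate` and `int_rotate_inverse` are hypothetical, and the sketch only shows the standard decomposition of a Givens rotation into three lifting steps, each followed by rounding. It assumes the angle is not a multiple of pi, since the lifting coefficient divides by the sine.

```python
import math


def int_rotate(x, y, angle):
    """Integer approximation of a Givens rotation via three lifting steps.

    Assumes sin(angle) != 0. Names and structure are illustrative only.
    """
    s = math.sin(angle)
    p = (math.cos(angle) - 1.0) / s  # lifting coefficient, equals -tan(angle / 2)
    x = x + round(p * y)  # first lifting step (updates x only)
    y = y + round(s * x)  # second lifting step (updates y only)
    x = x + round(p * y)  # third lifting step (updates x only)
    return x, y


def int_rotate_inverse(x, y, angle):
    """Exact inverse: undo the lifting steps in reverse order with the same rounding."""
    s = math.sin(angle)
    p = (math.cos(angle) - 1.0) / s
    x = x - round(p * y)
    y = y - round(s * x)
    x = x - round(p * y)
    return x, y
```

Because each step changes only one of the two values and adds a rounded function of the other, the inverse can subtract exactly the same rounded quantity. Integer input is therefore recovered without loss, while the output stays close to the floating-point rotation.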

Improved integer transforms using multi-dimensional lifting [audio coding examples]

Yokotani

Schuller

et al.

Recently lifting-based integer transforms have received much attention, especially in the area of lossless audio and image coding. The usual approach is to apply the lifting scheme to each Givens rotation. Especially in the case of long transform sizes in audio coding applications, this leads to a considerable approximation error in the frequency domain. This paper presents a multidimensional lifting approach for reducing this approximation error. In this approach, large parts of the transform are calculated without rounding operations, only the output is rounded and added. The new approach is applied and evaluated for both the Integer Modified Discrete Cosine Transform (IntMDCT) and the Integer Fast Fourier Transform (IntFFT).
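The principle described here, computing a whole transform in floating point and rounding only the output before adding it, can be sketched as a single block lifting step. This is an illustration under stated assumptions, not the paper's full structure: `dct_iv`, `block_lift` and `block_unlift` are hypothetical names, and the DCT-IV is chosen only as an example transform.

```python
import math


def dct_iv(x):
    """Orthonormal DCT-IV of a sequence, computed directly in floating point."""
    n = len(x)
    scale = math.sqrt(2.0 / n)
    return [
        scale * sum(x[k] * math.cos(math.pi / n * (k + 0.5) * (m + 0.5))
                    for k in range(n))
        for m in range(n)
    ]


def block_lift(x1, x2, transform=dct_iv):
    """One block lifting step: x2 <- x2 + round(T(x1)); x1 passes through unchanged.

    The transform itself runs without rounding; only its output is rounded.
    """
    t = transform(x1)
    return x1, [b + round(v) for b, v in zip(x2, t)]


def block_unlift(x1, y2, transform=dct_iv):
    """Inverse step: recompute round(T(x1)) from the unchanged x1 and subtract it."""
    t = transform(x1)
    return x1, [b - round(v) for b, v in zip(y2, t)]
```

The step is invertible for any transform because `x1` is left untouched, so the same rounded values can be recomputed and subtracted. Only one rounding per output sample occurs, instead of one per Givens rotation inside the transform.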

Integer low delay and MDCT filter banks

2002

Recently lifting-based integer approximations of filter banks have received much attention, especially in the field of image coding. This paper focuses on the application of these techniques to cosine modulated filter banks for audio coding, including not only the Modified Discrete Cosine Transform (MDCT) but also low delay filter banks. Applications of these integer filter banks include lossless audio coding and backward compatible lossless enhancement of MDCT-based perceptual audio coding schemes, such as MPEG-2/4 AAC.

Fine grain scalable perceptual and lossless audio coding based on IntMDCT

Sporer

2003