Traditionally, speech coding and audio coding were separate worlds. Based on different technical approaches and different assumptions about the source signal, neither type of coding scheme could efficiently represent both speech and music at low bitrates. This paper presents a unified speech and audio codec that efficiently combines techniques from both worlds. The result is a codec that exhibits consistently high quality for speech, music and mixed audio content. The paper gives an overview of the codec architecture and presents results of formal listening tests comparing this new codec with HE-AAC(v2) and AMR-WB+. This new codec forms the basis of the reference model in the ongoing MPEG standardization activity for Unified Speech and Audio Coding.
Lifting scheme based integer transforms are very powerful tools to construct lossless coding schemes. These transforms, such as the Integer Fast Fourier Transform (IntFFT) and the Integer Modified Discrete Cosine Transform (IntMDCT), are integer approximations of the original floating-point transforms, and hence there is an approximation error in the transform domain. This paper will propose structures for improved integer transforms in terms of improved approximation accuracy and computational efficiency. Experimental results will show that clear improvements in these two points are achieved in lossless audio coding.
The Modified Discrete Cosine Transform (MDCT) is widely used in modern perceptual audio coding schemes. In this paper we present an integer approximation of this lapped transform, called IntMDCT, which is derived from the MDCT using the lifting scheme. This reversible integer transform inherits most of the attractive properties of the MDCT, exhibiting a good spectral representation of the audio signal, critical sampling and overlapping of blocks. This makes the IntMDCT well suited both for lossless audio coding and for combined perceptual and lossless audio coding. A scalable system is presented providing a lossless enhancement of perceptual audio coding schemes, such as MPEG-2 AAC.
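As a rough illustration of the lifting idea behind such reversible integer transforms: a plane (Givens) rotation can be factored into three shear (lifting) steps, and rounding inside each step keeps the mapping integer-to-integer while remaining exactly invertible. The sketch below is not taken from the paper; the function names `int_rotate` and `int_rotate_inverse`, the rounding rule, and the use of a single rotation angle are assumptions made for illustration only.

```python
import math


def _round(value):
    # Round half up; any deterministic rounding rule works, as long as
    # the forward and inverse transforms use the same one.
    return math.floor(value + 0.5)


def int_rotate(x, y, angle):
    """Integer approximation of the rotation [[c, -s], [s, c]] applied to (x, y).

    Uses the factorisation into three lifting steps with coefficients
    p = (c - 1) / s = -tan(angle / 2) and s = sin(angle).
    The angle must not be an odd multiple of pi (tan would be undefined).
    """
    p = -math.tan(angle / 2.0)
    s = math.sin(angle)
    x = x + _round(p * y)   # first lifting step
    y = y + _round(s * x)   # second lifting step
    x = x + _round(p * y)   # third lifting step
    return x, y


def int_rotate_inverse(x, y, angle):
    """Exact inverse of int_rotate: undo the lifting steps in reverse order."""
    p = -math.tan(angle / 2.0)
    s = math.sin(angle)
    x = x - _round(p * y)
    y = y - _round(s * x)
    x = x - _round(p * y)
    return x, y


if __name__ == "__main__":
    u, v = int_rotate(1000, -250, 0.6)
    print((u, v), int_rotate_inverse(u, v, 0.6))
```

Because each step only adds a rounded function of the other component, subtracting the same rounded value recovers the input exactly, regardless of the rounding error introduced in the forward direction.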
Recently, lifting-based integer transforms have received much attention, especially in the area of lossless audio and image coding. The usual approach is to apply the lifting scheme to each Givens rotation. Especially in the case of long transform sizes in audio coding applications, this leads to a considerable approximation error in the frequency domain. This paper presents a multidimensional lifting approach for reducing this approximation error. In this approach, large parts of the transform are calculated without rounding operations; only the output is rounded and added. The new approach is applied and evaluated for both the Integer Modified Discrete Cosine Transform (IntMDCT) and the Integer Fast Fourier Transform (IntFFT).
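The idea of computing a whole transform block in floating point and rounding only its output can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses the block factorisation of diag(T, T^-1) into three lifting steps, takes T to be an orthonormal DCT-IV matrix (which is its own inverse), and all function names are hypothetical.

```python
import math


def dct4_matrix(n):
    """Orthonormal DCT-IV matrix of size n x n (symmetric and self-inverse)."""
    scale = math.sqrt(2.0 / n)
    return [[scale * math.cos(math.pi / n * (k + 0.5) * (m + 0.5))
             for m in range(n)] for k in range(n)]


def rounded_matvec(mat, vec):
    """Full-precision matrix-vector product; only the output is rounded."""
    return [math.floor(sum(c * v for c, v in zip(row, vec)) + 0.5)
            for row in mat]


def md_lift_forward(t, t_inv, x1, x2):
    """Map integer blocks (x1, x2) to integer approximations of (T x1, T^-1 x2)."""
    a = list(x2)
    b = [p + q for p, q in zip(x1, rounded_matvec(t_inv, x2))]  # step 1
    a = [p - q for p, q in zip(a, rounded_matvec(t, b))]        # step 2
    b = [p + q for p, q in zip(b, rounded_matvec(t_inv, a))]    # step 3
    return [-p for p in a], b


def md_lift_inverse(t, t_inv, y1, y2):
    """Exact inverse: subtract the same rounded blocks in reverse order."""
    a = [-p for p in y1]
    b = [p - q for p, q in zip(y2, rounded_matvec(t_inv, a))]
    a = [p + q for p, q in zip(a, rounded_matvec(t, b))]
    x1 = [p - q for p, q in zip(b, rounded_matvec(t_inv, a))]
    return x1, a


if __name__ == "__main__":
    t = dct4_matrix(4)
    x1, x2 = [10, -3, 7, 0], [1, 2, -8, 5]
    y1, y2 = md_lift_forward(t, t, x1, x2)
    print(y1, y2, md_lift_inverse(t, t, y1, y2) == (x1, x2))
```

Each lifting step here rounds one output vector per block, instead of rounding after every individual rotation, which is the motivation the abstract gives for the lower approximation error.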
Recently, lifting-based integer approximations of filter banks have received much attention, especially in the field of image coding. This paper focuses on the application of these techniques to cosine modulated filter banks for audio coding, including not only the Modified Discrete Cosine Transform (MDCT) but also low delay filter banks. Applications of these integer filter banks include lossless audio coding and backward compatible lossless enhancement of MDCT-based perceptual audio coding schemes, such as MPEG-2/4 AAC.