Traditionally, speech coding and audio coding were separate worlds. Because they rest on different technical approaches and different assumptions about the source signal, neither coding scheme could efficiently represent both speech and music at low bitrates. This paper presents a unified speech and audio codec that efficiently combines techniques from both worlds, resulting in consistently high quality for speech, music and mixed audio content. The paper gives an overview of the codec architecture and presents results of formal listening tests comparing the new codec with HE-AAC(v2) and AMR-WB+. The new codec forms the basis of the reference model in the ongoing MPEG standardization activity for Unified Speech and Audio Coding.
The newly standardized 3GPP codec for Enhanced Voice Services (EVS) was developed for mobile services such as VoLTE, where error resilience is essential. This paper outlines the advances in packet loss concealment made during EVS development by giving a high-level description of all technical features present in the final standardized codec. Coupled with jitter buffer management, the EVS codec provides robustness against late or lost packets. The advantages of the new EVS codec over reference codecs are further discussed based on listening test results.
This paper describes new time-domain techniques for concealing packet loss in the new 3GPP Enhanced Voice Services codec. Enhancements to the existing ACELP concealment methods include guided, improved pitch prediction and increased flexibility and accuracy of pulse resynchronization. Furthermore, a new method of separate linear predictive (LP) filter synthesis aims to improve sound quality in the case of multiple packet losses, especially for noisy signals. Another enhancement is a guided LP concealment approach that limits the risk of creating artifacts during recovery. These enhancements are also used in the presented advanced TCX concealment method. Subjective listening tests show that quality is significantly increased with these methods.
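As a rough illustration of what time-domain concealment means in general, the sketch below fills one lost frame by repeating the last pitch cycle of the previously decoded signal with a gradual fade-out. It is a generic waveform-repetition example, not the EVS algorithm: the abstract's guided pitch prediction, pulse resynchronization and separate LP filter synthesis are not modelled, and the function names, lag range and fade gain are assumptions chosen for illustration only.

```python
import math


def estimate_pitch_lag(history, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the highest
    normalized autocorrelation over the given signal history."""
    n = len(history)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        num = sum(history[i] * history[i - lag] for i in range(lag, n))
        energy_a = sum(history[i] ** 2 for i in range(lag, n))
        energy_b = sum(history[i - lag] ** 2 for i in range(lag, n))
        den = math.sqrt(energy_a * energy_b)
        score = num / den if den > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def conceal_lost_frame(history, frame_len, min_lag=20, max_lag=160, end_gain=0.5):
    """Synthesize a substitute frame by repeating the last pitch cycle
    of `history`, fading linearly from gain 1.0 towards `end_gain`.

    All parameter values here are illustrative assumptions.
    """
    lag = estimate_pitch_lag(history, min_lag, max_lag)
    cycle = history[-lag:]  # most recent pitch cycle
    frame = []
    for i in range(frame_len):
        gain = 1.0 - (1.0 - end_gain) * i / frame_len
        frame.append(gain * cycle[i % lag])
    return lag, frame


if __name__ == "__main__":
    # Synthetic "voiced" signal with a period of 50 samples.
    past = [math.sin(2 * math.pi * i / 50) for i in range(400)]
    lag, substitute = conceal_lost_frame(past, 160, min_lag=20, max_lag=80)
    print("estimated lag:", lag, "substitute frame length:", len(substitute))
```

The fade-out reflects the common practice of attenuating the substitute signal as a loss burst lengthens, so that a stale pitch cycle is not repeated at full level.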
Speech intelligibility is an important aspect of speech transmission, but often only quality is evaluated in perceptual tests when speech coding standards are compared. In this study, the performance of three wideband speech coding standards, adaptive multi-rate wideband (AMR-WB), G.718, and enhanced voice services (EVS), is evaluated in a subjective intelligibility test. The test covers different packet loss conditions as well as a near-end background noise condition. Additionally, an objective quality evaluation is conducted under different packet loss conditions. All of the test conditions extend beyond the specification range in order to evaluate the attainable performance of the codecs in extreme conditions. The results of the subjective tests show that both EVS and G.718 are better than AMR-WB in terms of intelligibility. EVS attains the same performance as G.718 with lower algorithmic delay.