We propose a method to create a directional sound source in front of a linear loudspeaker array. The method uses the array to create clusters of focused sources that form multipoles, and superposes the multipoles to synthesize a directivity pattern. We also derive an efficient multipole structure in which adjacent lower-order multipoles overlap. This structure reduces the number of focused sources, thereby reducing the algorithmic complexity needed to create them. To further reduce complexity, we also derive a time-domain implementation of the proposed method. To mitigate degradation in the reproduced directivity due to superposition of the inaccurate sound fields of focused sources, fractional delay interpolation is applied. Computer simulation results indicate that the proposed method, based on superposition of multipoles up to the third order, creates a directional sound source at significantly lower complexity than a conventional method.
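The fractional delay step mentioned above can be illustrated with a minimal sketch. It assumes plain linear interpolation, which the abstract does not specify; the function name `fractional_delay` and its parameters are hypothetical and are not taken from the paper. The idea is only that a time-domain delay that falls between sample instants can be approximated by interpolating neighbouring samples.

```python
import numpy as np

def fractional_delay(signal, delay):
    """Delay `signal` by a possibly non-integer number of samples.

    Assumption: linear interpolation between neighbouring samples.
    The paper may use a different (higher-order) interpolator.
    """
    signal = np.asarray(signal, dtype=float)
    n = np.arange(len(signal))
    # Position in the input signal that each output sample is read from.
    src = n - delay
    # Positions outside the input range are filled with zeros.
    return np.interp(src, n, signal, left=0.0, right=0.0)

# Usage: delay a ramp by half a sample.
ramp = [0.0, 1.0, 2.0, 3.0, 4.0]
print(fractional_delay(ramp, 0.5))
```

Linear interpolation is the cheapest choice and attenuates high frequencies; a longer interpolation filter would trade complexity for accuracy.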
This paper describes new time-domain techniques for concealing packet loss in the new 3GPP Enhanced Voice Services codec. Enhancements to the existing ACELP concealment methods include improved, guided pitch prediction and more flexible and accurate pulse resynchronization. Furthermore, a new method of separate linear predictive (LP) filter synthesis aims to improve sound quality in the case of multiple packet losses, especially for noisy signals. Another enhancement is a guided LP concealment approach that limits the risk of creating artifacts during recovery. These enhancements are also used in the presented advanced TCX concealment method. Subjective listening tests show that quality is significantly increased with these methods.
NTT is promoting research and development to enable people with various physical conditions, including disabilities, to enjoy watching sports. Focusing on goalball, which is a type of parasport, this article introduces a new way of watching sports that provides a sense of realism, as if the spectators were watching the game at the competition venue. This experience is made possible by using an ultra-realistic communication technology called Kirari! (in particular, its highly realistic sound-image localization technology) to produce stereophonic sound.
SUMMARY We propose a new adaptive spectral masking method of algebraic vector quantization (AVQ) for non-sparse signals in the modified discrete cosine transform (MDCT) domain. This paper also proposes switching the adaptive spectral masking on and off depending on whether or not the target signal is non-sparse. The switching decision is based on the results of MDCT-domain sparseness analysis. When the target signal is categorized as non-sparse, the masking level of the target MDCT coefficients is adaptively controlled using spectral envelope information. The performance of the proposed method, as a part of ITU-T G.711.1 Annex D, is evaluated in comparison with conventional AVQ. Subjective listening test results showed that the proposed method improves sound quality by more than 0.1 points on a five-point scale on average for speech, music, and mixed content, which indicates significant improvement.
key words: speech and audio coding, standardization, ITU-T G.711.1 Annex D, ITU-T G.722 Annex B, super-wideband (SWB) extension, algebraic vector quantization (AVQ)
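The on/off switching described in the summary above can be sketched as follows. This is a minimal illustration only: the summary does not say how sparseness is measured, so the sketch assumes the Hoyer sparseness measure (based on the ratio of the L1 and L2 norms) and an arbitrary threshold of 0.5. The function names and the threshold are hypothetical and are not taken from ITU-T G.711.1 Annex D.

```python
import numpy as np

def hoyer_sparseness(coeffs):
    """Sparseness in [0, 1]: 1 for a single non-zero coefficient, 0 for a flat spectrum.

    Assumption: Hoyer's measure stands in for the (unspecified) analysis in the paper.
    Requires at least two coefficients.
    """
    c = np.abs(np.asarray(coeffs, dtype=float))
    l2 = np.sqrt(np.sum(c ** 2))
    if l2 == 0.0:
        return 0.0  # assumption: an all-zero frame is treated as non-sparse
    l1 = np.sum(c)
    root_n = np.sqrt(c.size)
    return (root_n - l1 / l2) / (root_n - 1.0)

def use_adaptive_masking(mdct_coeffs, threshold=0.5):
    """Enable masking only for frames classified as non-sparse (hypothetical rule)."""
    return hoyer_sparseness(mdct_coeffs) < threshold

# Usage: a flat (non-sparse) frame versus a single-peak (sparse) frame.
print(use_adaptive_masking([1.0, 1.0, 1.0, 1.0]))
print(use_adaptive_masking([0.0, 0.0, 1.0, 0.0]))
```

In this sketch the flat frame is classified as non-sparse and would have masking switched on, while the single-peak frame would be coded without it.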