This work proposes a novel method for 3D direction of arrival (DOA) estimation based on sound intensity vector estimation, via the encoding of the signals of a spherical microphone array from the space domain to the spherical harmonic domain. The sound intensity vector is estimated on detected single source zones (SSZs), where one source is dominant. A smoothed 2D histogram of these estimates reveals the DOAs of the sources present, and an iterative process then yields accurate 3D DOA information. The performance of the proposed method is demonstrated through simulations in various signal-to-noise ratio and reverberation conditions.
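As a rough illustration of the intensity-based idea in the abstract above, the following minimal sketch estimates a pseudo-intensity vector from first-order spherical harmonic signals and reads a single DOA off a 2D azimuth/elevation histogram. It is not the authors' implementation: the function names, the unnormalized (W, X, Y, Z) channel convention, the single plane-wave model, and the bin counts are all assumptions made for illustration, and SSZ detection, histogram smoothing, and the iterative multi-source step are omitted.

```python
import numpy as np


def encode_first_order(signal, azimuth, elevation):
    """Encode a mono signal as a plane wave into (W, X, Y, Z).

    Assumption: unnormalized first-order channels, angles in radians,
    with (X, Y, Z) equal to the signal scaled by the unit vector that
    points towards the source.
    """
    ux = np.cos(elevation) * np.cos(azimuth)
    uy = np.cos(elevation) * np.sin(azimuth)
    uz = np.sin(elevation)
    return np.stack([signal, ux * signal, uy * signal, uz * signal])


def pseudo_intensity_doa(channels):
    """Return one (azimuth, elevation) estimate per sample or bin.

    The pseudo-intensity vector is Re{conj(W) * [X, Y, Z]}; under the
    encoding convention above it points towards the source.
    """
    w, xyz = channels[0], channels[1:]
    intensity = np.real(np.conj(w) * xyz)  # shape (3, N)
    azimuth = np.arctan2(intensity[1], intensity[0])
    elevation = np.arctan2(intensity[2], np.hypot(intensity[0], intensity[1]))
    return azimuth, elevation


def histogram_peak(azimuth, elevation, bins=(72, 36)):
    """Return the centre of the fullest bin of a 2D DOA histogram."""
    hist, az_edges, el_edges = np.histogram2d(
        azimuth,
        elevation,
        bins=bins,
        range=[[-np.pi, np.pi], [-np.pi / 2, np.pi / 2]],
    )
    i, j = np.unravel_index(np.argmax(hist), hist.shape)
    az_peak = 0.5 * (az_edges[i] + az_edges[i + 1])
    el_peak = 0.5 * (el_edges[j] + el_edges[j + 1])
    return az_peak, el_peak
```

With several simultaneous sources, a single histogram peak is no longer sufficient, which is where restricting the estimates to single source zones and iterating over peaks, as described in the abstract, would come in.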
Current methods for immersive playback of spatial sound content aim at flexibility in terms of encoding and decoding, abstracting the two from the recording or playback setup. Ambisonics is one such method; however, it is signal-independent and, at low spatial resolutions, fails to provide appropriate spatialization cues to the listener, with potentially severe colouration effects and localization ambiguity. We present a new signal-dependent method for parametric analysis and synthesis of ambisonic sound scenes that takes advantage of the flexibility of Ambisonics as a spatial audio format, while improving reproduction. The proposed approach considers a more general acoustic model than previous proposals, with multiple source signals and a non-isotropic ambient component. According to a listening test using headphones, the method is perceived as closer to binaural reference sound scenes than ambisonic playback.
This article details an investigation into the perceptual effects of different rendering strategies when synthesizing loudspeaker array room impulse responses (RIRs) using microphone array RIRs in a parametric fashion. The aim of this rendering task is to faithfully reproduce the spatial characteristics of a captured space, encoded within the input microphone array RIR (or the spherical harmonic RIR derived from it), over a loudspeaker array. For this study, a higher-order formulation of the Spatial Impulse Response Rendering (SIRR) method is introduced and subsequently employed to investigate the perceptual effects of the following rendering configurations: the spherical harmonic input order, the frequency resolution, and the use of dedicated diffuse stream rendering. Formal listening tests were conducted using a 64-channel loudspeaker array in an anechoic chamber, where simulated reference scenarios were compared against the outputs of different methods and rendering configurations. The test results indicate that dedicated diffuse stream rendering and higher analysis orders both yield noticeable perceptual improvements, particularly when employing problematic transient stimuli as input. Additionally, it was found that the frequency resolution employed during rendering has only a minor influence on the perceived accuracy of the reproduction in comparison to the other two tested attributes.