Abstract: Consider a nonparametric representation of acoustic wave fields that consists of observing the sound pressure along a straight line or a smooth contour defined in space. The observed data contain implicit information about the surrounding acoustic scene, both in terms of the spatial arrangement of the sources and their respective temporal evolution. We show that such data can be effectively analyzed and processed in what we call the space-time-frequency representation space, consisting of a Gabor representation across the spatio-temporal manifold defined by the spatial axis and the temporal axis. In the presence of a source, the resulting spectral patterns have a characteristic triangular shape that changes according to certain parameters, such as the source distance and direction, the number of sources, the concavity of the contour, and the analysis window size. Yet, in general, the wave fronts can be expressed as a function of elementary directional components, most notably plane waves and far-field components. Furthermore, we address the problem of processing the wave field in discrete space and time, i.e., sampled along both axes, where a Gabor representation implies that the wave fronts are processed in a block-wise fashion. The key challenge is how to choose and customize a spatio-temporal filter bank such that it exploits the physical properties of the wave field while satisfying strict requirements such as perfect reconstruction, critical sampling, and computational efficiency. We discuss the architecture of such filter banks, and demonstrate their applicability in the context of real applications, such as spatial filtering, deconvolution, and wave field coding.
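One way to build intuition for the triangular spectral pattern is to look at the local spatial frequency that a single point source produces along the observation line. The sketch below is illustrative only and is not taken from the paper: it assumes a free-field point source emitting a pure tone, a straight observation line at y = 0, and a speed of sound of 340 m/s. The function names `local_spatial_frequency` and `spatial_frequency_range` are hypothetical.

```python
import math

C = 340.0  # assumed speed of sound in m/s


def local_spatial_frequency(x, f, src_x, src_dist, c=C):
    """Local spatial frequency (cycles/m) at position x on the line y = 0.

    Assumes a point source at (src_x, src_dist) emitting a tone of f Hz.
    The phase along the line is -2*pi*f*r(x)/c, so its derivative divided
    by 2*pi is -(f/c) * (x - src_x) / r(x).
    """
    r = math.hypot(x - src_x, src_dist)
    return -(f / c) * (x - src_x) / r


def spatial_frequency_range(f, src_dist, aperture=2.0, n=41):
    """Min and max local spatial frequency over a line segment of the
    given aperture (m), centred in front of the source."""
    xs = [-aperture / 2 + aperture * i / (n - 1) for i in range(n)]
    nus = [local_spatial_frequency(x, f, 0.0, src_dist) for x in xs]
    return min(nus), max(nus)


if __name__ == "__main__":
    for f in (500.0, 1000.0, 2000.0):
        near = spatial_frequency_range(f, src_dist=0.5)
        far = spatial_frequency_range(f, src_dist=50.0)
        print(f, "bound:", f / C, "near:", near, "far:", far)
```

Under these assumptions the spatial frequency never exceeds f/c in magnitude, so the occupied region widens linearly with temporal frequency, which is consistent with a triangular support. A nearby source spreads across much of that region, while a distant source collapses toward a single value, as expected for a plane wave.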
We present a new method for compressing spatio-temporal audio data for reproduction through Wave Field Synthesis. The data is obtained by sampling the sound field in space at equally spaced points on a straight line, and is transformed into the frequency domain using a spatio-temporal lapped transform. The two-dimensional spectrum is quantized using a psychoacoustic model derived for spatio-temporal frequencies, which estimates the maximum quantization noise power that each frequency can support while preserving transparency in the decoded signal. On the decoder side, the inverse lapped transform recovers the spatio-temporal data. Our experimental results show that the bitrate efficiency can be improved by increasing either the spatial sampling frequency or the spatial resolution of the lapped transform.
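The lapped transform at the core of such a coder can be illustrated with a one-dimensional MDCT, which is critically sampled and perfectly reconstructing when paired with a suitable window. The sketch below is an assumption for illustration, not the authors' codec: it uses a sine window and a naive O(N²) implementation, and the helper names (`sine_window`, `mdct`, `imdct`, `analyze`, `synthesize`) are hypothetical. A spatio-temporal version could apply the same transform separably along the time and space axes; quantization and the psychoacoustic model are omitted.

```python
import math


def sine_window(n_half):
    """Sine window of length 2*n_half (satisfies the Princen-Bradley condition)."""
    length = 2 * n_half
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(frame, window):
    """MDCT of one windowed frame: 2*n_half samples in, n_half coefficients out."""
    n_half = len(frame) // 2
    return [
        sum(window[n] * frame[n]
            * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n in range(2 * n_half))
        for k in range(n_half)
    ]


def imdct(coeffs, window):
    """Windowed inverse MDCT: n_half coefficients in, 2*n_half aliased samples out."""
    n_half = len(coeffs)
    return [
        window[n] * (2.0 / n_half)
        * sum(coeffs[k]
              * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
              for k in range(n_half))
        for n in range(2 * n_half)
    ]


def analyze(signal, n_half):
    """Split the signal into 50%-overlapping frames and transform each one.
    len(signal) must be a multiple of n_half; zeros are padded at both ends."""
    window = sine_window(n_half)
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    return [mdct(padded[s:s + 2 * n_half], window)
            for s in range(0, len(padded) - n_half, n_half)]


def synthesize(frames, n_half):
    """Overlap-add the inverse transforms; time-domain aliasing cancels."""
    window = sine_window(n_half)
    out = [0.0] * (n_half * (len(frames) + 1))
    for i, coeffs in enumerate(frames):
        for n, value in enumerate(imdct(coeffs, window)):
            out[i * n_half + n] += value
    return out[n_half:-n_half]  # drop the padding


if __name__ == "__main__":
    x = [math.sin(0.3 * n) + 0.1 * n for n in range(32)]
    y = synthesize(analyze(x, 8), 8)
    print(max(abs(a - b) for a, b in zip(x, y)))
```

Each hop of `n_half` new samples produces exactly `n_half` coefficients, so apart from the padding at the edges the representation has as many coefficients as input samples.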
We revisit the topics of near-field adaptive beamforming and source localization following an alternative approach based on a spatiotemporal spectral representation of the acoustic wave field. With the proposed method, the wave field is expressed as a separable combination of the signal and spatial components that characterize the various sources in the acoustic scene. This allows beamforming operations such as beam steering and sidelobe canceling to be translated into a two-dimensional (2D) sampling problem, where the sampling kernels are derived according to a parametric model representing the 2D spectral pattern generated in the presence of a source. Conversely, the spectral pattern can be estimated from an arbitrary input through the use of parametric spectral estimation techniques, providing a novel solution to the near-field source localization problem.
We address the problem of integrating directional analysis of sound into the filterbank of a spatial audio coder, with the purpose of processing and coding with some degree of independence the plane waves traveling in different directions. A plane wave represents an elementary waveform in the spatio-temporal analysis of the sound field, the same way a complex exponential is an elementary waveform in the time domain analysis of signals. Since a two-dimensional separable filterbank is not flexible enough for this purpose, we propose a non-separable approach based on the quincunx filterbank with diamond-shaped filters, cascaded with a base transform filterbank. This solution provides an invertible and critically sampled decomposition of the spatiotemporal spectra into subbands representing the different directions of wave propagation.
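The role of the plane wave as an elementary waveform can be checked numerically: a sinusoidal plane wave sampled along a line concentrates its two-dimensional spectrum in a single point (plus its conjugate), and the ratio of spatial to temporal frequency encodes the direction of propagation. The following sketch is only illustrative and does not implement the quincunx filterbank. It assumes a speed of sound of 340 m/s, measures the angle between the propagation direction and the line, and chooses parameters so that the wave falls exactly on DFT bins. The function names are hypothetical.

```python
import cmath
import math


def dft2(p):
    """Naive separable 2D DFT of p[ix][it] (rows: space, columns: time)."""
    n_x, n_t = len(p), len(p[0])
    # DFT along time for every spatial sample
    rows = [[sum(p[ix][it] * cmath.exp(-2j * math.pi * kt * it / n_t)
                 for it in range(n_t))
             for kt in range(n_t)]
            for ix in range(n_x)]
    # DFT along space for every temporal frequency bin
    return [[sum(rows[ix][kt] * cmath.exp(-2j * math.pi * kx * ix / n_x)
                 for ix in range(n_x))
             for kt in range(n_t)]
            for kx in range(n_x)]


def plane_wave(n_x, n_t, dx, fs, f, cos_theta, c=340.0):
    """Tone of f Hz travelling at angle theta to the line, sampled every
    dx metres in space and at fs Hz in time."""
    return [[math.cos(2 * math.pi * f * (it / fs - ix * dx * cos_theta / c))
             for it in range(n_t)]
            for ix in range(n_x)]


def estimate_cos_theta(p, dx, fs, c=340.0):
    """Estimate cos(theta) from the strongest bin at positive temporal frequency."""
    n_x, n_t = len(p), len(p[0])
    spec = dft2(p)
    kx, kt = max(((kx, kt) for kx in range(n_x) for kt in range(1, n_t // 2)),
                 key=lambda k: abs(spec[k[0]][k[1]]))
    kx_signed = kx - n_x if kx >= n_x // 2 else kx
    f = kt * fs / n_t                  # temporal frequency in Hz
    nu = -kx_signed / (n_x * dx)       # spatial frequency in cycles/m
    return nu * c / f


if __name__ == "__main__":
    p = plane_wave(n_x=16, n_t=32, dx=0.05, fs=8000.0, f=1000.0, cos_theta=0.85)
    print(estimate_cos_theta(p, dx=0.05, fs=8000.0))
```

Because every plane wave maps to a line through the origin of the spatio-temporal spectrum whose slope depends on the direction, partitioning the spectrum into wedge-like subbands amounts to separating directions of propagation, which is the motivation for non-separable filters given above.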