Máximo Cobos scite author profile

A Modified SRP-PHAT Functional for Robust Real-Time Sound Source Localization With Scalable Spatial Sampling

Cobos

Institute of Electrical and Electronics Engineers (IEEE)

The Steered Response Power - Phase Transform (SRP-PHAT) algorithm has been shown to be one of the most robust sound source localization approaches operating in noisy and reverberant environments. However, its practical implementation is usually based on a costly fine grid-search procedure, making the computational cost of the method a real issue. In this paper, we introduce an effective strategy that extends the conventional SRP-PHAT functional with the aim of considering the volume surrounding the discrete locations of the spatial grid. As a result, the modified functional performs a full exploration of the sampled space rather than computing the SRP at discrete spatial positions, increasing its robustness and allowing for a coarser spatial grid. To this end, the Generalized Cross-Correlation (GCC) function corresponding to each microphone pair must be properly accumulated according to the defined microphone setup. Experiments carried out under different acoustic conditions confirm the validity of the proposed approach.

Index Terms: sound source localization, SRP-PHAT, microphone array.
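As a point of reference for the abstract above, the following is a minimal sketch of the conventional SRP-PHAT grid search, in which the PHAT-weighted GCC of every microphone pair is sampled at the single lag corresponding to each grid point. It does not implement the modified functional, which accumulates GCC values over the volume around each point. The function names, the NumPy implementation, the free-field propagation model and the speed of sound constant are assumptions made for illustration.

```python
import itertools
import numpy as np

C = 343.0  # assumed speed of sound in m/s

def gcc_phat_full(x1, x2):
    """PHAT-weighted cross-correlation; index k holds lag k (modulo its length)."""
    n = len(x1) + len(x2)  # zero-pad so the correlation is not circular
    cross = np.fft.rfft(x2, n=n) * np.conj(np.fft.rfft(x1, n=n))
    cross /= np.abs(cross) + 1e-12  # keep only the phase (PHAT weighting)
    return np.fft.irfft(cross, n=n)

def srp_phat(signals, mics, grid, fs):
    """signals: (M, N) samples, mics: (M, D), grid: (G, D); returns SRP per grid point."""
    power = np.zeros(len(grid))
    # Distance from every grid point to every microphone, shape (G, M).
    dists = np.linalg.norm(grid[:, None, :] - mics[None, :, :], axis=2)
    for i, j in itertools.combinations(range(len(mics)), 2):
        cc = gcc_phat_full(signals[i], signals[j])
        # Delay of microphone j relative to microphone i, in whole samples.
        lags = np.rint((dists[:, j] - dists[:, i]) * fs / C).astype(int)
        power += cc[lags % len(cc)]  # negative lags wrap to the end of cc
    return power
```

The grid point with the largest returned value is taken as the location estimate. Because each pair contributes a single GCC sample per point, a coarse grid can step over the narrow GCC peak, which is the limitation the abstract sets out to address.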

A Survey of Sound Source Localization Methods in Wireless Acoustic Sensor Networks

Cobos

Antonacci

Alexandridis

et al. 2017

Wireless Communications and Mobile Computing

Wireless acoustic sensor networks (WASNs) are formed by a distributed group of acoustic-sensing devices featuring audio playing and recording capabilities. Current mobile computing platforms offer great possibilities for the design of audio-related applications involving acoustic-sensing nodes. In this context, acoustic source localization is one of the application domains that has attracted the most attention from the research community over the last decades. In general terms, acoustic sources can be localized by studying energy, temporal, and/or directional features of the incoming sound at different microphones and using a suitable model that relates those features to the spatial location of the source (or sources) of interest. This paper reviews common approaches for source localization in WASNs that focus on different types of acoustic features, namely, the energy of the incoming signals, their time of arrival (TOA) or time difference of arrival (TDOA), the direction of arrival (DOA), and the steered response power (SRP) resulting from combining multiple microphone signals. Additionally, we discuss methods aimed not only at localizing acoustic sources but also at locating the nodes themselves in the network. Finally, we discuss current challenges and frontiers in this field.

Low-Cost Alternatives for Urban Noise Nuisance Monitoring Using Wireless Sensor Networks

et al. 2015

Noise pollution caused by vehicular traffic is a common problem in urban environments that has been shown to affect people's health and children's cognition. In the last decade, several studies have been conducted to assess this noise by measuring the equivalent noise pressure level (called Leq) to acquire an accurate sound map using wireless networks with acoustic sensors. However, even with similar values of Leq, people can perceive the noise differently according to its frequency characteristics. Thus, indexes that can express people's feelings through subjective measures are required. In this paper we analyze the suitability of using the psycho-acoustic metrics given by Zwicker's model, instead of considering only Leq. The goal is to evaluate the hardware limitations of a low-cost wireless acoustic sensor network that is used to measure the annoyance, using two types of commercial off-the-shelf sensor nodes, Tmote-Invent nodes and Raspberry Pi platforms. Moreover, to calculate the parameters using these platforms, different simplifications to Zwicker's model based on the specific features of road traffic noise are proposed. To validate the different alternatives, the aforementioned nodes are tested in a traffic-congested area of Valencia City in a vertical and horizontal network deployment. Based on the results, it is observed that the Raspberry Pi platforms are a feasible low-cost alternative to increase the spatial-temporal resolution, while Tmote-Invent nodes do not confirm their suitability due to their limited memory and calibration issues.
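For readers unfamiliar with the baseline metric, the sketch below computes an unweighted equivalent continuous level from calibrated pressure samples, using the standard definition of 10 times the base-10 logarithm of the mean squared pressure over the squared reference pressure. It is not taken from the paper: the function name is hypothetical, no A-weighting is applied, and the input is assumed to be already calibrated in pascals. Zwicker's psycho-acoustic metrics are considerably more involved and are not shown.

```python
import math

P_REF = 20e-6  # standard reference sound pressure in air, in pascals

def leq_db(pressure_samples):
    """Unweighted equivalent continuous level (dB) of calibrated pressure samples."""
    mean_square = sum(p * p for p in pressure_samples) / len(pressure_samples)
    return 10.0 * math.log10(mean_square / P_REF ** 2)

# Usage: one second of a 1 kHz tone with an RMS pressure of 1 Pa.
fs = 48000
tone = [math.sqrt(2.0) * math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(fs)]
level = leq_db(tone)  # close to 94 dB, the usual calibrator level
```

Two signals with very different spectra can produce the same value here, which is the shortcoming of Leq that the abstract points out.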

A steered response power iterative method for high-accuracy acoustic source localization

Martí

Cobos

Lopez

et al. 2013

Source localization using the steered response power (SRP) usually requires a costly grid-search procedure. To address this issue, a modified SRP algorithm was recently introduced, providing improved robustness when using coarser spatial grids. In this letter, an iterative method based on the modified SRP is presented. A coarse spatial grid is initially evaluated with the modified SRP, selecting the point with the highest accumulated value. Then, its corresponding volume is iteratively decomposed by using a finer spatial grid. Experiments have shown that this method provides almost the same accuracy as the fine-grid search with a substantial reduction of functional evaluations.
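The coarse-to-fine procedure described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the objective is any callable that scores a candidate position (in the letter, this role is played by the modified SRP), and the function name, the number of points per axis and the number of iterations are assumptions.

```python
import numpy as np

def iterative_grid_search(objective, lo, hi, points_per_axis=5, iterations=4):
    """Repeatedly evaluate a coarse grid and zoom into the cell around the best point."""
    lo, hi = np.asarray(lo, dtype=float), np.asarray(hi, dtype=float)
    evaluations = 0
    best = None
    for _ in range(iterations):
        axes = [np.linspace(a, b, points_per_axis) for a, b in zip(lo, hi)]
        grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, len(lo))
        values = np.array([objective(p) for p in grid])
        evaluations += len(grid)
        best = grid[int(np.argmax(values))]
        # Next search region: the volume (one grid cell) centred on the best point.
        half_cell = (hi - lo) / (points_per_axis - 1) / 2.0
        lo, hi = best - half_cell, best + half_cell
    return best, evaluations

# Usage with a stand-in objective that peaks at a known position.
target = np.array([1.3, 2.7])
estimate, count = iterative_grid_search(
    lambda p: -np.sum((p - target) ** 2), lo=[0.0, 0.0], hi=[4.0, 4.0]
)
```

With these settings the final grid spacing is 1/64 m after 100 evaluations, whereas a uniform grid with that spacing over the same 4 m by 4 m area would need 257 × 257 = 66049 evaluations. The zoom step is only safe if the coarse evaluation reliably picks the cell containing the source, which is why the letter pairs it with the modified SRP rather than the conventional one.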

Nonnegative signal factorization with learnt instrument models for sound source separation in close-microphone recordings

Carabias-Orti

Cobos

Rodriguez-Serrano

2013

EURASIP J. Adv. Signal Process.

Close-microphone techniques are extensively employed in many live music recordings, allowing for interference rejection and reducing the amount of reverberation in the resulting instrument tracks. However, despite the use of directional microphones, the recorded tracks are not completely free from source interference, a problem commonly known as microphone leakage. While source separation methods are potentially a solution to this problem, few approaches take into account the large amount of prior information available in this scenario. In fact, besides the special properties of close-microphone tracks, knowledge of the number and type of instruments making up the mixture can also be successfully exploited for improved separation performance. In this paper, a nonnegative matrix factorization (NMF) method making use of all the above information is proposed. To this end, a set of instrument models is learnt from a training database and incorporated into a multichannel extension of the NMF algorithm. Several options to initialize the algorithm are suggested, exploring their performance in multiple music tracks and comparing the results to other state-of-the-art approaches.
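The idea of reusing pre-learnt instrument models can be illustrated with a much simpler setting than the one in the paper: a single-channel NMF in which the basis matrix is held fixed and only the gains are estimated. The sketch below uses the standard multiplicative update for the Euclidean cost. The function name, the choice of cost and the single-channel restriction are all assumptions, and the paper's multichannel extension and initialization strategies are not reproduced.

```python
import numpy as np

def estimate_gains(V, W, iterations=200, eps=1e-12, seed=0):
    """Fit V ~ W @ H with W fixed (pre-learnt spectral models), H nonnegative."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps  # random nonnegative start
    for _ in range(iterations):
        # Multiplicative update for the Euclidean cost; keeps H nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H
```

Here `V` stands for a nonnegative magnitude spectrogram (frequency by time) and the columns of `W` for spectral patterns learnt beforehand, so each row of `H` describes how strongly one pattern is active over time.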

A Sparsity-Based Approach to 3D Binaural Sound Synthesis Using Time-Frequency Array Processing

Cobos

Lopez

Spors

2010

EURASIP J. Adv. Signal Process.

Localization of sounds in physical space plays a very important role in multiple audio-related disciplines, such as music, telecommunications, and audiovisual productions. Binaural recording is the most commonly used method to provide an immersive sound experience by means of headphone reproduction. However, it requires a very specific recording setup using high-fidelity microphones mounted in a dummy head. In this paper, we present a novel processing framework for binaural sound recording and reproduction that avoids the use of dummy heads and is especially suitable for immersive teleconferencing applications. The method is based on a time-frequency analysis of the spatial properties of the sound picked up by a simple tetrahedral microphone array, assuming source sparseness. The experiments carried out using simulations and a real-time prototype confirm the validity of the proposed approach.

Simultaneous Ranging and Self-Positioning in Unsynchronized Wireless Acoustic Sensor Networks

Cobos

Pérez-Solano

Belmonte

et al. 2016

IEEE Trans. Signal Process.

Automatic ranging and self-positioning are very desirable properties in wireless acoustic sensor networks (WASNs) where nodes have at least one microphone and one loudspeaker. However, due to environmental noise, interference and multipath effects, audio-based ranging is a challenging task. This paper presents a fast ranging and positioning strategy that makes use of the correlation properties of pseudo-noise (PN) sequences to simultaneously estimate relative times of arrival (TOAs) from multiple acoustic nodes. To this end, a proper test signal design adapted to the acoustic node transducers is proposed. In addition, a novel self-interference reduction method and a peak matching algorithm are introduced, allowing for increased accuracy in indoor environments. Synchronization issues are removed by following a BeepBeep strategy, providing range estimates that are converted to absolute node positions by means of multidimensional scaling (MDS). The proposed approach is evaluated with both simulated and real experiments under different acoustical conditions. The results using a real network of smartphones and laptops confirm the validity of the proposed approach, reaching an average ranging accuracy below 1 centimeter.
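The last step mentioned above, turning pairwise range estimates into node positions with MDS, can be sketched with classical (Torgerson) MDS. This is an illustrative assumption about the variant used, not the paper's code, and the function name is hypothetical. Note that distances alone fix the geometry only up to rotation, reflection and translation, so extra information is needed to anchor the result in an absolute frame.

```python
import numpy as np

def classical_mds(dist, dim=2):
    """Recover coordinates (up to a rigid transform) from a pairwise distance matrix."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering  # double-centred Gram matrix
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]     # indices of the largest eigenvalues
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))

# Usage: five nodes in a room, with noise-free ranges between every pair.
nodes = np.array([[0.0, 0.0], [3.0, 0.5], [2.5, 4.0], [0.5, 3.0], [1.5, 1.5]])
ranges = np.linalg.norm(nodes[:, None, :] - nodes[None, :, :], axis=2)
estimated = classical_mds(ranges)
```

With noisy ranges the Gram matrix is no longer exactly rank two, and keeping only the largest eigenvalues acts as a least-squares style fit.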

Frequency-Sliding Generalized Cross-Correlation: A Sub-Band Time Delay Estimation Approach

Cobos

Antonacci

Comanducci

et al. 2020

IEEE/ACM Trans. Audio Speech Lang. Process.

The generalized cross-correlation (GCC) is regarded as the most popular approach for estimating the time difference of arrival (TDOA) between the signals received at two sensors. Time delay estimates are obtained by maximizing the GCC output, where the direct-path delay is usually observed as a prominent peak. Moreover, GCCs also play an important role in steered response power (SRP) localization algorithms, where the SRP functional can be written as an accumulation of the GCCs computed from multiple sensor pairs. Unfortunately, the accuracy of TDOA estimates is affected by multiple factors, including noise, reverberation and signal bandwidth. In this paper, a sub-band approach for time delay estimation aimed at improving the performance of the conventional GCC is presented. The proposed method is based on the extraction of multiple GCCs corresponding to different frequency bands of the cross-power spectrum phase in a sliding-window fashion. The major contributions of this paper include: 1) a sub-band GCC representation of the cross-power spectrum phase that, despite having a reduced temporal resolution, provides a more suitable representation for estimating the true TDOA; 2) a demonstration that this matrix representation is rank one in the ideal noiseless case, a property that is exploited in more adverse scenarios to obtain a more robust and accurate GCC; 3) a set of low-rank approximation alternatives for processing the sub-band GCC matrix, leading to better TDOA estimates and source localization performance. An extensive set of experiments is presented to demonstrate the validity of the proposed approach.

Index Terms: time delay estimation, GCC, SVD, weighted SVD, sub-band processing, SRP-PHAT.
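For context, the conventional full-band estimator that the abstract takes as its starting point can be sketched as below: the cross-power spectrum is reduced to its phase (PHAT weighting), transformed back to the lag domain, and the TDOA is read off the largest peak. This is a baseline sketch only. The function name and parameters are assumptions, and the sub-band sliding-window representation and low-rank processing proposed in the paper are not implemented.

```python
import numpy as np

def gcc_phat(x1, x2, fs, max_tau=None):
    """Return the delay of x2 relative to x1 in seconds, plus the GCC over lags."""
    n = len(x1) + len(x2)  # zero-pad to avoid circular wrap-around
    cross = np.fft.rfft(x2, n=n) * np.conj(np.fft.rfft(x1, n=n))
    cross /= np.abs(cross) + 1e-12  # PHAT: discard magnitude, keep phase
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # Reorder so that index 0 is lag -max_shift and index max_shift is lag 0.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs, cc

# Usage: the second channel is the first one delayed by 7 samples.
rng = np.random.default_rng(0)
fs = 16000
source = rng.standard_normal(1024)
delayed = np.concatenate((np.zeros(7), source[:-7]))
tau, _ = gcc_phat(source, delayed, fs)
```

A positive result means the second signal lags the first. The resolution of this estimator is one sample, so interpolation around the peak is commonly added when finer delays matter.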

A Modified SRP-PHAT Functional for Robust Real-Time Sound Source Localization With Scalable Spatial Sampling
