Abstract: Acoustic maps created from the signals acquired by distributed networks of microphones allow identification of the position and orientation of an active talker in an enclosure. In adverse conditions of high background noise, high reverberation, or unavailable direct paths to the microphones, localization may fail. This paper proposes a novel approach to talker localization and head-orientation estimation based on the classification of Global Coherence Field (GCF) or Oriented GCF maps. Preliminary expe…
“…Six areas were randomly chosen inside the room, avoiding border areas. (Areas 1, 3, 5, 11, 22, 23, 29, 37, 43, 45, 47, and 49 are border areas in Fig. 7.)…”
Section: Best Array Selection By Individual Criteria
confidence: 98%
“…Using the same standard metrics of the SRP-PHAT orientation method, i.e., the average error, the ability of the system to correctly classify the source orientation within eight classes separated by 45°, and assuming a correct classification error of ±1 adjacent class, 51°, 30%, and 68% were obtained, respectively. Brutti et al.19,23 extended the GCF position localization method to consider the source orientation. The new method was named the oriented global coherence field.…”
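The evaluation scheme quoted above (eight orientation classes 45° apart, with an estimate counted as correct if it falls in the true class or an adjacent one) can be sketched as follows. This is a minimal illustration of the metric, not the original authors' code; the function names are hypothetical.

```python
def orientation_class(angle_deg, n_classes=8):
    """Map a head-orientation angle (degrees) to one of n_classes sectors
    (45 degrees wide for the eight-class case described in the text)."""
    width = 360.0 / n_classes
    # Shift by half a sector so class 0 is centred on 0 degrees.
    return int(((angle_deg + width / 2) % 360.0) // width)

def correct_within_adjacent(true_deg, est_deg, n_classes=8):
    """Count an estimate as correct if its class equals the true class
    or is one of the two circularly adjacent classes (the +/-1 tolerance)."""
    t = orientation_class(true_deg, n_classes)
    e = orientation_class(est_deg, n_classes)
    diff = min((t - e) % n_classes, (e - t) % n_classes)  # circular class distance
    return diff <= 1
```

For example, a true orientation of 0° and an estimate of 40° land in classes 0 and 1 and would count as correct under the ±1-class criterion, while an estimate of 100° (class 2) would not.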
Section: Introduction
confidence: 99%
“…The source orientation [19][20][21][22][23] also plays an important role in acoustic localization because a directional source does not radiate uniformly in all directions, and the quality of signals recorded by distant microphones is affected not only by environmental noise and reverberation but also by the speaker's relative orientation. 21 Sachar et al 20 proposed the energy method, where differences in the source radiation pattern can be detected and used to predict the source orientation.…”
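The idea behind the energy method quoted above (a directional talker radiates more energy frontally, so differences in the levels received by distributed microphones carry orientation information) can be illustrated with a toy estimator. This is a simplified sketch, not Sachar et al.'s actual algorithm; it ignores distance attenuation and room effects, and all names are hypothetical.

```python
import numpy as np

def estimate_orientation_energy(source_pos, mic_pos, energies):
    """Toy energy-based orientation cue: the energy-weighted mean of the unit
    vectors from the source towards the microphones points roughly along the
    talker's facing direction, since frontal microphones receive more energy."""
    dirs = mic_pos - source_pos                       # vectors source -> microphones
    dirs = dirs / np.linalg.norm(dirs, axis=1, keepdims=True)
    w = np.asarray(energies, dtype=float)
    v = (w[:, None] * dirs).sum(axis=0)               # energy-weighted direction
    return np.degrees(np.arctan2(v[1], v[0])) % 360.0  # azimuth in [0, 360)

# Four microphones around a source at the origin; the +x microphone
# receives the most energy, so the estimated facing direction is ~0 degrees.
source = np.array([0.0, 0.0])
mics = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
theta = estimate_orientation_energy(source, mics, [4.0, 1.0, 1.0, 1.0])
```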
A method that automatically provides the position and orientation of a directional acoustic source in an enclosed environment is proposed. In this method, different combinations of parameters estimated from the received signals, together with the microphone positions of each array, are used as inputs to an artificial neural network (ANN). The estimated parameters comprise time delay estimates (TDEs), source position estimates, distance estimates, and energy features. The outputs of the ANN are the source orientation (one of four possible orientations separated by 90 degrees), the best array (defined as the array nearest to the source), or the source position in two-dimensional/three-dimensional (2D/3D) space. This paper studies the position and orientation estimation performance of the ANN for different input/output combinations and different numbers of hidden units. The best combination of parameters (TDEs and microphone positions) yields a 21.8% reduction in average position error relative to the following baselines and a correct orientation ratio greater than 99%. The position localization baselines are a time-delay-of-arrival based method with an average position error of 34.1 cm and the steered response power with phase transform (SRP-PHAT) method with an average position error of 29.8 cm in 3D space.
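The input/output arrangement described in this abstract (TDEs concatenated with microphone coordinates fed to a feed-forward ANN that regresses a 3D source position) can be sketched as a minimal forward pass. The dimensions, weights, and hidden-layer size below are placeholders, not the paper's trained network.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(x, W1, b1, W2, b2):
    """One hidden layer of tanh units, the classic feed-forward ANN shape."""
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2

# Hypothetical sizes: six TDEs plus one reference microphone position (x, y, z)
# concatenated into a single input vector; the output is a 3D source position.
n_in, n_hidden, n_out = 6 + 3, 20, 3
W1 = rng.standard_normal((n_in, n_hidden)) * 0.1
b1 = np.zeros(n_hidden)
W2 = rng.standard_normal((n_hidden, n_out)) * 0.1
b2 = np.zeros(n_out)

tdes = rng.standard_normal(6)           # time-delay estimates (placeholder values)
mic_xyz = np.array([1.0, 2.0, 1.5])     # microphone position in metres (placeholder)
x = np.concatenate([tdes, mic_xyz])
pos_estimate = mlp_forward(x, W1, b1, W2, b2)  # untrained net: a 3D vector
```

For the four-way orientation output described in the abstract, the same network shape would instead end in four units followed by an argmax over orientation classes.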
“…Some researchers have focused on using distributed arrays in SSL and on incorporating spatially related factors into the localization algorithm [29,30,31]. Distributed networks of microphones have also been used to create acoustic maps, based on the classification of a global coherence field or oriented global coherence field, to identify the position and orientation of a speaker [32]. The near- and far-field arrays of such a distributed network are utilized in the European Commission integrated project CHIL, "Computers in the Human Interaction Loop", to solve the problems of speaker localization and tracking, speech activity detection, and distant-talking automatic speech recognition [33].…”
Sound source localization with microphone arrays has received considerable attention as a means for the automated tracking of individuals in an enclosed space and as a necessary component of any general-purpose speech capture and automated camera-pointing system. A novel method, computationally efficient compared to traditional source localization techniques, is proposed and both theoretically and experimentally investigated in this research.
“…Although more computationally expensive, it was shown that they provide reliable results. In particular, when even the maximization of the "global coherence" fails, a suitable analysis and classification of the spatial map yields useful information to localize a speaker and determine his/her head orientation [8].…”
An interface for distant-talking control of home devices requires the ability to identify the positions of multiple users. Acoustic maps, based either on the Global Coherence Field (GCF) or the Oriented Global Coherence Field (OGCF), have already been exploited successfully to determine the position and head orientation of a single speaker. This paper proposes a new method using acoustic maps to deal with the case of two simultaneous speakers. The method is based on a two-step analysis of a coherence map: first the dominant speaker is localized; then the map is modified by compensating for the effects due to the first speaker, and the position of the second speaker is detected. Simulations were carried out to show how an appropriate analysis of OGCF and GCF maps allows one to localize both speakers. Experiments proved the effectiveness of the proposed solution in a linear microphone array setup.
Index Terms—microphone array, speaker localization, multiple speakers, global coherence field.
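The two-step analysis this abstract describes (localize the dominant peak of the coherence map, compensate for its contribution, then localize the second speaker) can be sketched on a synthetic map. The Gaussian notch used for compensation, and its width, are simplifying assumptions for illustration, not the paper's actual compensation model.

```python
import numpy as np

def localize_two_sources(coherence_map, xs, ys, notch_sigma=0.3):
    """Two-step peak analysis: pick the dominant peak, attenuate the map
    around it with a Gaussian notch (a stand-in for compensating the first
    speaker's contribution), then pick the second peak."""
    i1 = np.unravel_index(np.argmax(coherence_map), coherence_map.shape)
    p1 = (xs[i1[1]], ys[i1[0]])                    # (x, y) of dominant speaker
    X, Y = np.meshgrid(xs, ys)
    notch = 1.0 - np.exp(-((X - p1[0])**2 + (Y - p1[1])**2) / (2 * notch_sigma**2))
    compensated = coherence_map * notch            # suppress the first peak
    i2 = np.unravel_index(np.argmax(compensated), compensated.shape)
    p2 = (xs[i2[1]], ys[i2[0]])                    # (x, y) of second speaker
    return p1, p2

# Synthetic coherence map with two peaks: a dominant one at (1, 1) m
# and a weaker one at (3, 2) m.
xs = np.linspace(0.0, 4.0, 81)
ys = np.linspace(0.0, 3.0, 61)
X, Y = np.meshgrid(xs, ys)
gmap = np.exp(-((X - 1)**2 + (Y - 1)**2) / 0.05) \
     + 0.7 * np.exp(-((X - 3)**2 + (Y - 2)**2) / 0.05)
p1, p2 = localize_two_sources(gmap, xs, ys)
```

On this synthetic map the first step recovers the dominant peak near (1, 1) and the second step, after notching it out, recovers the weaker peak near (3, 2).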