Fourth IEEE Workshop on Sensor Array and Multichannel Processing, 2006.
DOI: 10.1109/sam.2006.1706216

Real-Time Implementation of a Distributed Voice Activity Detector

Cited by 13 publications (11 citation statements)
References 5 publications
“…To decide whether to use (24) or (25), the condition number of the matrix A_k does not necessarily have to be known. In principle, it is always better to replace the K − 1 auxiliary channels in x_k as in formula (24), where a different q should be chosen for every p. Indeed, since microphones of different nodes are typically far apart from each other, better conditioned steering matrices are then obtained.…”
Section: Robust DANSE (R-DANSE)
Citation type: mentioning
confidence: 99%
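As an illustrative aside (not taken from the cited works), the conditioning argument in the statement above can be checked numerically: a toy far-field steering matrix built from two widely spaced microphones, as when channels come from different nodes, is markedly better conditioned than one built from two closely spaced microphones of the same node. All geometry, frequency and angle values below are arbitrary assumptions.

```python
# Toy sketch, not code from the cited works: compare the condition number of a
# narrowband far-field steering matrix for closely vs. widely spaced microphones.
import numpy as np

def steering_matrix(mic_positions_m, source_angles_deg, freq_hz=1000.0, c=343.0):
    """One column per source: exp(-j*k*x*sin(theta)) for a linear array on the x-axis."""
    k = 2 * np.pi * freq_hz / c                       # wavenumber
    mics = np.asarray(mic_positions_m)[:, None]       # shape (num_mics, 1)
    angles = np.deg2rad(np.asarray(source_angles_deg))
    return np.exp(-1j * k * mics * np.sin(angles))    # shape (num_mics, num_sources)

sources = [0.0, 30.0]                                 # two assumed source directions (degrees)
A_close = steering_matrix([0.0, 0.02], sources)       # mics 2 cm apart (same node)
A_far = steering_matrix([0.0, 0.50], sources)         # mics 50 cm apart (different nodes)

print("cond(A), closely spaced mics:", np.linalg.cond(A_close))  # large: columns nearly collinear
print("cond(A), widely spaced mics: ", np.linalg.cond(A_far))    # much smaller: better conditioned
```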
“…For example, formula (24) defines the arc (w_kk^(p), w_qq^(l)). A vertex v that has no departing arc is referred to as a direct estimation filter (DEF), that is, the signal to be estimated is the desired speech component in one of the node's own microphone signals, as in formula (25). To illustrate this, a possible graph is shown in Figure 5 for DANSE_2 applied to the scenario described in Section 3, where the hearing aid users are now listening to two speakers, that is, speakers B and C. Since the microphone signals of node 1 have a low SNR, the two desired signals in x_1 that are used in the computation of W_11 are replaced by the filtered desired speech component in the received signals from higher-SNR nodes 2 and 4, that is, w_22^(1)H x_2 and w_44^(1)H x_4, respectively.…”
Section: Convergence of R-DANSE
Citation type: mentioning
confidence: 99%
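The graph described in the statement above can be pictured as a plain dependency structure. The snippet below is a hypothetical sketch (not the authors' code): each vertex is a local filter w_kk^(p), an arc points to the filter whose output replaces that desired signal, and vertices without a departing arc are direct estimation filters (DEFs).

```python
# Hypothetical sketch of the dependency graph from the quoted example: node 1
# (low SNR) replaces its two desired signals with filtered signals received from
# nodes 2 and 4; vertices with no departing arc are direct estimation filters.
arcs = {
    # vertex (node filter, desired-signal index) -> vertex it depends on, or None for a DEF
    ("w11", 1): ("w22", 1),
    ("w11", 2): ("w44", 1),
    ("w22", 1): None,
    ("w44", 1): None,
}

defs = [vertex for vertex, target in arcs.items() if target is None]
print("Direct estimation filters (no departing arc):", defs)
```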
“…Each device could in theory run its own noise attenuation, source separation, voice activity detection, keyword spotting and speech recognition algorithms and more, but those algorithms would then be redundant, since multiple devices run the same tasks. For improved efficiency and accuracy, we can instead use, for example, a distributed voice activity detector among local devices [26], distributed beamforming [16] or distributed speech recognition [27].…”
Section: Local Collaboration
Citation type: mentioning
confidence: 99%
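To make the collaboration idea concrete, here is a minimal, hedged sketch of a distributed VAD in the spirit the statement describes: each device computes a cheap one-bit local decision and only those bits are shared and fused (here by majority vote), rather than every device running a full detector on its own. The energy threshold, frame length and fusion rule are assumptions for illustration, not the algorithm of the cited detector.

```python
# Minimal sketch of decision-level fusion for a distributed VAD (assumed design,
# not the cited implementation): energy-based local decisions, majority-vote fusion.
import numpy as np

def local_vad(frame, energy_threshold=1e-3):
    """One-bit local decision: mean frame energy above a fixed threshold."""
    return int(np.mean(frame ** 2) > energy_threshold)

def fuse(decisions):
    """Majority vote over the one-bit decisions collected from all devices."""
    return sum(decisions) > len(decisions) / 2

rng = np.random.default_rng(0)
speech = 0.3 * rng.standard_normal(160)    # toy high-energy ("speech") frame
silence = 0.01 * rng.standard_normal(160)  # toy low-energy ("noise only") frame
decisions = [local_vad(f) for f in (speech, speech, silence, speech)]
print("local decisions:", decisions, "-> fused decision:", fuse(decisions))
```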
“…We have found a clear relation between the log-likelihood for a certain Û and the localization error. Thus, we propose using expression (8) as the fitness function. Figure 4 shows the relation between the log-likelihood and the pairwise node distance error for all possible values of Û in a network with four nodes.…”
Section: Uncertainty Solution
Citation type: mentioning
confidence: 99%
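Expression (8) is not reproduced in this excerpt, so the sketch below substitutes a plain Gaussian log-likelihood of the measured pairwise node distances as the fitness function and selects the candidate geometry that maximizes it. The candidate set, noise level and Gaussian model are assumptions made only to illustrate the "log-likelihood as fitness" idea.

```python
# Hedged sketch: rank candidate node geometries by a Gaussian log-likelihood of the
# measured pairwise distances (a stand-in for the paper's expression (8)).
import numpy as np
from scipy.spatial.distance import pdist

def log_likelihood(candidate_positions, measured_dists, sigma=0.05):
    """Gaussian log-likelihood (up to a constant) of the measured pairwise distances."""
    predicted = pdist(candidate_positions)
    return -0.5 * np.sum(((measured_dists - predicted) / sigma) ** 2)

true_positions = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # 4 nodes
measured = pdist(true_positions) + 0.02 * np.random.default_rng(1).standard_normal(6)

candidates = [
    true_positions,
    np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]),  # one node misplaced
    np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]),  # wrong overall scale
]
best = max(candidates, key=lambda c: log_likelihood(c, measured))
print("highest-likelihood candidate geometry:\n", best)
```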
“…The relative location of the nodes provides us with sufficient spatial information to implement a wireless microphone array (WMA). WMAs have many potential applications in distributed audio processing, such as speech enhancement [4], blind source separation and echo cancellation [5], speaker localization and tracking [6,7], and voice activity detection [8].…”
Section: Introduction
Citation type: mentioning
confidence: 99%