Multimodal Technologies for Perception of Humans
DOI: 10.1007/978-3-540-69568-4_6

UPC Audio, Video and Multimodal Person Tracking Systems in the Clear Evaluation Campaign

Cited by 9 publications (9 citation statements)
References 7 publications
“…In indoor scenarios, active speaker localization and tracking in meeting rooms can be done by audiovisual fusion, where audio localization uses beamforming and the visual target is tracked with a Kalman filter [7], or by using TDOA with a particle filter [8]. Tracking using the audio modality is strongly affected by reverberation.…”
Section: Related Work (mentioning)
confidence: 99%
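
The Kalman-filter tracking mentioned in the statement above can be illustrated with a minimal sketch. This is not the method of the cited systems: it assumes a single coordinate, a constant-velocity motion model, and position-only measurements. The class name `ConstantVelocityKalman1D`, the helper `track`, and the noise values `q` and `r` are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ConstantVelocityKalman1D:
    """Hypothetical 1-D constant-velocity Kalman filter (illustration only)."""
    p: float = 0.0     # position estimate
    v: float = 0.0     # velocity estimate
    p00: float = 10.0  # 2x2 state covariance, stored entry by entry
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 10.0
    q: float = 1e-3    # process noise added to each diagonal entry (assumed)
    r: float = 0.1     # measurement noise variance (assumed)

    def predict(self, dt: float) -> None:
        """Propagate state and covariance with F = [[1, dt], [0, 1]]."""
        self.p += self.v * dt
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11

    def update(self, z: float) -> None:
        """Correct the state with a position measurement z (H = [1, 0])."""
        s = self.p00 + self.r          # innovation variance
        k0 = self.p00 / s              # Kalman gain for position
        k1 = self.p10 / s              # Kalman gain for velocity
        innovation = z - self.p
        self.p += k0 * innovation
        self.v += k1 * innovation
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11


def track(measurements, dt=1.0):
    """Run predict/update over a sequence of position measurements."""
    kf = ConstantVelocityKalman1D()
    for z in measurements:
        kf.predict(dt)
        kf.update(z)
    return kf.p, kf.v
```

Feeding `track` a sequence of positions that advance by one unit per step should drive the velocity estimate toward 1. A real tracker would use a 2D or 3D state and measurement noise tuned to the sensor.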
“…The problem of audiovisual tracking involves the estimation of the arrival angle of the audio signal, video detection, filtering and smoothing of the two modalities, fusion and finally joint state estimation. Let the target state be defined as y(t) = (x, y, w, h, H), where (x, y) is the center of the ellipse approximating the object shape, (w, h) are the width and height of the bounding box and H is the color histogram of the object.…”
[Table fragment interleaved in the extracted quote, entries as extracted: [36], camera and 2 microphones, TDNN, surveillance; [1], [11], [15], [32], PF, surveillance and teleconferencing; [12], [13], [37], KF, DKF, smart rooms; [38], multiple cameras and microphone arrays, LDA, smart rooms; [9], [30], [31], [39], PF, meeting rooms.]
Section: Audiovisual Tracking (mentioning)
confidence: 99%
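
The target state y(t) = (x, y, w, h, H) defined in the statement above maps naturally onto a small record type. The sketch below is an illustration under assumptions: the names `TargetState`, `normalize`, and `bhattacharyya_coefficient` are hypothetical, and the Bhattacharyya coefficient is shown only as one common way to compare color histograms, not as the measure used by the quoted paper.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class TargetState:
    """Hypothetical container for the state y(t) = (x, y, w, h, H)."""
    x: float           # center of the ellipse, horizontal coordinate
    y: float           # center of the ellipse, vertical coordinate
    w: float           # bounding-box width
    h: float           # bounding-box height
    hist: List[float]  # normalized color histogram H

    def bounding_box(self) -> Tuple[float, float, float, float]:
        """Return (left, top, right, bottom) of the box centered at (x, y)."""
        return (self.x - self.w / 2, self.y - self.h / 2,
                self.x + self.w / 2, self.y + self.h / 2)


def normalize(counts: Sequence[float]) -> List[float]:
    """Scale raw bin counts so that they sum to 1."""
    total = float(sum(counts))
    return [c / total for c in counts]


def bhattacharyya_coefficient(p: Sequence[float], q: Sequence[float]) -> float:
    """Similarity of two normalized histograms: 1 if identical, 0 if disjoint."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))
```

A tracker could score a candidate region by comparing its histogram with the stored `hist` of the target, with higher coefficients indicating a better appearance match.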
“…Concerning the visual tracking algorithms, two main approaches have been followed by the various developed 3D tracking systems: First, a model-based approach where a 3D model of the tracked object is maintained by rendering it onto the camera views and searching for supporting evidence in each view to update its parameters [2,6,11,31,33,40]. Second, a data-driven approach where 2D trackers operate independently on the separate camera views and the 2D tracks belonging to a same target are collected into a 3D track [28,52,56].…”
Section: Person Tracking (mentioning)
confidence: 99%
“…On the acoustic side, approaches can be roughly categorized as follows: Approaches which rely on the computation of a more or less coarse global coherence field (GCF, or SRP-PHAT), on which the tracking of correlation peaks is performed [2,9]; particle filter approaches, which approximate the belief on speaker positions by a set of samples and measure the agreement of the observed acoustic signals (their correlation value) given each sample position hypothesis [39]; approaches that feed computed time delays of arrival (TDOAs) between microphone pairs directly as observations to a Kalman or other probabilistic tracker [24,30,50]. The best performing system was based on a Joint Probabilistic Data Association Filter (JPDAF), which keeps track of a number of sound sources, including noise sources, resolving data associations and position updates jointly for all tracks.…”
Section: Person Tracking (mentioning)
confidence: 99%
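
The TDOA observations mentioned in the statement above are delays between the signals of a microphone pair. As a rough sketch, the delay can be estimated from the peak of a cross-correlation. The code below uses plain time-domain cross-correlation rather than the PHAT-weighted correlation underlying SRP-PHAT, and the function names `cross_correlation_lag` and `lag_to_tdoa` are hypothetical.

```python
def cross_correlation_lag(ref, sig, max_lag):
    """Return the lag (in samples) at which sig best matches ref.

    A positive result means sig is a delayed copy of ref.
    Plain (unweighted) cross-correlation, searched over [-max_lag, max_lag].
    """
    best_lag = 0
    best_score = float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, r in enumerate(ref):
            m = n + lag
            if 0 <= m < len(sig):
                score += r * sig[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def lag_to_tdoa(lag_samples, sample_rate_hz):
    """Convert a lag in samples to a time delay in seconds."""
    return lag_samples / sample_rate_hz
```

The resulting delay for each microphone pair could then be passed as an observation to a Kalman or particle filter, as the quoted passage describes. In reverberant rooms the unweighted correlation peak tends to be less reliable, which is one reason weighted variants are used in practice.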