Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing 2017
DOI: 10.1145/3095713.3095732

Towards large scale multimedia indexing

Abstract: The rapid growth of multimedia databases and the human interest in their peers make indices representing the location and identity of people in audio-visual documents essential for searching archives. Person discovery in the absence of prior identity knowledge requires accurate association of audio-visual cues and detected names. To this end, we present 3 different strategies to approach this problem: clustering-based naming, verification-based naming, and graph-based naming. Each of these strategies utilizes …

Citations: cited by 7 publications (5 citation statements)
References: 19 publications (23 reference statements)
“…SpeakerID is an important task for automatic organization of dialogue contents such as TV programs, radio podcasts, and online meetings and has gained significant research efforts [4,5,6,7,8]. Most previous work on SpeakerID approaches the task via multi-model setting, where the input to the models involve both videos/images and transcripts of the dialogues.…”
Section: Related Work (mentioning)
confidence: 99%
“…This process is crucial for enhancing the accessibility and searchability of multimedia content, enabling users to find segments featuring specific speakers. As a result, the development of effective SpeakerID systems has attracted significant research efforts, as evidenced by a body of work [4,5,6,7,8], striving to overcome the challenges associated with this complex task.…”
Section: Introduction (mentioning)
confidence: 99%
“…Other face recognition methods (even those formulated in a Bayesian framework) [33,34,9,28,18], often limit themselves to point estimates of parameters and predictions, occasionally including ad-hoc confidence metrics. A distinct advantage of our approach is that it is probabilistic end-to-end, and thus naturally provides predictions with principled, quantifiable uncertainties.…”
Section: Related Work (mentioning)
confidence: 99%
“…Typical application domains involve the annotation of personal photo galleries [33,34,3,13], multimedia (e.g. TV) [28,18] or security/surveillance [17]. Our work focuses on egocentric human-like face recognition, a setting which seems largely unexplored, as most of the work using first-person footage appears to revolve around other tasks like object and activity recognition, face detection, and tracking [4].…”
Section: Related Work (mentioning)
confidence: 99%