2004
DOI: 10.1007/978-3-540-27814-6_34

Finding Person X: Correlating Names with Visual Appearances

Abstract: People as news subjects carry rich semantics in broadcast news video and therefore finding a named person in the video is a major challenge for video retrieval. This task can be achieved by exploiting the multi-modal information in videos, including transcript, video structure, and visual features. We propose a comprehensive approach for finding specific persons in broadcast news videos by exploring various clues such as names occurred in the transcript, face information, anchor scenes, and most impo…

Cited by 49 publications (45 citation statements)
References 6 publications (6 reference statements)
“…This baseline allows us to assess to what extent the visual processing improves accuracy over the use of text alone. It is interesting to note that in previous work [8] which combined transcripts of news footage with Eigenface-based face recognition, only small improvements in accuracy were obtained by incorporating visual face recognition.…”
Section: Naming Accuracy (mentioning)
confidence: 98%
“…In [8], transcripts (spoken text without the identity of the speaker) and video of news footage were combined to recognize faces. Much attention was directed at how to predict from a name appearing in the transcript (typically spoken by a news anchor-person) when (relatively) the person referred to might appear in the video; addition of a standard face recognition method to this information gave small improvements in accuracy.…”
Section: Related Work (mentioning)
confidence: 99%
“…Face information has been combined with text information for face or person retrieval in video [21,111,53,112,84]. In [111], multimodal content in videos, including names occurred in the transcript, face information, anchor scenes, and the timing pattern between names and appearances of people, are exploited to find specific persons in broadcast news videos.…”
Section: Face Retrieval In Video (mentioning)
confidence: 99%
“…In [111], multimodal content in videos, including names occurred in the transcript, face information, anchor scenes, and the timing pattern between names and appearances of people, are exploited to find specific persons in broadcast news videos. However, face information was given very small weight in their system.…”
Section: Face Retrieval In Video (mentioning)
confidence: 99%
“…The recent researches show that combining textual and visual information increases the correctness of face-name association [1,2] and this method is also used in this study in order to name the faces in their large dataset. Images that are taken from news images with their associated captions are named with simple natural language and clustering techniques.…”
Section: Related Work (mentioning)
confidence: 99%