2005 IEEE International Conference on Multimedia and Expo
DOI: 10.1109/icme.2005.1521577

Comparison of Visual Features and Fusion Techniques in Automatic Detection of Concepts from News Video

Abstract: This study describes experiments on the automatic detection of semantic concepts, which are textual descriptions of digital video content. The concepts can further be used for content-based categorization of, and access to, digital video repositories. Temporal Gradient Correlograms, Temporal Color Correlograms and Motion Activity low-level features are extracted from the dynamic visual content of a video shot. Semantic concepts are detected with an expeditious method that is based on the selection of small positi…

Cited by 8 publications (7 citation statements)
References 8 publications
“…Westerveld and van Gemert adopted a similar approach, although they applied dimensionality reduction techniques to the heterogeneous vectors for compressing the multimodal information (Westerveld 2000; van Gemert 2003). Similar approaches are reported in (Rautiainen et al 2004; Rautiainen and Seppänen 2005; Snoek et al 2005). Compared to LF, EF can be more efficient as a single retrieval stage is performed, however, the dimensionality in which EF methods may work can be huge.…”
Section: Related Work
confidence: 50%
“…Usually a single method is used per modality (Peinado et al 2005; Izquierdo-Beviá et al 2005; Besancon and Millet 2006; Chang and Chen 2006; Rautiainen et al 2004), although the use of multiple and heterogeneous techniques has been also studied (Escalante et al 2008b). The EF formulation, on the other hand, consists of merging the vectors corresponding to textual and visual information beforehand and then using a straight retrieval technique (Rautiainen et al 2004; Rautiainen and Seppänen 2005; Snoek et al 2005; Westerveld 2000; van Gemert 2003). In its basic form, EF consists of concatenating the vectors of textual and visual features.…”
Section: Related Work
confidence: 96%
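
The basic early fusion (EF) step described in the statement above, concatenating the textual and visual vectors and then running a single retrieval stage, can be sketched roughly as follows. This is a minimal illustration only: the function names are hypothetical, and cosine similarity is an assumed stand-in for the "straight retrieval technique", not the method of any of the cited works.

```python
import math


def early_fusion(text_vec, visual_vec):
    """Concatenate the textual and visual vectors into one joint vector."""
    return list(text_vec) + list(visual_vec)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_early_fusion(query, items):
    """Rank item ids by similarity to the query in the fused space.

    query: (text_vec, visual_vec)
    items: dict mapping item id -> (text_vec, visual_vec)
    """
    fused_query = early_fusion(*query)
    # Single retrieval stage: one similarity score per item on the joint vector.
    scores = {key: cosine(fused_query, early_fusion(*vecs))
              for key, vecs in items.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Because the fused vector has as many dimensions as both modalities combined, this sketch also shows why the dimensionality of EF methods can grow large.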
“…We notice that the coder neural-network obtains superior scores to those obtained by the other systems (1,2,3,4) for all semantic concepts. This supports further the importance of feature fusion.…”
confidence: 77%
“…Many application domains making use of video data are available: Security, digital library, interactive TV, etc... Many of those rely on video content analysis and in particular video shot classification [1,2].…”
Section: Introduction
confidence: 99%