Proceedings of the 21st ACM International Conference on Multimedia 2013
DOI: 10.1145/2502081.2502160
Querying for video events by semantic signatures from few examples

Abstract: We aim to query web video for complex events using only a handful of video query examples, where the standard approach learns a ranker from hundreds of examples. We consider a semantic signature representation, consisting of off-the-shelf concept detectors, to capture the variance in semantic appearance of events. Since it is unknown what similarity metric and query fusion to use in such an event retrieval setting, we perform three experiments on unconstrained web videos from the TRECVID event detection task. …

Cited by 37 publications (33 citation statements). References 12 publications.
“…2, we report our performance on the MED TEST dataset consisting of 20 events. We also compare our retrieval performance against a state of the art technique published in [17]. For clarity, we only indicate the respective average precision per event obtained using our best performing method (MNE+QR on fused motion and appearance representations).…”
Section: Results
Confidence: 99%
“…That said, some of the noted work in recent past, pertinent to recognition in unconstrained settings include machine interpretation of either low-level features [13,26] directly extracted from human labeled event videos [18,20] or training intermediate-level semantic concepts that require expensive human annotation [1,17] or a combination of both [9,21].…”
Section: Related Work
Confidence: 99%
“…This is an effective way, but it is not applicable for cases in which no or few examples are available and the models cannot give interpretation or understanding of the semantics in the event. If few examples are available, the web is a powerful tool to get more examples [24,27].…”
Section: Complex Event Detection
Confidence: 99%