Proceedings of the 2017 ACM International Conference on Multimedia Retrieval
DOI: 10.1145/3078971.3079041

Query and Keyframe Representations for Ad-hoc Video Search

Abstract: This paper presents a fully-automatic method that combines video concept detection and textual query analysis in order to solve the problem of ad-hoc video search. We present a set of NLP steps that cleverly analyse different parts of the query in order to convert it to related semantic concepts, we propose a new method for transforming concept-based keyframe and query representations into a common semantic embedding space, and we show that our proposed combination of concept-based representations with their c…

Cited by 47 publications (30 citation statements)
References 11 publications
“…Concept-based methods [18,24,25,31,41] mainly rely on establishing cross-modal associations via concepts [12]. Markatopoulou et al. [24,25] first utilized relatively complex linguistic rules to extract relevant concepts from a given query and used pre-trained CNNs to detect the objects and scenes in video frames. Then the similarity between a given query and a specific video is measured by concept matching.…”
Section: Related Work
confidence: 99%
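The concept-matching step this citing paper describes can be sketched as a cosine similarity between the set of concepts extracted from the query and a keyframe's concept-detector scores. This is an illustrative sketch, not the paper's implementation; the function name and all concept labels and scores are invented:

```python
import math

def concept_match_score(query_concepts, frame_scores):
    """Cosine similarity between a binary query-concept vector and
    a keyframe's concept-detector scores (illustrative sketch)."""
    # query_concepts: set of concept names extracted from the query
    # frame_scores: dict mapping concept name -> detector confidence
    dot = sum(frame_scores.get(c, 0.0) for c in query_concepts)
    q_norm = math.sqrt(len(query_concepts))      # binary query vector
    f_norm = math.sqrt(sum(v * v for v in frame_scores.values()))
    if q_norm == 0 or f_norm == 0:
        return 0.0
    return dot / (q_norm * f_norm)

# Toy detector output for one keyframe.
scores = {"dog": 0.9, "outdoor": 0.8, "car": 0.1}
print(round(concept_match_score({"dog", "outdoor"}, scores), 3))
```

A keyframe whose high-confidence detections overlap the query's concepts scores close to 1; a keyframe with no overlap scores 0.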
“…Existing efforts on video retrieval with complex queries can be roughly categorized into two groups: 1) Concept-based paradigm [18,24,25,31,41,52,53], as shown in Figure 1 (a). It usually uses a large set of visual concepts to describe the video content, then transforms the text query into a set of primitive concepts, and finally performs video retrieval by aggregating the matching results from different concepts [53].…”
Section: Introduction
confidence: 99%
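The concept-based paradigm summarized above can be illustrated with a minimal ranking sketch, where each video is scored by aggregating (here, summing) its detector confidences over the query's concepts. All video IDs, concept names, and scores are hypothetical:

```python
def rank_videos(query_concepts, videos):
    """Rank (id, concept_scores) pairs by aggregating per-concept
    detector confidences over the query's concepts (sketch of the
    concept-based retrieval paradigm)."""
    def score(concept_scores):
        return sum(concept_scores.get(c, 0.0) for c in query_concepts)
    return sorted(videos, key=lambda v: score(v[1]), reverse=True)

# Toy collection: two videos with per-concept detector scores.
videos = [
    ("v1", {"beach": 0.2, "person": 0.9}),
    ("v2", {"beach": 0.8, "person": 0.7}),
]
print([vid for vid, _ in rank_videos({"beach", "person"}, videos)])
# v2 aggregates 1.5, v1 aggregates 1.1, so v2 ranks first
```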
“…The video is decomposed into elementary temporal segments (shots) with the method of Apostolidis et al. 13 Then, each shot is annotated with high-level visual concepts coming from the same pre-specified concept pool used for describing the lecture videos. This pool comprises the 346 concepts defined in the TRECVID SIN task (as in Markatopoulou et al. 14 ), but is easily extendible to additional concepts for which training data are available (e.g., ImageNet). We use state-of-the-art deep-learning techniques such as Deep Convolutional Neural Network (DCNN) architectures.…”
Section: Video Processing
confidence: 99%
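The shot-annotation step can be sketched as averaging per-frame concept scores within each shot and keeping the top-scoring concepts. The `detector` callable below is a toy stand-in for the DCNN concept detectors, and the concept names are invented:

```python
def annotate_shots(shot_frames, detector, concept_pool, top_k=3):
    """Annotate each shot with its top-k concepts by averaging the
    per-frame detector scores (illustrative sketch; `detector` stands
    in for a DCNN scoring a frame against the concept pool)."""
    annotations = []
    for frames in shot_frames:
        avg = {c: sum(detector(f)[c] for f in frames) / len(frames)
               for c in concept_pool}
        top = sorted(avg, key=avg.get, reverse=True)[:top_k]
        annotations.append(top)
    return annotations

POOL = ["classroom", "whiteboard", "beach"]

def toy_detector(frame):
    # In this toy example a "frame" is just its precomputed score dict.
    return frame

shots = [[{"classroom": 0.9, "whiteboard": 0.7, "beach": 0.1},
          {"classroom": 0.8, "whiteboard": 0.9, "beach": 0.0}]]
print(annotate_shots(shots, toy_detector, POOL, top_k=2))
```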
“…This means finding which non-lecture videos are most closely related to a given lecture video. This is realized in a direct analogy to how Markatopoulou et al. 14 use semantic word embeddings to match the concept-based representations of textual queries and videos for performing video retrieval.…”
Section: Video Processing
confidence: 99%
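Matching concept-based representations via semantic word embeddings can be sketched by embedding each side as the mean vector of its concept labels and comparing with cosine similarity. The 3-d vectors below are toy stand-ins for trained embeddings (e.g., word2vec), not real model output:

```python
import math

# Toy 3-d "word embeddings" standing in for trained ones;
# the values are invented for illustration only.
EMB = {
    "lecture":   [0.9, 0.1, 0.0],
    "classroom": [0.8, 0.2, 0.1],
    "beach":     [0.0, 0.9, 0.3],
}

def embed(concepts):
    """Mean embedding of a set of concept labels."""
    vecs = [EMB[c] for c in concepts if c in EMB]
    if not vecs:
        return [0.0] * 3
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

lec = embed({"lecture"})
# A lecture-like concept set should land nearer "classroom" than "beach".
print(cosine(lec, embed({"classroom"})) > cosine(lec, embed({"beach"})))
```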
“…This module formulates and expands an input query in order to translate it into a set of high-level concepts C Q , as proposed in [9]. First, we search for one or more high-level concepts that are semantically similar to the entire query, using the Explicit Semantic Analysis (ESA) measure [10].…”
Section: Automatic Query Formulation and Expansion Using High-level Concepts
confidence: 99%
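The query-to-concept translation can be sketched as selecting the pool concepts whose similarity to the full query clears a threshold. Here `toy_sim`, a simple word-overlap score, is a placeholder for the ESA measure, and the pool entries and parameters are invented:

```python
def formulate_query(query, concept_pool, sim, k=2, threshold=0.5):
    """Map a free-text query to its top-k most similar high-level
    concepts; `sim` stands in for the ESA similarity measure."""
    scored = [(c, sim(query, c)) for c in concept_pool]
    scored = [cs for cs in scored if cs[1] >= threshold]
    scored.sort(key=lambda cs: cs[1], reverse=True)
    return [c for c, _ in scored[:k]]

def toy_sim(query, concept):
    # Word-overlap placeholder for ESA: fraction of the concept's
    # words that also appear in the query.
    q = set(query.lower().split())
    c = set(concept.lower().split())
    return len(q & c) / len(c)

pool = ["person riding bicycle", "dog", "city street"]
print(formulate_query("a person riding a bicycle in the city", pool, toy_sim))
```

With a real ESA measure the same selection logic applies; only `sim` changes.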