2012 IEEE International Conference on Multimedia and Expo 2012
DOI: 10.1109/icme.2012.189

Video Copy Detection Using a Soft Cascade of Multimodal Features

Abstract: In video copy detection, it is widely recognized that no single feature works well for all transformations. More and more approaches therefore adopt a set of complementary features to cope with complex audiovisual transformations. However, most of them use the individual features separately and obtain the final result by fusing the outputs of several basic detectors, which often leads to low detection efficiency. Moreover, several thresholds or parameters must be carefully tuned. To…

Cited by 16 publications (6 citation statements)
References 8 publications
“…Poullot et al [37] introduced the Temporal Matching Kernel (TMK) that encodes sequences of frames with periodic kernels that take into account the frame descriptor and timestamp. A score function was introduced for video matching that maximizes both the similarity score and the … [table rows: [20] 0.962; Tian et al, 2015 [52] 0.952; Chou et al, 2015 [9] 0.938] Table 4.3: Multimodal approach and F1 score on TRECVID 2011 of four filter-and-refine matching methods. If the approach is not multimodal, then the F1 score is calculated based on the video transformations only.…”
Section: Research (mentioning)
confidence: 99%
“…A sequential pyramid matching (SPM) algorithm was devised to localize the similar video sequences. In contrast, Jiang et al [20] presented a soft cascade framework utilizing multiple hashed features to filter out non-NDVs. They modified the SPM to introduce temporal information in a temporal pyramid matching (TPM).…”
Section: Filter-and-refine Matching (mentioning)
confidence: 99%
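“…Multi-feature hashing [1] and stochastic multi-view hashing [2], the two most recent hashing-based near-duplicate video retrieval methods, both rely on a multi-feature fusion strategy. In addition, Jiang et al. [13] fused multiple features with a temporal pyramid matching structure to build a video copy detection system; Nie et al. [14,15] … As the description above shows, the mid-level and high-level semantic features in our method both come from deep learning models. As is well known, researchers have proposed many deep convolutional neural network models in recent years, such as VGGNet [18], AlexNet [19], and GoogLeNet [20]…”
Section: Related Work (unclassified)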
“…If the query was not a copy it was passed to the second layer which was the DCT detector and only declared as a copy if the DCT found a match, otherwise it was finally passed to the DCSIFT as the final layer. For details see [Jian et al 2011] and [Jiang et al 2012].…”
Section: Pku-idm (mentioning)
confidence: 99%
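
The layer-by-layer hand-off described in the last citation statement can be sketched as a plain early-exit cascade. This is a minimal illustration only, not the authors' soft cascade: the excerpt does not name the first layer, so `first_layer` is a placeholder, and `run_cascade`, `make_lookup_detector`, and the toy lookup tables are hypothetical names standing in for real detectors. Only the ordering (an earlier layer, then DCT, then DCSIFT) comes from the quoted text.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# A detector maps a query id to a matched reference id, or None if no match.
Detector = Callable[[str], Optional[str]]


def run_cascade(
    query: str, layers: Sequence[Tuple[str, Detector]]
) -> Tuple[Optional[str], Optional[str]]:
    """Try each layer in order and stop at the first one that finds a match.

    Returns (layer_name, matched_reference), or (None, None) when no layer
    declares the query a copy.
    """
    for name, detect in layers:
        match = detect(query)
        if match is not None:
            # Declared a copy here; later (costlier) layers are skipped.
            return name, match
    return None, None


def make_lookup_detector(table: Dict[str, str]) -> Detector:
    """Hypothetical stand-in for a real detector: a dictionary lookup."""
    return table.get


# Toy cascade; the layer order follows the quoted description, the data is made up.
layers = [
    ("first_layer", make_lookup_detector({"q1": "ref_a"})),  # unnamed in the excerpt
    ("dct", make_lookup_detector({"q2": "ref_b"})),
    ("dcsift", make_lookup_detector({"q3": "ref_c"})),
]

for q in ("q1", "q2", "q3", "q4"):
    print(q, run_cascade(q, layers))
```

Ordering the layers from cheapest to most expensive means most queries are resolved before the costly detectors run, which is the efficiency argument the abstract makes against running every detector and fusing the results afterwards.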