Video Similarity and Alignment Learning on Partial Video Copy Detection

Han, Zhen; He, Xiangteng; Tang, Mingqian; Lv, Yiliang

doi:10.1145/3474085.3475549

Cited by 18 publications

(11 citation statements)

References 27 publications

“…We have also removed the optimization components for a strongest accuracy. The F1 score has been used as it is common to characterize the PVCD methods [9,4,19,6,5,15].…”

Section: Performance Evaluation (mentioning)

confidence: 99%

A Large-scale TV Dataset for Partial Video Copy Detection

Delalandre

Conte

2022

Image Analysis and Processing – ICIAP 2022

This paper is interested with the performance evaluation of the partial video copy detection. Several public datasets exist designed from web videos. The detection problem is inherent to the continuous video broadcasting. The alternative is then to process with TV datasets offering a deeper scalability and a control of degradations for a fine performance evaluation. We propose in this paper a TV dataset called STVD. It is designed with a protocol ensuring a scalable capture and robust groundtruthing. STVD is the largest public dataset on the task with a near 83k videos having a total duration of 10,660 hours. Performance evaluation results of representative methods on the dataset are reported in the paper for a baseline comparison.

“…It is a well-known topic in the computer vision field [12]. The recent works investigate the detection methods robust to the spatial & temporal deformations [6,5,15] or real-time [4,13,19]. A key aspect for any computer vision task is to design public datasets for performance evaluation.…”

Section: Introduction (mentioning)

confidence: 99%

A Large-scale TV Dataset for Partial Video Copy Detection

Delalandre

Conte

2022

Image Analysis and Processing – ICIAP 2022

“…Previous segment-level evaluation metrics are introduced with MUSCLE-VCD [15] and VCDB datasets [11]. Most of recent research works [7][8][9] adopt segment precision and recall defined in VCDB as follows:…”

Section: Datasets and Evaluation (mentioning)

confidence: 99%

“…In most cases, video-level copy detection results alone are not sufficient as the detected videos are usually displayed and interacted with system users for downstream tasks. Hence, designing an approach that can locate the copied segments is preferred and has already attracted lots of attentions in recent works [7][8][9][10][11].…”

Section: Introduction (mentioning)

confidence: 99%

A Large-scale Comprehensive Dataset and Copy-overlap Aware Evaluation Protocol for Segment-level Video Copy Detection

He, Yang, Jiang, et al.

2022

Preprint

In this paper, we introduce VCSL (Video Copy Segment Localization), a new comprehensive segment-level annotated video copy dataset. Compared with existing copy detection datasets restricted by either video-level annotation or small-scale, VCSL not only has two orders of magnitude more segment-level labelled data, with 160k realistic video copy pairs containing more than 280k localized copied segment pairs, but also covers a variety of video categories and a wide range of video duration. All the copied segments inside each collected video pair are manually extracted and accompanied by precisely annotated starting and ending timestamps. Alongside the dataset, we also propose a novel evaluation protocol that better measures the prediction accuracy of copy overlapping segments between a video pair and shows improved adaptability in different scenarios. By benchmarking several baseline and state-of-the-art segment-level video copy detection methods with the proposed dataset and evaluation metric, we provide a comprehensive analysis that uncovers the strengths and weaknesses of current approaches, hoping to open up promising directions for future works. The VCSL dataset, metric and benchmark codes are all publicly available at https://github.com/alipay/VCSL.

“…Kordopatis et al [10] calculate video-to-video similarity by refined frame-to-frame similarity matrices. Han et.al [11] modelled the video similarity as the mask map predicted from frame-level spatial similarity.…”

Section: Related Work (mentioning)

confidence: 99%

STAR-GNN: Spatial-Temporal Video Representation for Content-based Retrieval

Zhao, Zhang, Zhang, et al.

2022

Preprint

We propose a video feature representation learning framework called STAR-GNN, which applies a pluggable graph neural network component on a multi-scale lattice feature graph. The essence of STAR-GNN is to exploit both the temporal dynamics and spatial contents as well as visual connections between regions at different scales in the frames. It models a video with a lattice feature graph in which the nodes represent regions of different granularity, with weighted edges that represent the spatial and temporal links. The contextual nodes are aggregated simultaneously by graph neural networks with parameters trained with retrieval triplet loss. In the experiments, we show that STAR-GNN effectively implements a dynamic attention mechanism on video frame sequences, resulting in the emphasis for dynamic and semantically rich content in the video, and is robust to noise and redundancies. Empirical results show that STAR-GNN achieves state-of-the-art performance for Content-Based Video Retrieval.
