2022
DOI: 10.48550/arxiv.2203.05243
Preprint

A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach

Abstract: Temporal Sentence Grounding in Videos (TSGV), which aims to ground a natural language sentence that indicates complex human activities in an untrimmed video, has drawn widespread attention over the past few years. However, recent studies have found that current benchmark datasets may have obvious moment annotation biases, enabling several simple baselines even without training to achieve state-of-the-art (SOTA) performance. In this paper, we take a closer look at existing evaluation protocols for TSGV, and fin…

Cited by 1 publication (1 citation statement)
References 46 publications
“…Some other methods attempt to view the VG problem from the perspective of causality and debias the base model with causal intervention (Yang et al. 2021; Lan et al. 2022; Bao and Mu 2022). To the best of our knowledge, our method is the first to adopt curriculum learning-based data augmentation for debiased video grounding, which is orthogonal to existing methods.…”
mentioning
confidence: 99%