2022
DOI: 10.1007/978-3-031-19806-9_14
|View full text |Cite
|
Sign up to set email alerts
|

Bridging Images and Videos: A Simple Learning Framework for Large Vocabulary Video Object Detection

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
3

Citation Types

0
1
0

Year Published

2022
2022
2024
2024

Publication Types

Select...
4
1
1

Relationship

0
6

Authors

Journals

citations
Cited by 6 publications
(3 citation statements)
references
References 81 publications
0
1
0
Order By: Relevance
“…Learning tracking from static images. Since labelled video data is expensive to acquire at scale, recent methods have proposed to use static images to supervise MOT methods [16,66,74,77]. CenterTrack [77] proposes to learn motion offsets from static images by random translation of the input, while FairMOT [74] treats objects in a dataset of static images as unique classes to distinguish.…”
Section: Related Workmentioning
confidence: 99%
“…Learning tracking from static images. Since labelled video data is expensive to acquire at scale, recent methods have proposed to use static images to supervise MOT methods [16,66,74,77]. CenterTrack [77] proposes to learn motion offsets from static images by random translation of the input, while FairMOT [74] treats objects in a dataset of static images as unique classes to distinguish.…”
Section: Related Workmentioning
confidence: 99%
“…Learning tracking from static images. Since labelled video data is expensive to acquire at scale, recent methods have proposed to use static images to supervise MOT methods [18,69,77,80]. CenterTrack [80] proposes to learn motion offsets from static images by random translation of the input, while FairMOT [77] treats objects in a dataset of static images as unique classes to distinguish.…”
Section: Related Workmentioning
confidence: 99%
“…Object Detection. Traditional detection models are trained to detect objects for a pre-defined set of categories [4,52,53,67]. As a result, traditional models find it challenging to adapt to new tasks and domains, unable to differentiate between objects that vary in attributes such as texture, shape, and other characteristics.…”
Section: Related Workmentioning
confidence: 99%