2020
DOI: 10.1007/978-3-030-58555-6_13
URVOS: Unified Referring Video Object Segmentation Network with a Large-Scale Benchmark

Cited by 80 publications (70 citation statements)
References 34 publications
“…On the A2D-Sentences and JHMDB-Sentences [11] datasets, MTTR significantly outperforms all existing methods across all metrics. Moreover, we report strong results on the public validation set of Refer-YouTube-VOS [37], a more challenging dataset that has yet to receive attention in the literature.…”
Section: Multimodal Transformer
confidence: 76%
“…As mentioned earlier, this subset contains only the more challenging full-video expressions from the original release of Refer-YouTube-VOS. Compared with existing methods [24,37] which trained and evaluated on the full version of the dataset, our model demonstrates superior performance across all metrics despite being trained on less data and evaluated exclusively on a more challenging subset. Additionally, our method shows competitive performance compared with the methods that led in the 2021 RVOS competition [8,20].…”
Section: Comparison With State-of-the-art Methods
confidence: 94%