2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW) 2019
DOI: 10.1109/iccvw.2019.00278

Multi-Modal Fusion for End-to-End RGB-T Tracking

Abstract: We propose an end-to-end tracking framework for fusing the RGB and TIR modalities in RGB-T tracking. Our baseline tracker is DiMP (Discriminative Model Prediction), which employs a carefully designed target prediction network trained end-to-end using a discriminative loss. We analyze the effectiveness of modality fusion in each of the main components in DiMP, i.e., feature extractor, target estimation network, and classifier. We consider several fusion mechanisms acting at different levels of the framework, inc…

Citations: cited by 113 publications (92 citation statements)
References: 58 publications (185 reference statements)
“…The mfDiMP tracker contains an end-to-end tracking framework for fusing the RGB and TIR modalities in RGB-T tracking [107]. The baseline tracker is DiMP (Discriminative Model Prediction) [8], which employs a carefully designed target prediction network trained end-to-end using a discriminative loss.…”
Section: C5 Multi-adapter Convolutional Network For Rgbt Tracking mentioning
confidence: 99%
“…The baseline tracker is DiMP (Discriminative Model Prediction) [8], which employs a carefully designed target prediction network trained end-to-end using a discriminative loss. The mfDiMP tracker fuses modalities at the feature level on both the IoU predictor and the model predictor of DiMP [107]. SiamDW-T is based on our previous work [109], and extends it with two fusion strategies for RGBT tracking.…”
Section: C5 Multi-adapter Convolutional Network For Rgbt Tracking mentioning
confidence: 99%
“…The early RGB-T tracking algorithms [8]- [10] are based on some handcrafted features. With the development of deep learning, more and more RGB-T trackers based on deep features [12], [13], [26] are presented. These RGB-T trackers are usually designed on the basis of RGB trackers.…”
Section: Rgb-t Tracking Methods mentioning
confidence: 99%
“…In [13], Li et al. proposed a multi-adapter architecture to learn modality-shared, modality-specific, and instance-aware target representations, respectively. In addition, Zhang et al. [26] introduced DiMP [33] as their baseline tracker and investigated different levels of fusion mechanisms to find the optimal fusion architecture. Results demonstrated that their proposed fusion tracker significantly improved the performance of the baseline tracker with respect to unimodal tracking and achieved new state-of-the-art results.…”
Section: Rgb-t Tracking Methods mentioning
confidence: 99%