Multi-Modal Fusion for End-to-End RGB-T Tracking

Zhang, Lichao; Danelljan, Martin; González-García, Abel; Weijer, Joost van de; Khan, Fahad Shahbaz

doi:10.1109/iccvw.2019.00278

Cited by 113 publications

(92 citation statements)

References 58 publications

(185 reference statements)


“…The mfDiMP tracker contains an end-to-end tracking framework for fusing the RGB and TIR modalities in RGB-T tracking [107]. The baseline tracker is DiMP (Discriminative Model Prediction) [8], which employs a carefully designed target prediction network trained end-to-end using a discriminative loss.…”

Section: C5 Multi-adapter Convolutional Network For Rgbt Tracking mentioning

confidence: 99%

“…The baseline tracker is DiMP (Discriminative Model Prediction) [8], which employs a carefully designed target prediction network trained end-to-end using a discriminative loss. The mfDiMP tracker fuses modalities at the feature level on both the IoU predictor and the model predictor of DiMP [107]. SiamDW-T is based on our previous work [109], and extends it with two fusion strategies for RGBT tracking.…”

Section: C5 Multi-adapter Convolutional Network For Rgbt Tracking mentioning

confidence: 99%


The Seventh Visual Object Tracking VOT2019 Challenge Results

Kristan

Berg

Zheng

et al. 2019

2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)

Self Cite

351

359


The Visual Object Tracking challenge VOT2019 is the seventh annual tracker benchmarking activity organized by the VOT initiative. Results of 81 trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in recent years. The evaluation included the standard VOT and other popular methodologies for short-term tracking analysis as well as the standard VOT methodology for long-term tracking analysis. The VOT2019 challenge was composed of five challenges focusing on different tracking domains: (i) VOT-ST2019 challenge focused on short-term tracking in RGB, (ii) VOT-RT2019 challenge focused on "real-time" short-term tracking in RGB, (iii) VOT-LT2019 focused on long-term tracking, namely coping with target disappearance and reappearance. Two new challenges have been introduced: (iv) VOT-RGBT2019 challenge focused on short-term tracking in RGB and thermal imagery and (v) VOT-RGBD2019 challenge focused on long-term tracking in RGB and depth imagery. The VOT-ST2019, VOT-RT2019 and VOT-LT2019 datasets were refreshed while new datasets were introduced for VOT-RGBT2019 and VOT-RGBD2019. The VOT toolkit has been updated to support both standard short-term, long-term tracking and tracking with multi-channel imagery. Performance of the tested trackers typically by far exceeds standard baselines. The source code for most of the trackers is publicly available from the VOT page. The dataset, the evaluation kit and the results are publicly available at the challenge website.


“…The early RGB-T tracking algorithms [8]- [10] are based on some handcrafted features. With the development of deep learning, more and more RGB-T trackers based on deep features [12], [13], [26] are presented. These RGB-T trackers are usually designed on the basis of RGB trackers.…”

Section: Rgb-t Tracking Methods mentioning

confidence: 99%

“…In [13], Li et al proposed a multi-adapter architecture to learn modality-shared, modality-specific, and instance-aware target representations, respectively. In addition, Zhang et al [26] introduced DiMP [33] as their baseline tracker and investigated different levels of fusion mechanisms to find the optimal fusion architecture. results demonstrated that their proposed fusion tracker significantly improved the performance of the baseline tracker with respect to unimodal tracking and achieved new state-of-the-art results.…”

Section: Rgb-t Tracking Methods mentioning

confidence: 99%

“…(2) Multi-modal feature fusion For the RGB-T tracking task, how to effectively fuse the RGB and thermal information is one of the most important issues. Several methods have been proposed, such as element-wise summation [25], concatenation [26] and content-dependency weighting based fusion strategies [12] [17]. However, most of these existing fusion strategies do not consider the feature differences between the input RGB and thermal images during fusion.…”

Section: Introduction mentioning

confidence: 99%
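
The three fusion families named in the citation statement above (element-wise summation, concatenation, and weighting) can be illustrated with a minimal sketch. This is not the implementation of any cited tracker: the function names are hypothetical, features are plain Python lists standing in for flattened feature maps, and the fixed scalar weight `w_rgb` is an assumed stand-in for the learned, content-dependent weights the cited methods use.

```python
from typing import List

# Hypothetical type: one flattened feature vector per modality.
Feature = List[float]


def fuse_sum(rgb: Feature, tir: Feature) -> Feature:
    """Element-wise summation: output has the same length as each input."""
    if len(rgb) != len(tir):
        raise ValueError("summation needs features of equal length")
    return [r + t for r, t in zip(rgb, tir)]


def fuse_concat(rgb: Feature, tir: Feature) -> Feature:
    """Concatenation: output length is the sum of both input lengths."""
    return rgb + tir


def fuse_weighted(rgb: Feature, tir: Feature, w_rgb: float) -> Feature:
    """Convex combination with a fixed scalar weight.

    Assumption: a real tracker would predict the weight from the
    features themselves; here it is simply passed in.
    """
    if len(rgb) != len(tir):
        raise ValueError("weighted fusion needs features of equal length")
    if not 0.0 <= w_rgb <= 1.0:
        raise ValueError("w_rgb must lie in [0, 1]")
    return [w_rgb * r + (1.0 - w_rgb) * t for r, t in zip(rgb, tir)]
```

Summation and weighting keep the feature dimensionality unchanged, so later layers need no modification, whereas concatenation doubles it and leaves the mixing of modalities to whatever layer consumes the fused feature.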
