2023
DOI: 10.1016/j.inffus.2023.101881
Exploring fusion strategies for accurate RGBT visual object tracking


Cited by 19 publications (2 citation statements)
References 59 publications
“…As a result, recent research has been dominated by deep learning techniques. Recent studies have explored a wide range of fusion strategies, from the simplest operation, concatenation (Zhang et al. 2019), to more complex transformer architectures (Hui et al. 2023; Zhu et al. 2023a), with various intentions: learning modality importance (Zhang et al. 2021b; Tang, Xu, and Wu 2022), reducing multi-modal redundancy (Li et al. 2018; Zhu et al. 2019), propagating multi-modal patterns (Wang et al. 2020), and learning multi-modal prompts from the auxiliary modality (Zhu et al. 2023a), to name a few. With increasing network complexity and the availability of larger training sets, tracking results have gradually improved.…”
Section: RGB-T Trackers (mentioning)
Confidence: 99%
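To make the fusion strategies enumerated in this citation statement concrete, the minimal PyTorch-style sketch below contrasts the two ends of the range it describes: plain channel-wise concatenation and a learned modality-importance weighting. All module names, tensor shapes, and design details are illustrative assumptions and do not reproduce the implementation of the cited paper or of any specific tracker.

```python
# Illustrative sketch (not from the cited paper): two simple RGB-T feature
# fusion strategies — channel-wise concatenation and learned modality weights.
import torch
import torch.nn as nn


class ConcatFusion(nn.Module):
    """Simplest strategy: concatenate RGB and TIR feature maps along channels."""

    def __init__(self, channels: int):
        super().__init__()
        # 1x1 conv projects the concatenated features back to `channels`
        self.proj = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, rgb_feat: torch.Tensor, tir_feat: torch.Tensor) -> torch.Tensor:
        return self.proj(torch.cat([rgb_feat, tir_feat], dim=1))


class ModalityWeightFusion(nn.Module):
    """Learns per-modality importance weights from globally pooled features."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),            # global context
            nn.Conv2d(channels, 2, kernel_size=1),
            nn.Flatten(),
            nn.Softmax(dim=1),                  # weights for (RGB, TIR) sum to 1
        )

    def forward(self, rgb_feat: torch.Tensor, tir_feat: torch.Tensor) -> torch.Tensor:
        w = self.gate(rgb_feat + tir_feat)      # (B, 2) modality weights
        w_rgb = w[:, 0].view(-1, 1, 1, 1)
        w_tir = w[:, 1].view(-1, 1, 1, 1)
        return w_rgb * rgb_feat + w_tir * tir_feat


if __name__ == "__main__":
    rgb = torch.randn(2, 256, 16, 16)   # dummy RGB backbone features
    tir = torch.randn(2, 256, 16, 16)   # dummy thermal backbone features
    print(ConcatFusion(256)(rgb, tir).shape)           # torch.Size([2, 256, 16, 16])
    print(ModalityWeightFusion(256)(rgb, tir).shape)   # torch.Size([2, 256, 16, 16])
```

Transformer-based fusion or prompt learning, as cited above, would replace these modules with cross-attention blocks or learnable prompt tokens, but the interface is the same: map a pair of (RGB, TIR) feature maps to a single fused representation.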
“…Due to the strict robustness requirements on tracking systems in real-world applications such as surveillance (Lu et al. 2023) and unmanned driving (Zhang et al. 2023a), visual object tracking with an auxiliary modality, termed multi-modal tracking, has recently drawn growing attention. For example, the thermal infrared (TIR) modality provides more stable scene perception at night (Tang et al. 2023), and the depth (D) modality provides 3-D perception that helps against occlusions (Zhu et al. 2023b). In other words, auxiliary modalities can complement the visible image in challenging scenarios.…”
Section: Introduction (mentioning)
Confidence: 99%