2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)
DOI: 10.1109/iccvw.2019.00017

Deep Adaptive Fusion Network for High Performance RGBT Tracking

Cited by 111 publications (52 citation statements: 1 supporting, 51 mentioning, 0 contrasting)
References 16 publications
“…The backbone network of the CEDiMP tracking framework is ResNet50 [58], but only the first 4 blocks are used. To give the tracker's feature representation model the powerful representation capabilities of both multi-modal common features and single-modal unique features, we perform channel exchanging operations between the RGB and TIR backbone networks during feature extraction.…”
Section: Methods
confidence: 99%
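
The channel exchanging operation described in this excerpt can be sketched as follows. This assumes a CEN-style criterion in which channels whose BatchNorm scaling factor is small are replaced by the other modality's channels; the function name, threshold, and criterion are illustrative assumptions, not CEDiMP's published implementation.

```python
# A minimal sketch of channel exchanging between RGB and TIR feature maps.
# Assumption: channels whose BatchNorm scale factor is near zero carry little
# modality-specific signal, so they are overwritten with the other modality's
# channels (a CEN-style criterion, not necessarily CEDiMP's exact rule).
import torch

def exchange_channels(feat_rgb, feat_tir, bn_rgb, bn_tir, threshold=1e-2):
    """feat_*: (B, C, H, W) feature maps; bn_*: the BatchNorm2d layers
    that produced them. Returns the exchanged (rgb, tir) features."""
    mask_rgb = bn_rgb.weight.abs() < threshold   # (C,) bool: weak RGB channels
    mask_tir = bn_tir.weight.abs() < threshold   # (C,) bool: weak TIR channels
    out_rgb, out_tir = feat_rgb.clone(), feat_tir.clone()
    # Swap in the other modality's channels where this modality is weak.
    out_rgb[:, mask_rgb] = feat_tir[:, mask_rgb]
    out_tir[:, mask_tir] = feat_rgb[:, mask_tir]
    return out_rgb, out_tir
```

Because the exchange is channel-wise, each modality keeps its strong, unique channels while borrowing shared information where its own channels are uninformative.
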
“…DAPNet [8] simultaneously integrates all the chosen layers and modalities by recursively applying an adaptive fusion module, achieving feature pruning via global average pooling to reduce redundancy and noise. Further, DAFNet [49] integrates the adaptive fusion sub-network from [48] and realizes quality-aware fusion between the two modalities at each layer. To achieve fine-grained fusion, MANet [7] divides features into three categories (modality-specific, modality-shared, and instance-specific), exploiting a more exhaustive feature-level fusion approach to obtain a robust feature representation through multiple adapters.…”
Section: Tracking With Multiple Modalities
confidence: 99%
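
A minimal sketch of the GAP-based adaptive fusion idea mentioned in this excerpt: global average pooling summarizes each modality's feature map, and a small head predicts per-modality, per-channel fusion weights. The module name, layer sizes, and sigmoid gating are assumptions for illustration, not the published DAPNet/DAFNet architecture.

```python
# A sketch of quality-aware adaptive fusion: GAP descriptors of both
# modalities drive per-channel weights, suppressing noisy channels of the
# less reliable modality. Layer choices here are illustrative assumptions.
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)        # global average pooling
        self.fc = nn.Sequential(                  # small weight-prediction head
            nn.Linear(2 * channels, channels // 4),
            nn.ReLU(inplace=True),
            nn.Linear(channels // 4, 2 * channels),
        )

    def forward(self, feat_rgb, feat_tir):
        b, c, _, _ = feat_rgb.shape
        # Concatenate the two pooled descriptors: (B, 2C, 1, 1)
        desc = torch.cat([self.gap(feat_rgb), self.gap(feat_tir)], dim=1)
        # Predict one weight per modality per channel.
        w = torch.sigmoid(self.fc(desc.flatten(1))).view(b, 2, c, 1, 1)
        # Weighted sum yields the fused, quality-aware feature map.
        return w[:, 0] * feat_rgb + w[:, 1] * feat_tir
```
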
“…A multi-branch architecture is employed in CAT [54] for target appearance modelling, addressing the modality-specific and modality-shared challenges, which total five. Unlike DAPNet [8] and DAFNet [49], which retain only the fused features, TFNet [55] designs a trident architecture to better exploit the modality-specific information. Constrained by the limited RGBT data, the aforementioned trackers are all equipped with lightweight feature extractors.…”
Section: Tracking With Multiple Modalities
confidence: 99%
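
The trident idea contrasted here, keeping modality-specific streams alongside the fused one rather than discarding them, might look like the following sketch. The head structure and the simple additive fusion are my assumptions, not TFNet's published design.

```python
# A sketch of a trident-style head: an RGB branch, a TIR branch, and a fused
# branch, so modality-specific information survives alongside the fusion.
import torch
import torch.nn as nn

class TridentHead(nn.Module):
    def __init__(self, channels):
        super().__init__()
        def branch():  # one conv-BN-ReLU stream per trident prong
            return nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
        self.rgb_branch, self.tir_branch, self.fused_branch = branch(), branch(), branch()

    def forward(self, feat_rgb, feat_tir):
        fused = feat_rgb + feat_tir  # simple additive fusion as a stand-in
        # Concatenate the three streams so later layers can draw on both
        # modality-shared and modality-specific cues.
        return torch.cat([
            self.rgb_branch(feat_rgb),
            self.tir_branch(feat_tir),
            self.fused_branch(fused),
        ], dim=1)
```
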
“…DAPNet [15] achieves the fusion task by recurrently applying a sub-network, which consists of a convolutional layer, a Rectified Linear Unit (ReLU) activation [16] and a normalization layer, at different feature levels. Compared with the coarse fusion sub-network used in DAPNet, DAFNet [17] further designs an adaptive fusion module similar to that in FANet. Through careful design, MANet [18] aims to extract modality-specific, modality-shared and object-specific cues through modality, generality and instance adapters respectively.…”
Section: A. MDNet-based RGBT Trackers
confidence: 99%
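
The recurrently applied fusion sub-network this excerpt attributes to DAPNet (convolution, ReLU, normalization, applied at different feature levels) might be sketched as below. Treating the normalization as local response normalization and using one block per pyramid level are my assumptions.

```python
# A sketch of a conv + ReLU + normalization fusion sub-network applied at
# each feature level of the RGB/TIR pyramids. The 1x1 kernel and the choice
# of LocalResponseNorm are illustrative assumptions.
import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)
        self.norm = nn.LocalResponseNorm(size=5)

    def forward(self, x):
        return self.norm(self.relu(self.conv(x)))

def fuse_pyramid(rgb_feats, tir_feats, blocks):
    """Apply one FusionBlock per feature level: each block takes the
    channel-concatenated RGB/TIR features of its level and fuses them."""
    return [blk(torch.cat([r, t], dim=1))
            for blk, r, t in zip(blocks, rgb_feats, tir_feats)]
```

Reusing the same block structure at every level keeps the fusion cheap, which matches the excerpt's point that these MDNet-based trackers rely on lightweight components.
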