2023
DOI: 10.1109/tcsvt.2022.3202563

HRTransNet: HRFormer-Driven Two-Modality Salient Object Detection

Abstract: High-Resolution Transformer (HRFormer) can maintain high-resolution representation and share global receptive fields. It is friendly towards salient object detection (SOD) in which the input and output have the same resolution. However, two critical problems need to be solved for two-modality SOD. One problem is two-modality fusion. The other problem is the HRFormer output's fusion. To address the first problem, a supplementary modality is injected into the primary modality by using global optimization and an…

Cited by 31 publications (4 citation statements)
References 130 publications
“…Red and blue denote the best and the second-best results, respectively. HDFNet [29], CoNet [19], BBS-Net [13], JL-DCF-R [27], SPNet [113], CMINet [114], DCF [115], as well as four state-of-the-art transformer-based RGB-D SOD models, namely SwinNet [85], HRTransNet [86], EBMGSOD [84], and our previous VST [44], for comparison. Table 4 and Table 5 report the comparison results.…”
Section: Comparison With State-of-the-art Methods
Citation type: mentioning
confidence: 99%
“…Following VST, we leverage the pre-trained T2T-ViT_t-14 model [46] as our backbone to create the VST-t++ model. Moreover, some transformer-based models have been proposed for RGB SOD [83,84] and RGB-D SOD [84,85,86] with the Swin Transformer family [49] as the backbone. Following this trend, we explore three Swin Transformer models with different scales, i.e.…”
Section: Comparison With State-of-the-art Methods
Citation type: mentioning
confidence: 99%
“…DCMNet [68] and HRTransNet [69]. To ensure the fairness of the comparison, we used the saliency maps provided by the authors.…”
Section: Comparison With State-of-the-art Methods
Citation type: mentioning
confidence: 99%