Most methods for RGB-D salient object detection (SOD) utilize the same fusion strategy to explore the cross-modal complementary information at each level. However, this may ignore di erent feature contributions from two modalities on di erent levels towards prediction. In this paper, we propose a novel top-down multi-level fusion structure where di erent fusion strategies are utilized to e ectively explore the low-level and high-level features. This is achieved by designing the interweave fusion module (IFM) to effectively integrate the global information and designing the gated select fusion module (GSFM) to discriminatively select useful local information by ltering out the unnecessary one from RGB and depth data. Moreover, we propose an adaptive fusion module (AFM) to reintegrate the fused cross-modal features of each level to predict a more accurate result. Comprehensive experiments on 7 challenging benchmark datasets demonstrate that our method achieves the competitive performance over 14 state-of-the-art RGB-D alternative methods. CCS CONCEPTS • Computing methodologies → Interest point and salient region detections.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.