Existing RGB-D salient object detection (SOD) techniques concentrate on
combining data from multiple modalities (e.g., depth and RGB) and extracting
multi-scale data for improved saliency reasoning. However, they frequently
perform poorly because of low-quality depth maps and the lack of
correlation between the extracted multi-scale data. In this
paper, we propose an Exploring Cross-Modal Weighting and Edge-Guided Decoder
Network (ECW-EGNet) for RGB-D SOD, which includes three prominent
components. Firstly, we deploy a Cross-Modality Weighting Fusion (CMWF)
module that uses a Channel-Spatial Attention Feature Enhancement (CSAE)
mechanism and a Depth-Quality Assessment (DQA) mechanism to achieve
cross-modal feature interaction. The former applies channel attention and
spatial attention in parallel to enhance the features extracted from the
RGB and depth streams, while the latter assesses depth quality to reduce
the detrimental influence of low-quality depth maps during cross-modal
fusion.
Then, to effectively integrate high-level multi-scale features and locate
salient objects precisely, we construct a Bi-directional Scale-Correlation
Convolution (BSCC) module with a bi-directional structure. Finally, we
construct an Edge-Guided (EG) decoder
that uses an edge detection operator to obtain edge masks, which guide the
enhancement of edge details in the saliency map. Comprehensive experiments on
five benchmark RGB-D SOD datasets demonstrate that the proposed ECW-EGNet
outperforms 21 state-of-the-art (SOTA) saliency detectors on four widely
used evaluation metrics.
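
As an illustration of the edge-mask step in the EG decoder, the following
minimal sketch derives a binary edge mask from a toy saliency map. It assumes
a Sobel operator, since the specific edge detection operator is not named
here. The function name `sobel_edge_mask`, the `threshold` value, and the
replicate-padding border handling are likewise illustrative assumptions
rather than details of ECW-EGNet.

```python
from math import hypot

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
# Assumption: the abstract does not say which edge operator is used.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_edge_mask(saliency, threshold=1.0):
    """Return a binary edge mask with the same size as `saliency`.

    `saliency` is a 2D list of floats in [0, 1]. Border pixels are handled
    by replicating the nearest valid pixel. `threshold` is a hypothetical
    hyperparameter, not a value taken from the paper.
    """
    h, w = len(saliency), len(saliency[0])

    def px(r, c):
        # Clamp coordinates so that the border is replicated.
        return saliency[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]

    mask = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            gx = gy = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    v = px(r + dr, c + dc)
                    gx += SOBEL_X[dr + 1][dc + 1] * v
                    gy += SOBEL_Y[dr + 1][dc + 1] * v
            # Mark the pixel as an edge when the gradient magnitude is large.
            if hypot(gx, gy) >= threshold:
                mask[r][c] = 1
    return mask


# Toy saliency map: background on the left, salient region on the right.
toy = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
edge = sobel_edge_mask(toy)
```

In this toy case the mask is set only in the two columns that straddle the
object boundary. How such a mask is then injected into the decoder features
is not specified here and is left out of the sketch.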