RGB-D salient object detection (SOD) is typically formulated as a classification or regression problem over two modalities, RGB and depth. Existing RGB-D SOD methods use depth cues to improve detection performance but pay little attention to depth map quality. In practical applications, interference during the acquisition process degrades depth map quality, which sharply reduces detection accuracy. In this paper, to suppress interference in the depth maps and emphasize salient objects in RGB images, we propose a layered interactive attention network (LIANet). The network consists of three essential parts: feature encoding, a layered fusion mechanism, and feature decoding. In the feature encoding stage, a three-dimensional attention weight is applied to the features of each layer without adding network parameters, making it a lightweight module. The layered fusion mechanism is the most critical part of this work: RGB and depth features are alternately interacted and fused layer by layer to enhance the RGB feature information and gradually integrate global context at each scale. In addition, we use a mixed loss to further optimize and train our model. Finally, extensive experiments on six standard datasets demonstrate the effectiveness of the method, which runs at a real-time speed of 30 fps on every dataset.
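The abstract does not spell out how a "three-dimensional weight" can be applied per layer without adding parameters. As a hedged illustration only (not the paper's actual module), the sketch below shows one well-known way to do this: a SimAM-style, parameter-free attention that derives a weight for every channel-height-width position from its deviation from the channel mean. The function name and the regularizer `lam` are assumptions for illustration.

```python
import numpy as np

def parameter_free_3d_attention(x, lam=1e-4):
    """Re-weight a feature map x of shape (C, H, W) with a
    SimAM-style, parameter-free 3D attention weight.

    Each position's weight comes from how much it deviates from its
    channel's mean, so no learnable parameters are introduced.
    """
    c, h, w = x.shape
    n = h * w - 1
    mu = x.mean(axis=(1, 2), keepdims=True)       # per-channel mean
    d = (x - mu) ** 2                             # squared deviation
    var = d.sum(axis=(1, 2), keepdims=True) / n   # per-channel variance
    e_inv = d / (4.0 * (var + lam)) + 0.5         # inverse "energy" score
    gate = 1.0 / (1.0 + np.exp(-e_inv))           # sigmoid gating in (0, 1)
    return x * gate

feat = np.random.randn(64, 32, 32).astype(np.float32)
out = parameter_free_3d_attention(feat)
assert out.shape == feat.shape  # same shape, zero extra parameters
```

Because the gate is a sigmoid, every feature value is scaled by a factor in (0, 1), so distinctive positions are preserved while flat regions are suppressed.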
Owing to the renaissance of deep convolutional neural networks (CNNs), salient object detection based on fully convolutional networks (FCNs) has attracted widespread attention. However, the scale variation of salient objects, complex background features, and fuzzy edges remain a great challenge, all of which are closely tied to how multi-level and multi-scale features are used. At the same time, deep learning methods face heavy computation and memory consumption in practice. To address these problems, the authors propose a salient object detection method based on a residual learning and dense fusion learning framework, named the Residual Dense Collaborative Network (RDCNet). First, the authors design a multi-layer residual learning (MRL) module to extract salient object features in finer detail, making the most of the object's multi-scale and multi-level information. Then, on top of the strong stage-wise convolutional features, they introduce a dilated convolution module (DCM) to obtain a coarse global saliency map. Finally, the accurate saliency map is produced through dense collaborative learning (DCL), with residual learning used for gradual refinement, achieving compact and efficient results. Experimental results show that the method achieves state-of-the-art performance on five widely used datasets (DUTS-TE, HKU-IS, PASCAL-S, ECSSD, DUT-OMRON) without any pre-processing or post-processing. On the ECSSD dataset in particular, the F-measure of RDCNet reaches 95.2%.
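The DCM's exact architecture is not given in the abstract, but its key ingredient, dilated convolution, enlarges the receptive field for global context without adding parameters. As a minimal sketch of that idea (a naive NumPy implementation, not the authors' module; the function name and no-padding/stride-1 choices are assumptions), a dilated kernel samples the input with gaps of `dilation` pixels:

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation=1):
    """Naive single-channel 2D dilated convolution (no padding, stride 1).

    Spacing the kernel taps `dilation` pixels apart makes a k x k kernel
    cover an ((k-1)*dilation + 1)^2 area, enlarging the receptive field
    without any extra parameters.
    """
    kh, kw = kernel.shape
    eff_h = (kh - 1) * dilation + 1   # effective kernel height
    eff_w = (kw - 1) * dilation + 1   # effective kernel width
    oh = x.shape[0] - eff_h + 1
    ow = x.shape[1] - eff_w + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # strided slice picks the dilated taps from the input window
            patch = x[i:i + eff_h:dilation, j:j + eff_w:dilation]
            out[i, j] = (patch * kernel).sum()
    return out

x = np.arange(49, dtype=float).reshape(7, 7)
k = np.ones((3, 3))
# dilation=2 makes the 3x3 kernel cover a 5x5 area, so the output is 3x3
assert dilated_conv2d(x, k, dilation=2).shape == (3, 3)
```

Stacking such layers with increasing dilation rates is a common way to aggregate global context for a coarse saliency estimate.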