2018
DOI: 10.1109/tcyb.2017.2761775
CNNs-Based RGB-D Saliency Detection via Cross-View Transfer and Multiview Fusion

Abstract: Salient object detection from RGB-D images aims to utilize both the depth view and the RGB view to automatically localize objects of human interest in the scene. Although a few earlier efforts have been devoted to this problem in recent years, two major challenges still remain: 1) how to leverage the depth view effectively to model depth-induced saliency and 2) how to implement an optimal combination of the RGB view and the depth view that makes full use of the complementary information between them. To …
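The abstract outlines a two-stream design: separate CNN streams for the RGB and depth views, a cross-view transfer step that lets the depth stream benefit from what was learned on RGB, and a multiview fusion stage that combines the two. The sketch below, assuming PyTorch, illustrates that shape; the layer sizes, the module names (make_stream, RGBDSaliencyNet), and the warm-start reading of cross-view transfer are assumptions for illustration, not the authors' released architecture.

```python
import torch
import torch.nn as nn

def make_stream(in_ch):
    # A small convolutional encoder; one stream per view.
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(inplace=True),
    )

class RGBDSaliencyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.rgb_stream = make_stream(3)
        # Depth is replicated to 3 channels so the depth stream shares
        # the RGB stream's architecture (enabling cross-view transfer).
        self.depth_stream = make_stream(3)
        # Multiview fusion head: combine both views' feature maps and
        # predict a per-pixel saliency probability.
        self.fusion = nn.Sequential(
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 1), nn.Sigmoid(),
        )

    def forward(self, rgb, depth):
        f_rgb = self.rgb_stream(rgb)
        f_depth = self.depth_stream(depth.expand(-1, 3, -1, -1))
        return self.fusion(torch.cat([f_rgb, f_depth], dim=1))

net = RGBDSaliencyNet()
# Cross-view transfer, sketched here as a warm start: initialize the
# depth stream from the RGB stream's weights before fine-tuning on depth.
net.depth_stream.load_state_dict(net.rgb_stream.state_dict())
out = net(torch.randn(1, 3, 224, 224), torch.randn(1, 1, 224, 224))
print(out.shape)  # torch.Size([1, 1, 224, 224])
```

Replicating the single depth channel to three channels is one common way to keep the two streams weight-compatible, so that weights learned on the RGB view can seed the depth view.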

Cited by 354 publications (257 citation statements)
References 41 publications
“…Table II and Fig. 4 show that all deep learning based approaches outperform traditional methods by a great margin, and end-to-end frameworks, including PCA [19] and our approach, are superior to multi-stage methods such as CTMF [17] and MPCI [18]. Moreover, benefiting from our fusion scheme and edge-preserving loss, the proposed method consistently improves the F-measure and MAE achieved by PCA on all three datasets, especially on NLPR, where accurate depth data are collected by Kinect.…”
Section: Comparison With the State-of-the-Arts
Confidence: 90%
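The quote evaluates with F-measure and MAE, the two standard saliency metrics. Below is a minimal sketch of both, assuming NumPy, a predicted saliency map S in [0, 1], and a binary ground-truth mask G; the adaptive threshold (twice the mean saliency) and beta² = 0.3 are common conventions in this literature, not details taken from the cited paper.

```python
import numpy as np

def mae(S, G):
    # Mean absolute error between the saliency map and ground truth.
    return np.mean(np.abs(S - G))

def f_measure(S, G, beta2=0.3):
    # Binarize S with an adaptive threshold (twice the mean value,
    # a common choice), then compute precision and recall.
    thresh = min(2 * S.mean(), 1.0)
    B = (S >= thresh).astype(float)
    tp = (B * G).sum()
    precision = tp / (B.sum() + 1e-8)
    recall = tp / (G.sum() + 1e-8)
    # Weighted harmonic mean of precision and recall.
    return (1 + beta2) * precision * recall / (beta2 * precision + recall + 1e-8)

S = np.random.rand(224, 224)            # toy prediction
G = (np.random.rand(224, 224) > 0.5).astype(float)  # toy mask
print(mae(S, G), f_measure(S, G))
```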
“…For a fair comparison with state-of-the-art methods, we utilize the same data split as in [17]. The training set contains 1400 samples from the NJUD dataset and 650 samples from NLPR.…”
Section: A. Datasets
Confidence: 99%
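A minimal sketch of assembling that training split, assuming a hypothetical on-disk layout and the helper list_pairs (not from the cited work); only the sample counts, 1400 from NJUD and 650 from NLPR, come from the quoted text.

```python
from pathlib import Path

def list_pairs(root):
    # Hypothetical layout: one RGB image per sample under <root>/;
    # depth maps and masks are assumed to sit alongside.
    return sorted(Path(root).glob("*.jpg"))

# The split described in the quote: 1400 NJUD + 650 NLPR samples.
train_set = list_pairs("NJUD")[:1400] + list_pairs("NLPR")[:650]
print(len(train_set))  # 2050 once both datasets are in place
```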
“…For the experiments on the Caltech-101 dataset, the ℓ2-normalization procedures (step (5) in the training phase and steps (3) and (6) in the testing phase of Algorithm 1) are adopted. For the experiments on the ILSVRC2012 dataset, these procedures are skipped, since the original softmax classifier of the base CNN is not trained on the ℓ2-normalized DFVs.…”
Section: Methods
Confidence: 99%
“…The critical goal of video synchronization is to establish temporal correspondences among the frames of two input videos, i.e., a reference video and a video to be synchronized. Applications of video synchronization cover a wide range of video analysis tasks [2][3][4][5][6][7][8], such as video surveillance, target identification, human action recognition, and saliency detection and fusion.…”
Section: Introduction
Confidence: 99%
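As a toy illustration of "temporal correspondences", the sketch below matches each reference frame to its nearest target frame by per-frame feature distance, assuming NumPy; real synchronization methods add temporal ordering constraints (e.g., dynamic time warping), which are omitted here.

```python
import numpy as np

def match_frames(ref_feats, tgt_feats):
    # ref_feats: (N, D), tgt_feats: (M, D) per-frame descriptors.
    # For each reference frame, pick the closest target frame.
    d = np.linalg.norm(ref_feats[:, None, :] - tgt_feats[None, :, :], axis=2)
    return d.argmin(axis=1)  # matched target index per reference frame

ref = np.random.rand(10, 128)   # toy per-frame features
tgt = np.random.rand(12, 128)
print(match_frames(ref, tgt))   # 10 correspondence indices
```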