2015 IEEE International Conference on Computer Vision (ICCV)
DOI: 10.1109/iccv.2015.38
SALICON: Reducing the Semantic Gap in Saliency Prediction by Adapting Deep Neural Networks

Cited by 510 publications (437 citation statements)
References 29 publications
“…Saliency models have been developed for visual attention modeling [9], [26], [27], [28] and salient object detection [29], [17], [30]. The former task aims to predict human fixation locations on natural images, while the proposed method aims to compute the pixel-wise saliency values for capturing the regions of the salient objects.…”
Section: A. Saliency Detection Methods (mentioning)
confidence: 99%
“…The saliency loss L_S is defined as the pixel-level content loss between Â and A, which is used to train the generator of the attention maps. Two loss functions were experimented with in the models shown in Table 1: M-SEN MSE uses the mean squared error (MSE) loss, a baseline loss, as it has been used in many visual saliency prediction works [11]; M-SEN BCE uses binary cross-entropy (BCE) loss, which is mathematically equivalent to Kullback-Leibler divergence, arguably the best metric to measure saliency prediction performance [12]. For the classification task, cross-entropy loss was used as L_C, the same as in [5].…”
Section: Methods (mentioning)
confidence: 99%
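To make the two loss variants in the excerpt above concrete, here is a minimal NumPy sketch of per-pixel MSE and BCE saliency losses; the function names, array shapes, and per-pixel formulation are illustrative assumptions, not taken from the cited paper.

```python
import numpy as np

def mse_saliency_loss(pred, target):
    """Baseline mean squared error between a predicted attention map and its ground truth."""
    return np.mean((pred - target) ** 2)

def bce_saliency_loss(pred, target, eps=1e-7):
    """Pixel-wise binary cross-entropy; pred is clipped to (0, 1) to avoid log(0)."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return -np.mean(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))

# Toy usage with random maps of matching spatial size.
pred = np.random.rand(64, 64)
target = (np.random.rand(64, 64) > 0.5).astype(np.float64)
print(mse_saliency_loss(pred, target), bce_saliency_loss(pred, target))
```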
“…We constructed a ground truth saliency mask G, which consisted of binary values corresponding to the size of glimpse g_t, with elements in G equal to one at locations where the bounding box overlaps with g_t and equal to zero otherwise. The Kullback–Leibler (KL) divergence was utilized to update the parameters of L_sal, since the KL divergence was shown to be more effective than other metrics in training networks to predict saliency. Regarding the saliency map as a probability distribution of saliency, we can compute the KL divergence L_sal between the predicted saliency mask γ_t and the ground truth distribution G as follows: $L_{\text{sal}} = \boldsymbol{G} \log \frac{\boldsymbol{G}}{\boldsymbol{\gamma}_t}$, where L_sal is to be minimized.…”
Section: Methods (mentioning)
confidence: 99%
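A minimal sketch of the KL-divergence saliency loss described in the excerpt above: the names gamma_t and G mirror the excerpt's symbols, while the normalization step and the epsilon guard are assumptions added for numerical stability.

```python
import numpy as np

def kl_saliency_loss(gamma_t, G, eps=1e-7):
    """KL divergence sum(G * log(G / gamma_t)) between the ground-truth mask G
    and the predicted saliency mask gamma_t, after normalizing both maps so
    they can be read as probability distributions over pixels."""
    G = G / (G.sum() + eps)
    gamma_t = gamma_t / (gamma_t.sum() + eps)
    return np.sum(G * np.log((G + eps) / (gamma_t + eps)))

# Toy usage: a binary bounding-box mask versus a random predicted glimpse map.
G = np.zeros((32, 32))
G[8:20, 10:24] = 1.0
gamma_t = np.random.rand(32, 32)
print(kl_saliency_loss(gamma_t, G))
```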