2020
DOI: 10.1109/tmm.2019.2947352
A Dilated Inception Network for Visual Saliency Prediction

Abstract: Recently, with the advent of deep convolutional neural networks (DCNNs), the improvements in visual saliency prediction research have been impressive. One possible direction for the next improvement is to fully characterize the multi-scale saliency-influential factors with a computationally friendly module in DCNN architectures. In this work, we propose an end-to-end dilated inception network (DINet) for visual saliency prediction. It captures multi-scale contextual features effectively with very limited extr…


Cited by 122 publications (97 citation statements)
References 59 publications
“…We firstly clarify two similar concepts, attention model and attentionbased deep learning model. Attention models are a class of models that aim to predict task-free saliency, including human fixation prediction [8,12,17,33,40] and salient object detection [13,34,36,45,47]. While attention-based deep learning models are deep learning models that aim to enhance their representational power by predicting weights of intermediate features in a task-specific context.…”
Section: Related Work 2.1 Attention-Based Deep Learning Model (citation type: mentioning)
confidence: 99%
“…The attention weights are projected to the corresponding bounded regions in ascending order so that low-attention regions will be covered by high-attention regions when there is any overlapping. The fixation maps are generated with Yang et al's work [40]. Figure 6: IAA-driven object-level attention maps (labels are omitted) that do not fully align with human attention.…”
Section: Further Analysis (citation type: mentioning)
confidence: 99%
“…Liu et al 40 planned a multibranch residual module with dilated convolutions to extract multiscale features so that the classification and identification of spacecraft electronic load signals can be solved. Yang et al 41 designed an end-to-end dilated inception network (DINet) to predict visual saliency maps. The dilated inception module of the DINet used dilated convolutions with different dilation rates in parallel, which not only can significantly reduce the computational load but also can enrich the diversity of the receptive field in the features.…”
Section: Dilated Convolution (citation type: mentioning)
confidence: 99%
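The citation above describes the core idea of DINet's dilated inception module: several dilated convolutions with different dilation rates run in parallel, enlarging and diversifying the receptive field without extra parameters. A minimal single-channel NumPy sketch of that pattern is below; the function names, the 3×3 kernels, and the specific dilation rates (1, 2, 3) are illustrative assumptions, not the DINet implementation.

```python
import numpy as np

def dilated_conv2d(x, kernel, rate):
    """'Same'-padded single-channel 2-D convolution with a dilation rate.

    The kernel taps are spaced `rate` pixels apart, so a 3x3 kernel with
    rate r covers a (2r+1) x (2r+1) receptive field at no extra cost.
    """
    k = kernel.shape[0]
    pad = rate * (k - 1) // 2          # keep the output the same size as x
    xp = np.pad(x, pad)
    h, w = x.shape
    out = np.zeros((h, w), dtype=float)
    span = rate * (k - 1) + 1          # extent of the dilated kernel
    for i in range(h):
        for j in range(w):
            patch = xp[i:i + span:rate, j:j + span:rate]  # dilated sampling
            out[i, j] = (patch * kernel).sum()
    return out

def dilated_inception(x, kernels, rates=(1, 2, 3)):
    """Parallel dilated convolutions with different rates, stacked channel-wise."""
    return np.stack([dilated_conv2d(x, k, r) for k, r in zip(kernels, rates)])
```

Because the branches share the input and differ only in dilation rate, the module enriches receptive-field diversity while the parameter count stays that of a few small kernels, which matches the computational argument quoted above.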
“…Srinivas et al [36] devised a fully CNN for predicting human eye fixation. Yang et al [37] proposed a dilated inception network for visual saliency prediction. Lv et al [38] developed an attention-based fusion network for human eye-fixation prediction in 3-D images.…”
Section: Introduction (citation type: mentioning)
confidence: 99%