“…Many CNN architectures have been proposed in the area of image classification [25,35,12,13,49], where a deep CNN model is able to achieve a higher accuracy for classification 1 . However, CNN models used in state-ofthe-art saliency applications are relatively shallow, such as VGGNet-16 [27,41,8,16,26,37,31] or ResNet-50 [29,9]. In the work of [14], the deeper model, GoogleNet, did not achieve better performance due to the limited training set.…”