The memorability of media content such as images and videos has recently become an important research subject in computer vision. This paper presents our computational model for predicting image memorability, which is based on a deep learning architecture designed for a classification task. We exploit both convolutional neural network (CNN)-based visual features and semantic features related to image captioning for the task. We train and test our model on LaMem, the large-scale benchmark dataset for memorability. Experimental results show that the proposed computational model achieves better prediction performance than the state of the art, and even outperforms human consistency. We further investigate how well our model generalizes to other memorability datasets. Finally, by validating the model on interestingness datasets, we reconfirm the lack of correlation between the memorability and the interestingness of images.
Index Terms: Image memorability, computational model, deep learning, interestingness, image captioning
Interestingness quantifies the ability of an image to induce interest in a user. Because the definition and interpretation of interestingness remain unclear in the literature, we introduce in this paper two new notions, intra- and inter-interestingness, and investigate them through a novel set of dedicated experiments. More specifically, we propose four experimental protocols: (1) object ranking with a pre-defined word list, (2) pair-wise comparison, (3) image ranking and (4) eye-tracking. We take advantage of running all experiments on the same dataset to draw potential links between the collected data and to assess the agreement between subjects. While we find no evidence of a relationship between the local (intra) and global (inter) notions of interestingness, we do observe correlated outputs across the different protocols. Although the inter-rater agreement metrics yield only low or moderate values, we point to the reproducibility of the experiments to argue for the universal nature of the interestingness notions. In addition, we provide detailed insights into the relationships between interestingness and 7 other criteria, some of which have already been linked with interestingness in the literature. Unusualness and emotion seem to be the strongest enablers of interestingness. These insights are highly relevant for future work on modeling.
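As an illustration of how outputs from two such protocols could be compared, the sketch below turns pair-wise comparison votes into a ranking and correlates it with a directly collected image ranking. It is only a minimal sketch under stated assumptions: the win-count aggregation and the Spearman rank correlation are chosen here for illustration and are not stated in the abstract, and the names `ranking_from_pairs` and `spearman_rho` are hypothetical.

```python
from collections import Counter


def ranking_from_pairs(pairs):
    """Order items by number of pair-wise wins (most wins first).

    `pairs` is an iterable of (winner, loser) tuples. Ties are broken
    alphabetically. Win counting is an assumed aggregation rule.
    """
    wins = Counter()
    for winner, loser in pairs:
        wins[winner] += 1
        wins[loser] += 0  # register items that never win
    return sorted(wins, key=lambda item: (-wins[item], item))


def spearman_rho(order_a, order_b):
    """Spearman rank correlation between two tie-free orderings of the same items."""
    if set(order_a) != set(order_b) or len(order_a) < 2:
        raise ValueError("orderings must contain the same items (at least two)")
    rank_b = {item: rank for rank, item in enumerate(order_b)}
    n = len(order_a)
    # Sum of squared rank differences between the two orderings.
    d2 = sum((rank - rank_b[item]) ** 2 for rank, item in enumerate(order_a))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical votes: each tuple is (preferred image, other image).
votes = [("img_a", "img_b"), ("img_a", "img_c"), ("img_b", "img_c")]
pairwise_order = ranking_from_pairs(votes)
direct_order = ["img_a", "img_b", "img_c"]  # hypothetical image-ranking result
rho = spearman_rho(pairwise_order, direct_order)
```

A value of `rho` close to 1 would indicate that the two protocols order the images similarly; real data with ties would need a tie-aware rank correlation instead of this closed-form formula.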
Recent developments in video coding research involve the use of hierarchical and/or adaptive meshes for video representation. At the same time, transmitted bit rates have to be reduced to fit the available network bandwidth. Some previous works address adaptive node sampling according to image content. However, the adaptive hierarchical approaches proposed so far do not optimize a trade-off between distortion and bit rate: the coding cost of the representation is often reported but not taken into account as a constraint. In contrast to these methods, this paper proposes an adaptive hierarchical mesh-based representation whose splitting criterion optimizes both the coding cost and the image rendering. Node value optimization, adaptive quantization, a low-cost coding tree and a wavelet approach are presented alongside it. To illustrate the proposed methods, experimental results are shown and compared with the JPEG image coding format.
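A splitting criterion that weighs coding cost against rendering quality can be sketched, for illustration, as a Lagrangian rate-distortion test: a mesh cell is split only if the combined cost of its children is lower than the cost of keeping it whole. This is a generic formulation assumed here, not necessarily the criterion used in the paper; the cost J = D + lambda * R, the parameter `lam`, and the function names are all hypothetical.

```python
def lagrangian_cost(distortion, rate_bits, lam):
    """Combined cost J = D + lam * R (assumed form of the trade-off)."""
    return distortion + lam * rate_bits


def should_split(parent, children, lam):
    """Decide whether splitting a mesh cell lowers the rate-distortion cost.

    `parent` is a (distortion, rate_bits) tuple for the unsplit cell and
    `children` is a list of such tuples, one per sub-cell after splitting.
    """
    cost_parent = lagrangian_cost(parent[0], parent[1], lam)
    cost_children = sum(lagrangian_cost(d, r, lam) for d, r in children)
    return cost_children < cost_parent


# Hypothetical numbers: splitting cuts distortion but quadruples the bits.
parent_cell = (100.0, 8)        # (distortion, bits) if left unsplit
child_cells = [(10.0, 8)] * 4   # four sub-cells after a split

split_when_bits_are_cheap = should_split(parent_cell, child_cells, lam=1.0)
split_when_bits_are_costly = should_split(parent_cell, child_cells, lam=5.0)
```

With these example numbers the split pays off only while `lam` stays below 2.5, which shows how a larger weight on the bit rate yields a coarser mesh.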