Abstract: Top-down image semantics play a major role in predicting where people look in images. Present state-of-the-art approaches to modeling human visual attention incorporate high-level object detections, signifying top-down image semantics, in a separate channel alongside other bottom-up saliency channels. However, multiple objects in a scene compete to attract our attention, and this interaction is ignored in current models. To overcome this limitation, we propose a novel object context based visual attention mod…
“…Moreover, investing in this project within saliency detection would be a good opportunity to merge some of the Group’s research on both low-level segmentation and high-level face detection. The idea to combine high-level face detection with low-level saliency detection has already been proposed in image-processing papers (Borji, 2012; Karthikeyan et al, 2013). But the Group’s ambition here is to go further in the saliency direction as framed by Wang and Li (2008), after Liu et al (2007), by proposing an algorithm capable of detecting and segmenting the contours of faces.…”
Section: Reformulating the Saliency Problem
mentioning
This article documents the practical efforts of a group of scientists designing an image-processing algorithm for saliency detection. By following the actors of this computer science project, the article shows that the problems often considered to be the starting points of computational models are in fact provisional results of time-consuming, collective and highly material processes that engage habits, desires, skills and values. In the project being studied, problematization processes lead to the constitution of referential databases called ‘ground truths’ that enable both the effective shaping of algorithms and the evaluation of their performances. Working as important common touchstones for research communities in image processing, the ground truths are inherited from prior problematization processes and may be imparted to subsequent ones. The ethnographic results of this study suggest two complementary analytical perspectives on algorithms: (1) an ‘axiomatic’ perspective that understands algorithms as sets of instructions designed to solve given problems computationally in the best possible way, and (2) a ‘problem-oriented’ perspective that understands algorithms as sets of instructions designed to computationally retrieve outputs designed and designated during specific problematization processes. If the axiomatic perspective on algorithms puts the emphasis on the numerical transformations of inputs into outputs, the problem-oriented perspective puts the emphasis on the definition of both inputs and outputs.
“…Note that the ground truth in our dataset has multi-level values. For machine-learning-based methods, such as Judd et al [15], Borji et al [32] and SC [33], we train a new model by using our dataset and adopt linear regression with 10-fold cross-validation.…”
Section: Results
mentioning
confidence: 99%
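The evaluation protocol quoted above (linear regression with 10-fold cross-validation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the cited authors' code: the function name `kfold_linear_regression`, the three synthetic feature channels, and the synthetic multi-level targets are all hypothetical stand-ins, and NumPy is assumed to be available.

```python
import numpy as np

def kfold_linear_regression(X, y, k=10, seed=0):
    """Fit a least-squares linear model on k-1 folds and score it on the held-out fold.

    Returns (mean held-out MSE, list of per-fold MSEs).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]

    # Shuffle the sample indices once, then cut them into k nearly equal folds.
    order = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(order, k)

    # Append a constant column so the model learns an intercept.
    Xb = np.hstack([X, np.ones((n, 1))])

    fold_mse = []
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        # Ordinary least squares on the training folds only.
        w, *_ = np.linalg.lstsq(Xb[train_mask], y[train_mask], rcond=None)
        pred = Xb[test_idx] @ w
        fold_mse.append(float(np.mean((pred - y[test_idx]) ** 2)))
    return float(np.mean(fold_mse)), fold_mse

# Hypothetical data: 200 samples with 3 feature channels and a noise-free
# linear target standing in for multi-level ground-truth values.
rng = np.random.default_rng(1)
features = rng.random((200, 3))
targets = features @ np.array([0.5, 0.3, 0.2]) + 0.1

mean_mse, per_fold = kfold_linear_regression(features, targets, k=10)
```

Each sample is held out exactly once, so the mean of the per-fold errors estimates how the trained weights generalize to unseen images.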
“…8c. We also replace the face-detection feature, used in SC [33] and Borji et al [32], with our face-importance map, and we train the model again using regression and a 10-fold cross-validation. As illustrated in Fig.…”
Section: The Effectiveness Of the Face-importance Map
mentioning
In this work we study the varying importance of faces in images. Face importance is found to be affected by the size and number of faces present. We collected a dataset of 152 face images in which the size and number of faces vary. We conducted a crowdsourcing experiment where we asked people to label the important regions of the images. Analyzing the results from the experiment, we propose a simple face-importance model, which is a 2D Gaussian function, to quantitatively represent the influence of the size and number of faces on the perceived importance of faces. The face-importance model is then tested for the application of salient-object detection. For this application, we create a new salient-objects dataset, consisting of both face images and non-face images, and we again collect the ground truth through crowdsourcing. We demonstrate that our face-importance model helps us to better locate the important, and thus salient, objects in the images and outperforms state-of-the-art salient-object detection algorithms.
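The abstract above describes the face-importance model only as a 2D Gaussian over the size and number of faces. A minimal sketch of such a function is given below. Everything beyond that description is an assumption made for illustration: the function name `face_importance`, the axis-aligned (uncorrelated) form, the use of relative face size as the first variable, and all default parameter values are hypothetical, not the values fitted in the paper.

```python
import math

def face_importance(size, count,
                    mu_size=0.3, mu_count=1.0,
                    sigma_size=0.2, sigma_count=2.0):
    """Unnormalized, axis-aligned 2D Gaussian over (face size, number of faces).

    All default parameters are hypothetical placeholders. The value is 1.0 at
    (mu_size, mu_count) and decays as either variable moves away from its mean.
    """
    z_size = (size - mu_size) / sigma_size
    z_count = (count - mu_count) / sigma_count
    return math.exp(-0.5 * (z_size ** 2 + z_count ** 2))

# Under these placeholder parameters, a single face at the preferred size
# scores highest, and the score drops as more faces share the image.
single_face = face_importance(0.3, 1)
five_faces = face_importance(0.3, 5)
```

In a real setting the means and standard deviations would be fitted to the crowdsourced importance labels rather than chosen by hand.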
“…Human-inspired visual attention modeling [21,17,14,22,6,7,8,24] has been a well-researched topic for over a decade. Recently there has been significant interest [30,36,43,33,42,29,25,31,44,45,46,38] in eye-tracking-enhanced computer vision.…”