SIFT Meets CNN: A Decade Survey of Instance Retrieval

Zheng, Liang; Yang, Yi; Tian, Qi

doi:10.1109/tpami.2017.2709749

Cited by 615 publications

(323 citation statements)

References 149 publications

Mentioning: 314

“…Instead of designing visual features manually, Convolutional Neural Network (CNN) can automatically learn deep representations of images [12]. Several researchers have also applied CNN to image sentiment classification [13]-[16] and demonstrated the superior performance of the deep features against hand-tuned features for sentiment classification.…”

Section: Introduction (mentioning)

confidence: 99%

Visual Sentiment Prediction Based on Automatic Discovery of Affective Regions

Yang, She, Sun, et al. 2018. IEEE Trans. Multimedia

145


Abstract: Automatic assessment of sentiment from visual content has gained considerable attention with the increasing tendency of expressing opinions via images and videos online. This paper investigates the problem of visual sentiment analysis, which involves a high-level abstraction in the recognition process. While most of the current methods focus on improving holistic representations, we aim to utilize the local information, which is inspired by the observation that both the whole image and local regions convey significant sentiment information. We propose a framework to leverage affective regions, where we first use an off-the-shelf objectness tool to generate the candidates, and employ a candidate selection method to remove redundant and noisy proposals. Then a convolutional neural network (CNN) is connected with each candidate to compute the sentiment scores, and the affective regions are automatically discovered, taking the objectness score as well as the sentiment score into consideration. Finally, the CNN outputs from local regions are aggregated with the whole images to produce the final predictions. Our framework only requires image-level labels, thereby significantly reducing the annotation burden otherwise required for training. This is especially important for sentiment analysis as sentiment can be abstract, and labeling affective regions is too subjective and labor-consuming. Extensive experiments show that the proposed algorithm outperforms the state-of-the-art approaches on eight popular benchmark datasets.


“…ImageNet, are used as feature extractors by feedforwarding the image of interest, and gathering the activations at different layers of the network [13], [44], [53], [48], [39], [55]. The penultimate activations before softmax classifier have been reported as good baselines for transferring knowledge in several vision tasks [13], [44].…”

Section: B Methodology (mentioning)

confidence: 99%

Maya Codical Glyph Segmentation: A Crowdsourcing Approach

Can, Odobez, Gática-Pérez 2018. IEEE Trans. Multimedia


Abstract: This paper focuses on the crowd-annotation of an ancient Maya glyph dataset derived from the three ancient codices that survived up to date. More precisely, non-expert annotators are asked to segment glyph-blocks into their constituent glyph entities. As a means of supervision, available glyph variants are provided to the annotators during the crowdsourcing task. Compared to object recognition in natural images or handwriting transcription tasks, designing an engaging task and dealing with crowd behavior is challenging in our case. This challenge originates from the inherent complexity of Maya writing and an incomplete understanding of the signs and semantics in the existing catalogs. We elaborate on the evolution of the crowdsourcing task design, and discuss the choices for providing supervision during the task. We analyze the distributions of similarity and task difficulty scores, and the segmentation performance of the crowd. A unique dataset of over 9000 Maya glyphs from 291 categories individually segmented from the three codices was created and will be made publicly available thanks to this process. This dataset lends itself to automatic glyph classification tasks. We provide baseline methods for glyph classification using traditional shape descriptors and convolutional neural networks.


“…Oxford 5k dataset is used for image retrieval [38]. This dataset contains 5062 images, denoted as I = {a_1, a_2, .…”

Section: A Datasets (mentioning)

confidence: 99%

Performance Evaluation of SIFT and Convolutional Neural Network for Image Retrieval

Sachdeva, Baber, Bakhtyar, et al. 2017. IJACSA


Abstract: Convolutional Neural Network (CNN) has gained a lot of attention from researchers due to its high accuracy in classification and feature learning. In this paper, we evaluate the performance of CNN features for image retrieval against the gold-standard feature, SIFT. Experiments are conducted on the well-known Oxford 5k dataset. The mAP of SIFT and CNN is 0.6279 and 0.5284, respectively. The performance of CNN is also compared with the bag of visual words (BoVW) model. CNN achieves better accuracy than BoVW.
