2012
DOI: 10.1109/tpami.2011.235

Aggregating Local Image Descriptors into Compact Codes

Abstract: This paper addresses the problem of large-scale image search. Three constraints have to be taken into account: search accuracy, efficiency, and memory usage. We first present and evaluate different ways of aggregating local image descriptors into a vector and show that the Fisher kernel achieves better performance than the reference bag-of-visual words approach for any given vector dimension. We then jointly optimize dimensionality reduction and indexing in order to obtain a precise vector comparison as well a…

Cited by 1,427 publications (1,243 citation statements)
References 33 publications
“…Several methods have been proposed to compress the image descriptors and facilitate fast matching. [8][9][10][11][12] These methods, based on machine learning algorithms, use some form of classical or modern training-based techniques such as spectral hashing, Principal Component Analysis (PCA) or Linear Discriminant Analysis (LDA) to generate compact descriptors from the image descriptors such as SIFT or GIST. As mentioned above, while training-based methods can achieve accurate image retrieval, they are unsuited to applications where the database and the image can keep changing, necessitating repeated expensive training as new landmarks, products, etc.…”
Section: Related Work (mentioning)
confidence: 99%
“…We believe that, as such a system will need to distinguish between fine-grained words, it will require far more than the 2000 training samples available in the IIIT-5K set. Since we use a retrieval-based approach for text recognition, there is abundant literature on large-scale retrieval that can be leveraged for this task, for example on compressing histogram descriptors [5].…”
Section: Results (mentioning)
confidence: 99%
“…These patch statistics are then aggregated at an image level. We choose to compute the patch statistics using the Fisher Vector (FV) principle [19], since it obtained state-of-the-art results in image retrieval [5] and classification [2]. We assume that we have a generative model of patches (a Gaussian Mixture Model in our case) and measure the gradient of the log-likelihood of the descriptor with respect to the model parameters.…”
Section: Image Embedding (mentioning)
confidence: 99%
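
The statement above describes the Fisher Vector as the gradient of the patch log-likelihood with respect to the parameters of a Gaussian Mixture Model. A minimal sketch of that idea follows, under several assumptions not stated in the quoted text: the GMM has diagonal covariances, only the gradient with respect to the component means is kept, and the commonly used 1/(N·sqrt(w_k)) normalization is applied. The function name `fisher_vector_means` and all parameter names are hypothetical, and the GMM is assumed to have been fitted beforehand.

```python
import math


def fisher_vector_means(patches, weights, means, sigmas):
    """Mean-gradient part of a Fisher Vector (illustrative sketch only).

    patches: non-empty list of D-dimensional patch descriptors
    weights: K mixture weights of a pre-trained GMM (assumed given)
    means, sigmas: K lists of D per-dimension means and standard deviations
    """
    k, d = len(means), len(means[0])
    grad = [[0.0] * d for _ in range(k)]
    for x in patches:
        # Log of w_k * N(x | mu_k, diag(sigma_k^2)) for each component.
        log_p = []
        for w, mu, sg in zip(weights, means, sigmas):
            ll = math.log(w)
            for xj, mj, sj in zip(x, mu, sg):
                ll += -0.5 * math.log(2 * math.pi * sj * sj)
                ll += -0.5 * ((xj - mj) / sj) ** 2
            log_p.append(ll)
        # Posterior (soft assignment) of each component, computed stably.
        top = max(log_p)
        unnorm = [math.exp(l - top) for l in log_p]
        total = sum(unnorm)
        for i in range(k):
            gamma = unnorm[i] / total
            for j in range(d):
                grad[i][j] += gamma * (x[j] - means[i][j]) / sigmas[i][j]
    n = len(patches)
    # Flatten to a K*D vector with the 1 / (N * sqrt(w_k)) normalization.
    return [grad[i][j] / (n * math.sqrt(weights[i]))
            for i in range(k) for j in range(d)]
```

With a single unit-variance component centred at the origin, the result reduces to the mean of the patches, which is a quick sanity check on the soft-assignment and normalization steps.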
“…The image category is estimated via the majority voting on the decision of each region classifier. The region level representation is obtained by densely extracting the SURF descriptor over the region and then using the VLAD encoding defined in Jegou et al. (2012).…”
Section: Methods Facing Task (mentioning)
confidence: 99%
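
The VLAD encoding mentioned in the statement above can be sketched roughly as follows. This is an illustrative version under stated assumptions, not the citing authors' pipeline: descriptors are hard-assigned to the nearest centroid of a codebook that is assumed to be learned beforehand (for example with k-means), residuals are summed per centroid, and the concatenated vector is L2-normalized. The name `vlad_encode` and its parameters are hypothetical.

```python
import math


def vlad_encode(descriptors, centroids):
    """Aggregate local descriptors into one VLAD vector (illustrative sketch).

    descriptors: list of D-dimensional local descriptors (e.g. SURF)
    centroids: K codebook centroids, assumed learned beforehand
    """
    k, d = len(centroids), len(centroids[0])
    residuals = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Hard-assign the descriptor to its nearest centroid.
        nearest = min(
            range(k),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centroids[i])),
        )
        # Accumulate the residual between descriptor and centroid.
        for j in range(d):
            residuals[nearest][j] += x[j] - centroids[nearest][j]
    # Concatenate the K residual sums into one K*D vector.
    flat = [v for row in residuals for v in row]
    # L2-normalize; leave an all-zero vector unchanged.
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat
```

The output has a fixed length of K·D regardless of how many descriptors a region contains, which is what makes it usable as a region-level representation for a classifier.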