Aggregating Local Image Descriptors into Compact Codes

Jégou, Hervé; Perronnin, Florent; Douze, Matthijs; Sánchez, Jorge M. Balestena; Pérez, Patrick; Schmid, Cordelia

doi:10.1109/tpami.2011.235

Cited by 1,427 publications

(1,243 citation statements)

References 33 publications

Supporting

Mentioning

1,186

Contrasting

Unclassified


“…Several methods have been proposed to compress the image descriptors and facilitate fast matching. [8][9][10][11][12] These methods, based on machine learning algorithms, use some form of classical or modern training-based techniques such as spectral hashing, Principal Component Analysis (PCA) or Linear Discriminant Analysis (LDA) to generate compact descriptors from the image descriptors such as SIFT or GIST. As mentioned above, while training-based methods can achieve accurate image retrieval, they are unsuited in applications where the database and the image can keep changing, necessitating repeated expensive training as new landmarks, products, etc.…”

Section: Related Work (mentioning)

confidence: 99%

Quantized embeddings: an efficient and universal nearest neighbor method for cloud-based image retrieval

2013


SPIE Conference on Applications of Digital Image Processing XXXVI

We propose a rate-efficient, feature-agnostic approach for encoding image features for cloud-based nearest neighbor search. We extract quantized random projections of the image features under consideration, transmit these to the cloud server, and perform matching in the space of the quantized projections. The advantage of this approach is that, once the underlying feature extraction algorithm is chosen for maximum discriminability and retrieval performance (e.g., SIFT, or eigen-features), the random projections guarantee a rate-efficient representation and fast server-based matching with negligible loss in accuracy. Using the Johnson-Lindenstrauss Lemma, we show that pair-wise distances between the underlying feature vectors are preserved in the corresponding quantized embeddings. We report experimental results of image retrieval on two image databases with different feature spaces; one using SIFT features and one using face features extracted using a variant of the Viola-Jones face recognition algorithm. For both feature spaces, quantized embeddings enable accurate image retrieval combined with improved bit-rate efficiency and speed of matching, when compared with the underlying featur...
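The pipeline described in this abstract (random projection, quantization, matching in the quantized space) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the Gaussian projection matrix, the plain uniform scalar quantizer with step `step`, the dimensions, and the synthetic features. The cited paper's actual quantizer and parameters may differ.

```python
import numpy as np

def quantized_embedding(features, projection, step=0.5):
    """Project features with a random matrix, then quantize uniformly.

    `step` is a hypothetical quantizer step size chosen for illustration.
    """
    projected = features @ projection
    return np.floor(projected / step).astype(np.int32)

rng = np.random.default_rng(0)
dim, n_proj = 128, 64  # hypothetical feature and embedding dimensions

# Gaussian projection scaled so squared distances are preserved in expectation.
projection = rng.normal(size=(dim, n_proj)) / np.sqrt(n_proj)

# Synthetic stand-ins for image features; the query is a noisy copy of item 17.
database = rng.normal(size=(200, dim))
query = database[17] + 0.01 * rng.normal(size=dim)

db_codes = quantized_embedding(database, projection)
q_code = quantized_embedding(query[None, :], projection)

# Nearest-neighbor search carried out entirely on the quantized codes.
dists = np.linalg.norm((db_codes - q_code).astype(np.float64), axis=1)
nearest = int(np.argmin(dists))
```

In this toy setting the search on integer codes recovers the database item the query was derived from, which is the behavior the distance-preservation argument leads one to expect.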


“…We believe that, as such a system will need to distinguish between fine-grained words, it will require far more than the 2000 training samples available in the IIIT-5K set. Since we use a retrieval-based approach for text recognition, there is abundant literature on large-scale retrieval that can be leveraged for this task, for example on compressing histogram descriptors [5].…”

Section: Results (mentioning)

confidence: 99%

“…These patch statistics are then aggregated at an image level. We choose to compute the patch statistics using the Fisher Vector (FV) principle [19], since it obtained state-of-the-art results in image retrieval [5] and classification [2]. We assume that we have a generative model of patches (a Gaussian Mixture Model in our case) and measure the gradient of the log-likelihood of the descriptor with respect to the model parameters.…”

Section: Image Embedding (mentioning)

confidence: 99%
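The Fisher Vector statement above (a GMM over patches, with the gradient of the log-likelihood taken with respect to the model parameters) can be sketched as follows. This is a hedged illustration, not the citing authors' implementation: it covers only the gradient with respect to the means, assumes diagonal covariances, and applies the commonly used closed-form normalization by `1 / sqrt(w_k)`. The function name `fisher_vector_means` and the toy GMM parameters are hypothetical.

```python
import numpy as np

def fisher_vector_means(patches, weights, means, sigmas):
    """Normalized gradient of the average log-likelihood w.r.t. GMM means.

    patches: (N, D) patch descriptors
    weights: (K,) mixture weights; means, sigmas: (K, D) diagonal GMM parameters
    """
    n, d = patches.shape
    # Whitened residuals (x_n - mu_k) / sigma_k, shape (N, K, D).
    diff = (patches[:, None, :] - means[None, :, :]) / sigmas[None, :, :]
    # Log of w_k * N(x_n | mu_k, diag(sigma_k^2)), shape (N, K).
    log_prob = (np.log(weights)[None, :]
                - 0.5 * np.sum(diff ** 2, axis=2)
                - np.sum(np.log(sigmas), axis=1)[None, :]
                - 0.5 * d * np.log(2.0 * np.pi))
    # Posterior responsibilities, computed stably in the log domain.
    log_prob -= log_prob.max(axis=1, keepdims=True)
    post = np.exp(log_prob)
    post /= post.sum(axis=1, keepdims=True)
    # Aggregate responsibility-weighted residuals over all patches.
    grad = np.einsum("nk,nkd->kd", post, diff)
    grad /= n * np.sqrt(weights)[:, None]
    return grad.ravel()

# Toy GMM with two components in three dimensions (illustrative values only).
rng = np.random.default_rng(0)
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0, 0.0], [3.0, 3.0, 3.0]])
sigmas = np.ones((2, 3))
patches = rng.normal(size=(50, 3))
fv = fisher_vector_means(patches, weights, means, sigmas)
```

The resulting image-level vector has one block of `D` values per mixture component, so its length is `K * D` regardless of how many patches were extracted.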

“…retrieval applications, where a query image is inputted and the goal is to rank the images of a training set in descending order of similarity. The dot product between Fisher vectors is a standard way to perform the comparison [5]. Since $\tilde{\theta} = W^T \theta$ is the representation of the image features $\theta$ in the projected space, then $k(\theta, \theta') = \theta^T W W^T \theta'$ is a similarity between images.…”

Section: Using Label Embedding For Similarity Learning (mentioning)

confidence: 99%
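The similarity in the statement above is the dot product of two image representations after projection by W. A minimal sketch of ranking a database with it is given below; the helper `rank_by_projected_similarity`, the matrix sizes, and the random data are assumptions for illustration, and in the citing paper W is learned rather than random.

```python
import numpy as np

def rank_by_projected_similarity(query, database, W):
    """Rank database rows by the dot product of projected features."""
    q = W.T @ query      # projected query, shape (R,)
    db = database @ W    # each row is the projection of one database item
    scores = db @ q      # dot products in the projected space
    return np.argsort(-scores), scores

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 4))         # hypothetical projection, 16 -> 4 dims
database = rng.normal(size=(5, 16))  # synthetic image features
query = rng.normal(size=16)

order, scores = rank_by_projected_similarity(query, database, W)
```

Projecting once and then taking dot products gives the same scores as evaluating the bilinear form with `W @ W.T` directly, while only requiring the low-dimensional vectors to be stored.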


Label embedding for text recognition

Rodríguez, Perronnin

2013

Proceedings of the British Machine Vision Conference 2013


The standard approach to recognizing text in images consists in first classifying local image regions into candidate characters and then combining them with high-level word models such as conditional random fields (CRF). This paper explores a new paradigm that departs from this bottom-up view. We propose to embed word labels and word images into a common Euclidean space. Given a word image to be recognized, the text recognition problem is cast as one of retrieval: find the closest word label in this space. This common space is learned using the Structured SVM (SSVM) framework by enforcing matching label-image pairs to be closer than non-matching pairs. This method presents the following advantages: it does not require costly pre- or post-processing operations, it allows for the recognition of never-seen-before words and the recognition process is efficient. Experiments are performed on two challenging datasets (one of license plates and one of scene text) and show that the proposed method is competitive with standard bottom-up approaches to text recognition.


“…The image category is estimated via the majority voting on the decision of each region classifier. The region level representation is obtained by densely extracting the SURF descriptor over the region and then using the VLAD encoding defined in Jegou et al (2012).…”

Section: Methods Facing Task (mentioning)

confidence: 99%
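The VLAD encoding mentioned in the statement above aggregates, for each visual word, the residuals between the local descriptors assigned to it and its centroid. A minimal sketch follows, assuming hard nearest-centroid assignment and a final L2 normalization; the function name `vlad_encode` and the toy codebook are illustrative, and the SURF extraction step from the statement is replaced by hand-written descriptors.

```python
import numpy as np

def vlad_encode(descriptors, centroids):
    """Aggregate local descriptors into a single L2-normalized VLAD vector."""
    # Squared distance from every descriptor to every centroid, shape (N, K).
    d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    assign = d2.argmin(axis=1)  # hard assignment to the nearest centroid
    vlad = np.zeros_like(centroids, dtype=np.float64)
    for k in range(centroids.shape[0]):
        members = descriptors[assign == k]
        if len(members):
            # Sum of residuals between assigned descriptors and centroid k.
            vlad[k] = (members - centroids[k]).sum(axis=0)
    vlad = vlad.ravel()
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad

# Toy codebook with two centroids and three 2-D descriptors (illustrative).
centroids = np.array([[0.0, 0.0], [10.0, 10.0]])
descriptors = np.array([[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]])
code = vlad_encode(descriptors, centroids)
```

The output length is the number of centroids times the descriptor dimension, so a region yields a fixed-size vector however many descriptors were extracted from it.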

HEp-2 staining pattern recognition at cell and specimen levels: Datasets, algorithms and results

Hobson, Lovell, Percannella, et al.

2016

Pattern Recognition Letters


The Indirect Immunofluorescence (IIF) protocol applied on Human Epithelial type 2 (HEp-2) cells is the current gold standard for the Antinuclear Antibody (ANA) test. The formulation of the diagnosis requires the visual analysis of a patient's specimen under a fluorescence microscope in order to recognize the cells' staining pattern which could be related to a connective tissue disease. This analysis is time consuming and error prone, thus in the recent past we have witnessed a growing interest in the pattern recognition scientific community directed at the development of methods for supporting this complex task. The main driver of the interest towards this problem is represented by the series of international benchmarking initiatives organized in the last four years that allowed dozens of research groups to propose innovative methodologies for HEp-2 cells' staining pattern classification. In this paper we update the state of the art on HEp-2 cells and specimens classification, by analyzing the performance achieved by the methods participating in the contest on Performance Evaluation of IIF Image Analysis Systems, hosted by the 22nd edition of the International Conference on Pattern Recognition ICPR 2014, and in the Executable Thematic Special Issue of Pattern Recognition Letters on Pattern Recognition Techniques for IIF Images Analysis, and by highlighting the trends in the design of the best performing methods.

