2015
DOI: 10.48550/arxiv.1511.05879
Preprint
Particular object retrieval with integral max-pooling of CNN activations

Abstract: Recently, image representation built upon Convolutional Neural Network (CNN) has been shown to provide effective descriptors for image search, outperforming pre-CNN features as short-vector representations. Yet such models are not compatible with geometry-aware re-ranking methods and still outperformed, on some particular object retrieval benchmarks, by traditional image search systems relying on precise descriptor matching, geometric re-ranking, or query expansion. This work revisits both retrieval stages, na…
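The "max-pooling of CNN activations" in the title can be illustrated with a minimal global max-pooled (MAC-style) descriptor. This is a sketch under the assumption of a (channels, height, width) activation tensor; the function name `mac_descriptor` is illustrative, not from the paper, whose integral max-pooling refines this idea to pool over many image regions efficiently.

```python
import numpy as np

def mac_descriptor(activations):
    """Global max-pooling over the spatial dimensions of a CNN
    activation tensor of shape (channels, height, width), followed
    by L2 normalization. Illustrative sketch only."""
    pooled = activations.max(axis=(1, 2))   # strongest response per channel
    return pooled / np.linalg.norm(pooled)  # unit-norm global descriptor

# toy activation map: 4 channels on a 3x3 spatial grid
acts = np.arange(36, dtype=float).reshape(4, 3, 3)
desc = mac_descriptor(acts)  # 4-dimensional unit-length vector
```

Because each channel contributes its single strongest spatial response, the resulting short vector is invariant to where in the image the object appears, which is what makes it usable as a compact search descriptor.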

Cited by 122 publications (220 citation statements)
References: 39 publications (66 reference statements)
“…global) features is particularly favoured in practice due to its run-time efficiency. In the era of deep learning, global features can be generated by aggregating CNNs features (Babenko and Lempitsky 2015;Tolias, Sicre, and Jégou 2015;Arandjelovic et al 2016;Gordo et al 2016;Radenović, Tolias, and Chum 2016;Tolias, Avrithis, and Jégou 2016;Noh et al 2017;Radenović, Tolias, and Chum 2018). Apart from global features, local features are also used to perform spatial verification (Philbin et al 2007;Noh et al 2017;Cao, Araujo, and Sim 2020), which incorporates the geometric information of objects and results in a more reliable matching.…”
Section: Related Work (mentioning)
confidence: 99%
“…Much like other applications [18] [19], the use of convolutional neural networks (CNNs), convolution auto-encoders (CAEs) and deep/shallow neural nets has demonstrated superior results for VPR than those achieved by handcrafted feature-based approaches. The use of neural networks was studied in [20] where a pre-trained CNN was used to extract features from layers of an input image.…”
Section: Literature Review, A. VPR Techniques (mentioning)
confidence: 99%
“…The design of CNN layer activations is also something that has been extensively studied, specifically pooling approaches that are employed on convolution layers. Such pooling approaches include: Max-Pooling [18], Cross-Pooling [19], Sum-Pooling [25] and Spatial Max-Pooling [26].…”
Section: Literature Review, A. VPR Techniques (mentioning)
confidence: 99%
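The pooling variants listed in the snippet above differ in how they collapse a CNN activation map into one value per channel. A minimal sketch of the two most common choices, max- and sum-pooling, assuming a (channels, H, W) tensor (the helper `global_pool` is hypothetical, for illustration only):

```python
import numpy as np

def global_pool(activations, mode="max"):
    """Pool a (channels, H, W) CNN activation tensor into one value per
    channel. 'max' keeps the strongest response per channel (as in
    max-pooled descriptors); 'sum' aggregates all responses (as in
    sum-pooled descriptors). Hypothetical helper for illustration."""
    if mode == "max":
        return activations.max(axis=(1, 2))
    if mode == "sum":
        return activations.sum(axis=(1, 2))
    raise ValueError(f"unknown mode: {mode}")

# Toy tensor: two channels of ones on a 4x4 grid, with one strong
# activation in channel 0 to show how the two poolings differ.
acts = np.ones((2, 4, 4))
acts[0, 1, 2] = 5.0
print(global_pool(acts, "max"))  # -> [5. 1.]
print(global_pool(acts, "sum"))  # -> [20. 16.]
```

The example shows the trade-off discussed in the retrieval literature: max-pooling responds to a single distinctive activation regardless of its location, while sum-pooling reflects how much of the map is active overall.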
“…The traditional hand-crafted features suffer large memory size consumption and will lower the searching efficiency [2,4]. Many compact image representations [5,6,7,8,9] are thus proposed to address the above problem. Among these studies, the deep features can produce better performance than the traditional Fisher compact features (such as work [5] and [6]), and they have become the predominant stream of features used for CBIR.…”
Section: Introduction (mentioning)
confidence: 99%
“…Generally, the Convolutional Neural Networks (CNN) layer activations are directly adopted as the off-the-shelf deep features [7,8,9]. However, these features are usually trained for image classification tasks.…”
Section: Introduction (mentioning)
confidence: 99%