2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017
DOI: 10.1109/cvpr.2017.237

Deep Metric Learning via Facility Location

Abstract: Learning image similarity metrics in an end-to-end fashion with deep networks has demonstrated excellent results on tasks such as clustering and retrieval. However, current methods all focus on a very local view of the data. In this paper, we propose a new metric learning scheme, based on structured prediction, that is aware of the global structure of the embedding space and is designed to optimize a clustering quality metric (NMI). We show state-of-the-art performance on standard datasets, such as CUB…
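For readers unfamiliar with the clustering quality metric named in the abstract, the following is a minimal sketch of normalized mutual information (NMI) between a ground-truth labeling and a predicted clustering. The function name `normalized_mutual_information` is hypothetical, and the normalization by the arithmetic mean of the two entropies is an assumption: the abstract does not say which NMI variant the paper uses. It only illustrates the metric and is not the paper's training procedure.

```python
import math
from collections import Counter


def normalized_mutual_information(labels_true, labels_pred):
    """Illustrative NMI between two labelings of the same items.

    Assumption: normalizes mutual information by the arithmetic mean of
    the two entropies; other variants (e.g. geometric mean) exist.
    """
    n = len(labels_true)
    if n == 0 or n != len(labels_pred):
        raise ValueError("labelings must be non-empty and of equal length")

    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    count_joint = Counter(zip(labels_true, labels_pred))

    # Mutual information I(true; pred) from the empirical joint distribution.
    mutual_info = 0.0
    for (t, p), n_tp in count_joint.items():
        mutual_info += (n_tp / n) * math.log(
            n * n_tp / (count_true[t] * count_pred[p])
        )

    # Shannon entropies of each labeling.
    h_true = -sum((c / n) * math.log(c / n) for c in count_true.values())
    h_pred = -sum((c / n) * math.log(c / n) for c in count_pred.values())

    mean_entropy = (h_true + h_pred) / 2.0
    if mean_entropy == 0.0:
        # Both labelings put every item in one cluster; treat as identical.
        return 1.0
    return mutual_info / mean_entropy


if __name__ == "__main__":
    # Same partition under different cluster ids, then an unrelated partition.
    print(normalized_mutual_information([0, 0, 1, 1], [1, 1, 0, 0]))
    print(normalized_mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))
```

NMI is invariant to how cluster ids are named, which is why it suits comparing a predicted clustering against class labels.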

Cited by 262 publications (208 citation statements)
References 22 publications (60 reference statements)
“…Cars196 contains 16,185 images belonging to 196 classes of cars. In our experiments, we follow the settings in [3], taking the first 98 classes (8, [38].…”
Section: Methods (mentioning)
confidence: 99%
“…(3) N-Pair [7] trains DML with N-pair loss. (4) Clustering [38] is a structured prediction based DML model which can be optimized with clustering quality metric. (5) Contrastive [19] uses contrastive loss for DML training.…”
Section: B Evaluation Metrics and Compared Methods (mentioning)
confidence: 99%
“…The superscript denotes the embedding size. In [24] Song et al claim the results in the N-pair [23] paper have been achieved by an average of ten extracted embeddings from ten random crops. The usage of such a crop averaging technique is marked with .…”
Section: Comparison To the State-of-the-art (mentioning)
confidence: 99%
“…Not all listed approaches employ the GoogLeNet architecture [26]. A ResNet50 v2 [8] with a top-1 accuracy of 75.6% on the ImageNet validation set [20] is used by Margin and InceptionBN [11] with 73.9% by Proxy-NCA [18] and Clustering [24]. Compared to the GoogLeNet, the two more advanced architectures might give a better general image retrieval performance.…”
Section: Comparison To the State-of-the-art (mentioning)
confidence: 99%