Yujie Zhong scite author profile

The objective of this paper is to learn a compact representation of image sets for template-based face recognition. We make the following contributions: first, we propose a network architecture which aggregates and embeds the face descriptors produced by deep convolutional neural networks into a compact fixed-length representation. This compact representation requires minimal memory storage and enables efficient similarity computation. Second, we propose a novel GhostVLAD layer that includes ghost clusters, which do not contribute to the aggregation. We show that a quality weighting on the input faces emerges automatically such that informative images contribute more than those with low quality, and that the ghost clusters enhance the network's ability to deal with poor-quality images. Third, we explore how input feature dimension, number of clusters and different training techniques affect the recognition performance. Given this analysis, we train a network that far exceeds the state of the art on the IJB-B face recognition dataset. This is currently one of the most challenging public benchmarks, and we surpass the state of the art on both the identification and verification protocols.
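The aggregation described in this abstract can be illustrated with a minimal pure-Python sketch. It assumes a NetVLAD-style soft assignment over all clusters (real plus ghost), with residuals accumulated only for the real clusters. The function name `ghostvlad_aggregate`, its parameters, and the toy inputs are hypothetical and chosen for illustration; this is not the authors' implementation, and it omits the CNN backbone and any learned dimensionality reduction.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

def l2_normalize(vec, eps=1e-12):
    """Scale a vector to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]

def ghostvlad_aggregate(descriptors, centers, weights, biases, num_ghost):
    """Aggregate N descriptors (each of length D) into one vector of length K*D.

    centers, weights: K+G vectors of length D (ghost clusters listed last).
    biases: K+G floats. All are hypothetical stand-ins for learned parameters.
    """
    num_real = len(centers) - num_ghost
    dim = len(descriptors[0])
    vlad = [[0.0] * dim for _ in range(num_real)]
    for x in descriptors:
        # Soft-assign the descriptor over ALL clusters, ghosts included.
        logits = [sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
                  for w, b in zip(weights, biases)]
        assign = softmax(logits)
        # Accumulate weighted residuals for the real clusters only;
        # assignment mass given to ghost clusters is simply discarded.
        for k in range(num_real):
            for d in range(dim):
                vlad[k][d] += assign[k] * (x[d] - centers[k][d])
    # Normalize each cluster's residual, then the concatenated vector.
    flat = [v for row in vlad for v in l2_normalize(row)]
    return l2_normalize(flat)

# Toy usage: 3 descriptors of dimension 2, 2 real clusters + 1 ghost cluster.
descriptors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
centers = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
biases = [0.0, 0.0, 0.0]
template = ghostvlad_aggregate(descriptors, centers, weights, biases, num_ghost=1)
```

Because the softmax is taken over all clusters, a descriptor that is strongly assigned to a ghost cluster carries little weight in the real clusters, which is one way to read the abstract's claim that low-quality images contribute less.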


Exploring Classification Equilibrium in Long-Tailed Object Detection

Feng, Zhong, Huang

2021


PromptDet: Towards Open-Vocabulary Detection Using Uncurated Images

Feng, Zhong, Jie et al.

2022


DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers

Chen, Cao, Zhong et al.

2022


Faces in Places: compound query retrieval

Zhong, Arandjelović, Zisserman

2016


Compact Deep Aggregation for Set Retrieval

Zhong, Arandjelović, Zisserman

2019


The objective of this work is to learn a compact embedding of a set of descriptors that is suitable for efficient retrieval and ranking, whilst maintaining discriminability of the individual descriptors. We focus on a specific example of this general problem: that of retrieving images containing multiple faces from a large-scale dataset of images. Here the set consists of the face descriptors in each image, and given a query for multiple identities, the goal is then to retrieve, in order, images which contain all the identities, all but one, etc. To this end, we make the following contributions: first, we propose a CNN architecture, SetNet, to achieve the objective: it learns face descriptors and their aggregation over a set to produce a compact fixed-length descriptor designed for set retrieval, and the score of an image is a count of the number of identities that match the query; second, we show that this compact descriptor has minimal loss of discriminability up to two faces per image, and degrades slowly after that, far exceeding a number of baselines; third, we explore the speed vs. retrieval quality trade-off for set retrieval using this compact descriptor; and, finally, we collect and annotate a large dataset of images containing varying numbers of celebrities, which we use for evaluation and which is publicly released.
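The counting score described in this abstract can be illustrated with a toy sketch. It assumes an idealized case in which every identity has its own orthonormal (one-hot) descriptor and a set is aggregated by plain summation, so the dot product between a query descriptor and a set descriptor is 1 if that identity is in the image and 0 otherwise. The helper names and data below are hypothetical; this is not SetNet, which learns both the descriptors and their aggregation.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def aggregate(descriptors):
    """Sum a set of descriptors into one fixed-length set descriptor."""
    return [sum(column) for column in zip(*descriptors)]

def rank_images(query_descriptors, image_sets):
    """Score each image by how many query identities it matches, best first."""
    scores = {}
    for name, descriptors in image_sets.items():
        set_descriptor = aggregate(descriptors)
        scores[name] = sum(dot(q, set_descriptor) for q in query_descriptors)
    # Python's sort is stable, so ties keep their insertion order.
    return sorted(scores.items(), key=lambda item: -item[1])

# Idealized one-hot identity descriptors (an assumption for illustration).
A, B, C, D = [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]
image_sets = {"img1": [A, B], "img2": [A, C], "img3": [C, D]}

# Query for identities A and B: images with both rank first, then one, then none.
ranking = rank_images([A, B], image_sets)
print(ranking)  # [('img1', 2), ('img2', 1), ('img3', 0)]
```

With real descriptors the identities are not perfectly orthogonal, so the score only approximates the count, and the approximation degrades as more faces are packed into one fixed-length vector. That is consistent with the slow loss of discriminability the abstract reports beyond two faces per image.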


GhostVLAD for set-based face recognition

Zhong, Arandjelović, Zisserman

2018

Preprint



Yujie Zhong

TOOD: Task-aligned One-stage Object Detection

GhostVLAD for Set-Based Face Recognition
