2016
DOI: 10.1007/978-3-319-46487-9_6

MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition

Abstract: In this paper, we design a benchmark task and provide the associated datasets for recognizing face images and link them to corresponding entity keys in a knowledge base. More specifically, we propose a benchmark task to recognize one million celebrities from their face images, by using all the possibly collected face images of this individual on the web as training data. The rich information provided by the knowledge base helps to conduct disambiguation and improve the recognition accuracy, and contributes to …

Cited by 1,398 publications (1,161 citation statements)
References 17 publications
“…4. To increase the retrieval difficulty, random 355k distractor images are sampled from the MS-Celeb-1M Dataset (Guo et al, 2016), as before taking care to include only true distractor people. The sampled distractor sets are constructed such that the number of faces per set follows the same distribution as in the Celebrity Together dataset.…”
Section: Evaluating On the Celebrity Together Dataset
Type: mentioning
confidence: 99%
“…The Sub-MS-Celeb dataset is rebuilt from MS-Celeb [12] dataset contains 87139 face images from 2589 classes after removing the dirty face images and non-frontal face images, and it is chosen as the source domain. All images are aligned and cropped to 64*64 pixels according to five landmarks: two eyes, nose and mouth corners (see Fig.3).…”
Section: Datasets and Preprocessing
Type: mentioning
confidence: 99%
“…The dataset is constructed by Microsoft and is available for noncommercial use. [9] further describes the process of assembling the images and the metric used for the choice of the 100K celebrity provided in the dataset. We used the whole dataset for the training of our neural network.…”
Section: FRGC
Type: mentioning
confidence: 99%
“…The training datasets that were used in this work are the FRGC dataset (because it is a relatively big dataset at the time that it was introduced), and the MS-celeb-1M [9] (because this is to our knowledge among the biggest publicly available datasets). More details about these databases are given below.…”
Section: Training Datasets
Type: mentioning
confidence: 99%