2014 IEEE International Conference on Image Processing (ICIP) 2014
DOI: 10.1109/icip.2014.7025068

A data-driven approach to cleaning large face datasets

Abstract: Large face datasets are important for advancing face recognition research, but they are tedious to build, because a lot of work has to go into cleaning the huge amount of raw data. To facilitate this task, we describe an approach to building face datasets that starts with detecting faces in images returned from searches for public figures on the Internet, followed by discarding those not belonging to each queried person. We formulate the problem of identifying the faces to be removed as a quadratic programming …

Cited by 533 publications (301 citation statements)
References 15 publications
“…The FaceScrub data set [71] contains 107,818 images of celebrities automatically collected from the web, and verified using a semi-automated process. It contains 530 different individuals, with an average of approximately 200 images per person.…”
Section: The Facescrub and Casia Data Sets
type: mentioning
confidence: 99%
“…This caused a shift of focus to methods that are independent of image storage and compression limitations. Some of the well-known recent databases collect images from videos in the internet and they represent a wide variation in image storage and quality [15] [25][26][27][28]. This motivates further research into scalable cloud-based methods that can extract features from large databases and correlate them with facial recognition tasks.…”
Section: Conclusion and Discussion
type: mentioning
confidence: 99%
“…From the year 2000 and onwards, the facial databases were seen to capture the variations in pose, lighting, imaging angles, ethnicity, gender and facial expressions [4]. Some of the most recent databases capture the variations in image sizes, compression, occlusions and are gathered from varied sources such as social media and internet [15].…”
Section: Facial Recognition Databases
type: mentioning
confidence: 99%
“…To detect a face/hand we employ a modified HOG (Histogram of Oriented Gradients) descriptor combined with responses of complex cells and a linear SVM to code the shape. The face and hand detectors were trained and evaluated on the FaceScrub dataset [13] and the Oxford hand dataset [14], respectively. The developed HRI system does not need any prior calibration and has been designed to run in real time.…”
Section: II
type: mentioning
confidence: 99%
“…Finally, non-maximum suppression is applied to eliminate multiple detections of the same face/hand (see below). We used the FaceScrub dataset [13] and the Oxford Hand dataset [14] to train and evaluate our face and hand detectors. For each detector we train an initial classifier using the positive and a random set of negative examples, then we use it to scan over images not containing faces or hands and collect false positives, and then we do a second round of training by including these hard false positives into the negative training set.…”
Section: Face and Hand Detection
type: mentioning
confidence: 99%