2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2018.00247

Memory Based Online Learning of Deep Representations from Video Streams

Abstract: We present a novel online unsupervised method for face identity learning from video streams. The method exploits deep face descriptors together with a memory-based learning mechanism that takes advantage of the temporal coherence of visual data. Specifically, we introduce a discriminative feature matching solution based on Reverse Nearest Neighbour and a feature forgetting strategy that detects redundant features and discards them appropriately as time progresses. It is shown that the proposed learning proced…
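The two mechanisms named in the abstract, Reverse Nearest Neighbour matching against a feature memory and forgetting of redundant features over time, can be illustrated with a minimal sketch. This is not the authors' implementation; the cosine-similarity choice, the function names, and the forgetting threshold are assumptions made for illustration (features assumed L2-normalised).

import numpy as np

def reverse_nearest_neighbours(query, memory):
    # Indices of stored features whose nearest neighbour, once the query is
    # added to the pool, is the query itself (cosine similarity on
    # L2-normalised features; an assumption, not the paper's exact metric).
    sims_to_query = memory @ query           # similarity of each stored feature to the query
    sims_within = memory @ memory.T          # pairwise similarities inside the memory
    np.fill_diagonal(sims_within, -np.inf)   # a feature is not its own neighbour
    best_within = sims_within.max(axis=1)
    return np.where(sims_to_query > best_within)[0]

def forget_redundant(memory, threshold=0.95):
    # Discard a stored feature when a newer one is nearly identical to it
    # (illustrative threshold); this keeps the memory compact as time passes.
    keep = []
    for i, f in enumerate(memory):
        newer = memory[i + 1:]
        if newer.size == 0 or float((newer @ f).max()) < threshold:
            keep.append(i)
    return memory[keep]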

Cited by 28 publications (20 citation statements); references 56 publications.

“…Sharma et al [67] used instead a Recurrent Rolling Convolution (RRC) CNN [68] and a SubCNN [69] to detect vehicles in videos recorded on a moving camera in the context of autonomous driving (see section 3.2.4). Pernici et al [70] used the Tiny CNN detector [71] in their face tracking algorithm, obtaining a better performance when compared to the Deformable Parts Model detector (DPM) [25], that does not use deep learning techniques.…”
Section: Other Detectors (mentioning); confidence: 99%
“…The authors in [100] used a fine-tuned GoogLeNet on the ILSVRC CLS-LOC [101] dataset for pedestrians recognition. In [70], the authors reused the visual features extracted by the CNN-based detector, and the association was performed using a Reverse Nearest Neighbor technique [102]. Sheng et al [103] employed the convolutional part of GoogLeNet to extract appearance features, using the cosine distance between them to compute an affinity score between pairs of detections, and merging that information with motion prediction in order to compute an overall affinity which serves as edge cost in a graph problem.…”
Section: CNNs as Visual Feature Extractors (mentioning); confidence: 99%
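The appearance-plus-motion affinity described in the citation above (cosine similarity between CNN features merged with motion prediction to form a graph edge cost) can be sketched as follows. The IoU-based motion term and the weight w_app are illustrative assumptions, not the cited authors' code.

import numpy as np

def cosine_affinity(feat_a, feat_b):
    # Appearance affinity from CNN features (higher means more similar).
    a = feat_a / np.linalg.norm(feat_a)
    b = feat_b / np.linalg.norm(feat_b)
    return float(a @ b)

def motion_affinity(pred_box, det_box):
    # IoU between a motion-predicted box and a detected box, (x1, y1, x2, y2).
    x1 = max(pred_box[0], det_box[0]); y1 = max(pred_box[1], det_box[1])
    x2 = min(pred_box[2], det_box[2]); y2 = min(pred_box[3], det_box[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area(pred_box) + area(det_box) - inter + 1e-9)

def edge_cost(feat_a, feat_b, pred_box, det_box, w_app=0.5):
    # Overall affinity turned into a graph edge cost (lower means better match).
    affinity = w_app * cosine_affinity(feat_a, feat_b) \
               + (1.0 - w_app) * motion_affinity(pred_box, det_box)
    return 1.0 - affinity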
“…Their approach extends Nearest Class Mean classifier to operate in an open world setting by re-calibrating the class probabilities to balance open space risk. [46] studies open world face identity learning while [63] proposed to use an exemplar set of seen classes to match them against a new sample, and rejects it in case of a low match with all previously known classes. However, they don't test on image classification benchmarks and study product classification in e-commerce applications.…”
Section: Related Work (mentioning); confidence: 99%
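The exemplar-matching-with-rejection idea summarised in the citation above can be sketched in a few lines; the similarity measure, the threshold value, and the names are illustrative assumptions rather than the cited method.

import numpy as np

def classify_open_world(sample, exemplars, labels, reject_threshold=0.7):
    # Match a sample against exemplars of known classes and reject it as
    # "unknown" when the best similarity is low (illustrative threshold).
    sample = sample / np.linalg.norm(sample)
    exemplars = exemplars / np.linalg.norm(exemplars, axis=1, keepdims=True)
    sims = exemplars @ sample
    best = int(np.argmax(sims))
    if sims[best] < reject_threshold:
        return "unknown"
    return labels[best]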
“…However, having redundant visual exemplars not only slows down the tracker, but also makes the tracker become biased and eventually drift away from the target. Therefore, we adopt the reverse nearest neighbor algorithm [22,32] and add Z t to Z if the reverse nearest neighbor set of Z t with Z is an empty set. The rationale is that we add Z t to Z only if the new exemplar "looks" different to its past, and therefore the memory captures the temporal appearance variations of the target.…”
Section: Memory Management Mechanism (MMM) (mentioning); confidence: 99%
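The insertion rule quoted above (store Z_t only when its reverse nearest neighbour set within the memory Z is empty) can be written compactly. This is a sketch under assumed unit-normalised features and illustrative names, not the tracker's code.

import numpy as np

def should_add_exemplar(z_t, memory):
    # Add z_t only if no stored exemplar has z_t as its nearest neighbour,
    # i.e. the reverse nearest neighbour set of z_t within the memory is empty,
    # so the new exemplar "looks" different from the stored appearance history.
    if len(memory) == 0:
        return True
    sims_to_new = memory @ z_t               # similarity of each stored exemplar to z_t
    sims_within = memory @ memory.T
    np.fill_diagonal(sims_within, -np.inf)   # ignore self-similarity
    best_within = sims_within.max(axis=1)
    return bool(np.all(sims_to_new <= best_within))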