An Empirical Evaluation of Current Convolutional Architectures’ Ability to Manage Nuisance Location and Scale Variability

Karianakis, Nikolaos; Dong, Jingming; Soatto, Stefano

doi:10.1109/cvpr.2016.481

Cited by 6 publications

(7 citation statements)

References 41 publications


“…Divvala et al [21] explored different types of context in recognition. See also [1,18,35,38,42,52,70,88].…”

Section: Contextual Influences In Object Detection (mentioning)

confidence: 99%

Empirical Upper Bound, Error Diagnosis and Invariance Analysis of Modern Object Detectors

Borji

2020

Preprint


Object detection remains one of the most notorious open problems in computer vision. Despite large strides in accuracy in recent years, modern object detectors have started to saturate on popular benchmarks, raising the question of how far we can reach with deep learning tools and tricks. Here, by employing 2 state-of-the-art object detection benchmarks, and analyzing more than 15 models over 4 large scale datasets, we I) carefully determine the upper bound in AP, which is 91.6% on VOC (test2007), 78.2% on COCO (val2017), and 58.9% on OpenImages V4 (validation), regardless of the IOU threshold. These numbers are much better than the mAP of the best model (47.9% on VOC, and 46.9% on COCO; IOUs=.5:.05:.95), II) characterize the sources of errors in object detectors, in a novel and intuitive way, and find that classification error (confusion with other classes and misses) explains the largest fraction of errors and weighs more than localization and duplicate errors, and III) analyze the invariance properties of models when the surrounding context of an object is removed, when an object is placed in an incongruent background, and when images are blurred or flipped vertically. We find that models generate a lot of boxes on empty regions and that context is more important for detecting small objects than larger ones. Our work taps into the tight relationship between object detection and object recognition and offers insights for building better models. Our code is publicly available at https://github.com/aliborji/Deetctionupperbound.git.

* Work done while at MarkableAI.

1 The best published mAP (IOUs=.5:.95) on COCO2017 test-dev is 51.0 by EfficientDet [71]. See https://competitions.codalab.org/competitions/20794#results for the latest results on the COCO dataset.


“…forms of complex perturbations is tested, and state-of-the-art deep networks are shown once again to be unstable to these perturbations. An empirical analysis of the ability of current convolutional neural networks (CNNs) to manage location and scale variability is proposed in [21]. It is shown, in particular, that CNNs are not very effective in factoring out location and scale variability, despite the popular belief that the convolutional architecture and the local spatial pooling provides invariance to such representations.…”

Section: Robustness To Structured Transformations (mentioning)

confidence: 99%

The Robustness of Deep Networks: A Geometrical Perspective

Fawzi

Moosavi-Dezfooli

Frossard

2017

IEEE Signal Process. Mag.

184


Deep neural networks have recently shown impressive classification performance on a diverse set of visual tasks. When deployed in real-world (noise-prone) environments, it is equally important that these classifiers satisfy robustness guarantees: small perturbations applied to the samples should not yield significant loss to the performance of the predictor. The goal of this article is to discuss the robustness of deep networks to a diverse set of perturbations that may affect the samples in practice, including adversarial perturbations, random noise, and geometric transformations. This article further discusses the recent works that build on the robustness analysis to provide geometric insights on the classifier's decision surface, which help in developing a better understanding of deep networks. Finally, we present recent solutions that attempt to increase the robustness of deep networks. We hope this review article will contribute to shedding light on the open research challenges in the robustness of deep networks and stir interest in the analysis of their fundamental properties.


“…However, it is difficult to interpret their results as they also have not taken into account the dependence between the different nuisance transformations. Karianakis et al [20] empirically study the influence of scale and location nuisances on the generalization ability of DCNNs at the task of object recognition and find that DCNNs can become invariant to these nuisances when learned from large datasets.…”

Section: Related Work (mentioning)

confidence: 99%

Empirically Analyzing the Effect of Dataset Biases on Deep Face Recognition Systems

Kortylewski

Egger

Schneider

et al.

2018

2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)


It is unknown what kind of biases modern in-the-wild face datasets have because of their lack of annotation. A direct consequence of this is that total recognition rates alone provide only limited insight into the generalization ability of Deep Convolutional Neural Networks (DCNNs). We propose to empirically study the effect of different types of dataset biases on the generalization ability of DCNNs. Using synthetically generated face images, we study the face recognition rate as a function of interpretable parameters such as face pose and light. The proposed method allows valuable details about the generalization performance of different DCNN architectures to be observed and compared. In our experiments, we find that: 1) Indeed, dataset bias has a significant influence on the generalization performance of DCNNs. 2) DCNNs can generalize surprisingly well to unseen illumination conditions and large sampling gaps in the pose variation. 3) Using the presented methodology we reveal that the VGG-16 architecture outperforms the AlexNet architecture at face recognition tasks because it can much better generalize to unseen face poses, although it has significantly more parameters. 4) We uncover a main limitation of current DCNN architectures, which is the difficulty to generalize when different identities do not share the same pose variation. 5) We demonstrate that our findings on synthetic data also apply when learning from real-world data. Our face image generator is publicly available to enable the community to benchmark other DCNN architectures.
