Lucas Beyer scite author profile

Revisiting Self-Supervised Visual Representation Learning

Unsupervised visual representation learning remains a largely unsolved problem in computer vision research. Among a large body of recently proposed approaches for unsupervised learning of visual representations, a class of self-supervised techniques achieves superior performance on many challenging benchmarks. A large number of pretext tasks for self-supervised learning have been studied, but other important aspects, such as the choice of convolutional neural networks (CNN), have not received equal attention. Therefore, we revisit numerous previously proposed self-supervised models, conduct a thorough large-scale study and, as a result, uncover multiple crucial insights. We challenge a number of common practices in self-supervised visual representation learning and observe that standard recipes for CNN design do not always translate to self-supervised representation learning. As part of our study, we drastically boost the performance of previously proposed techniques and outperform previously published state-of-the-art results by a large margin.

S4L: Self-Supervised Semi-Supervised Learning

Beyer et al. 2019

Big Transfer (BiT): General Visual Representation Learning

Kolesnikov¹, Beyer², Zhai³ et al. 2019

Preprint

148

The STRANDS Project: Long-Term Autonomy in Everyday Environments

Hawes, Burbridge, Jovan et al. 2017

IEEE Robot. Automat. Mag.

161

129

Thanks to the efforts of the robotics and autonomous systems community, robots are becoming ever more capable. There is also an increasing demand from end-users for autonomous service robots that can operate in real environments for extended periods. In the STRANDS project we are tackling this demand head-on by integrating state-of-the-art artificial intelligence and robotics research into mobile service robots, and deploying these systems for long-term installations in security and care environments. Over four deployments, our robots have been operational for a combined duration of 104 days, autonomously performing end-user-defined tasks and covering 116 km in the process. In this article we describe the approach we have used to enable long-term autonomous operation in everyday environments, and how our robots are able to use their long run times to improve their own performance.

SPENCER: A Socially Aware Service Robot for Passenger Guidance and Help in Busy Airports

et al. 2016

We present an ample description of a socially compliant mobile robotic platform developed in the EU-funded project SPENCER. The purpose of this robot is to assist, inform and guide passengers in large and busy airports. One particular aim is to bring travellers of connecting flights conveniently and efficiently from their arrival gate to the passport control. The uniqueness of the project stems, on one side, from the strong demand for service robots in this application, with a large potential impact for the aviation industry, and, on the other side, from the scientific advancements in social robotics brought forward and achieved in SPENCER. The main contributions of SPENCER are novel methods to perceive, learn, and model human social behavior and to use this knowledge to plan appropriate actions in real time for mobile platforms. In this paper, we describe how the project advances the fields of detection and tracking of individuals and groups, recognition of human social relations and activities, normative human behavior learning, socially-aware task and motion planning, learning socially annotated maps, and conducting empirical experiments to assess socio-psychological effects of normative robot behaviors.

LiT: Zero-Shot Transfer with Locked-image text Tuning

et al. 2022

Scaling Vision Transformers

Zhai¹, Kolesnikov², Houlsby³ et al. 2021

Preprint

Attention-based neural networks such as the Vision Transformer (ViT) have recently attained state-of-the-art results on many computer vision benchmarks. Scale is a primary ingredient in attaining excellent results; therefore, understanding a model's scaling properties is key to designing future generations effectively. While the laws for scaling Transformer language models have been studied, it is unknown how Vision Transformers scale. To address this, we scale ViT models and data, both up and down, and characterize the relationships between error rate, data, and compute. Along the way, we refine the architecture and training of ViT, reducing memory consumption and increasing accuracy of the resulting models. As a result, we successfully train a ViT model with two billion parameters, which attains a new state-of-the-art on ImageNet of 90.45% top-1 accuracy. The model also performs well on few-shot learning, for example, reaching 84.86% top-1 accuracy on ImageNet with only 10 examples per class.

