Mehdi Cherti scite author profile

LAION-5B: An open large-scale dataset for training next generation image-text models

Groundbreaking language-vision architectures like CLIP and DALL-E proved the utility of training on large amounts of noisy image-text data, without relying on the expensive, accurate labels used in standard unimodal supervised learning for vision. The resulting models showed strong text-guided image generation and transfer to downstream tasks, while performing remarkably well at zero-shot classification with noteworthy out-of-distribution robustness. Since then, large-scale language-vision models like ALIGN, BASIC, GLIDE, Flamingo and Imagen have made further improvements. Studying the training and capabilities of such models requires datasets containing billions of image-text pairs. Until now, no datasets of this size have been made openly available to the broader research community. To address this problem and democratize research on large-scale multi-modal models, we present LAION-5B, a dataset consisting of 5.85 billion CLIP-filtered image-text pairs, of which 2.32B contain English-language text. We show successful replication and fine-tuning of foundational models like CLIP, GLIDE and Stable Diffusion using the dataset, and discuss further experiments enabled by an openly available dataset of this scale. Additionally, we provide several nearest neighbor indices, an improved web interface for dataset exploration and subset generation, and detection scores for watermark, NSFW, and toxic content.

Project page: https://laion.ai/laion-5b-a-new-era-of-open-large-scale-multi-modal-datasets/

Scikit-Optimize/Scikit-Optimize: V0.5.2

Head, MechCoder, Louppe et al. 2018

Reproducible scaling laws for contrastive language-image learning

Cherti, Beaumont, Wightman et al. 2022. Preprint.

Out-of-Class Novelty Generation : An Experimental Foundation

Cherti, Kégl, Kazakçı 2017

Constructive machine learning aims at finding one or more instances of a domain that exhibit some desired properties. Such a process bears a strong similarity to a design process, where the ultimate objective is the generation of previously unknown and novel objects by using knowledge about known objects. The aim of the present work is to bring ideas from design theory to machine learning and to elaborate an experimental procedure that allows the study of design through machine learning approaches. To this end, we propose an actionable definition of creativity as the generation of out-of-distribution novelty. We assess several metrics designed for evaluating the quality of generative models on this new task. Through extensive experiments on various types of generative models, we find architectures and hyperparameter combinations that lead to out-of-distribution novelty. Such generators can then be used to search a semantically richer and broader space than standard generative models would allow.

This paper is an adapted version of our submission to ICLR17.

30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain.

Effect of pre-training scale on intra- and inter-domain, full and few-shot transfer learning for natural and X-Ray chest images

Cherti, Jitsev 2022

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Made with 💙 for researchers

Part of the Research Solutions Family.
