Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

We present Imagen, a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understanding. Imagen builds on the power of large transformer language models in understanding text and hinges on the strength of diffusion models in high-fidelity image generation. Our key discovery is that generic large language models (e.g. T5), pretrained on text-only corpora, are surprisingly effective at encoding text for image synthesis: increasing the size of the language model in Imagen boosts both sample fidelity and image-text alignment much more than increasing the size of the image diffusion model. Imagen achieves a new state-of-the-art FID score of 7.27 on the COCO dataset, without ever training on COCO, and human raters find Imagen samples to be on par with the COCO data itself in image-text alignment. To assess text-to-image models in greater depth, we introduce DrawBench, a comprehensive and challenging benchmark for text-to-image models. With DrawBench, we compare Imagen with recent methods including VQ-GAN+CLIP, Latent Diffusion Models, GLIDE and DALL-E 2, and find that human raters prefer Imagen over other models in side-by-side comparisons, both in terms of sample quality and image-text alignment. See imagen.research.google for an overview of the results.

The Impact of Acquisitions on Operating Performance: Some Australian Evidence

Sharma

2002

Business Fin & Account

162

214

This study investigates the impact of acquisitions on the operating performance of Australian firms. For a sample of 36 Australian acquisitions occurring between 1986 and 1991 inclusive, and using matched firms to control for industry and economy-wide factors, the results based on four accrual and four cash flow performance measures show that corporate acquisitions do not lead to significant improvements in post-acquisition operating performance. The consistency of the results with the agency, hubris and financial motivation hypotheses suggests that corporate acquisitions in Australia may be undertaken for reasons other than synergy. The results help to explain inconsistent findings reported in the literature. Copyright Blackwell Publishers Ltd 2002.

The effects of product-related, personal-related factors and attractiveness of alternatives on consumer adoption of NFC-based mobile payments

Pham

2015

Technology in Society

217

197

Classifier-Free Diffusion Guidance

Ho, Salimans

2022

Preprint

111

153

Classifier guidance is a recently introduced method to trade off mode coverage and sample fidelity in conditional diffusion models after training, in the same spirit as low-temperature sampling or truncation in other types of generative models. Classifier guidance combines the score estimate of a diffusion model with the gradient of an image classifier and thereby requires training an image classifier separate from the diffusion model. It also raises the question of whether guidance can be performed without a classifier. We show that guidance can indeed be performed by a pure generative model without such a classifier: in what we call classifier-free guidance, we jointly train a conditional and an unconditional diffusion model, and we combine the resulting conditional and unconditional score estimates to attain a trade-off between sample quality and diversity similar to that obtained using classifier guidance.
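The abstract says the conditional and unconditional score estimates are combined but does not give the rule. A minimal sketch follows, assuming the linear combination with a guidance weight `w` that is commonly used for classifier-free guidance. The function name, the list-of-floats representation, and the parameter names are illustrative assumptions, not taken from this page.

```python
def guided_estimate(eps_cond, eps_uncond, w):
    """Combine conditional and unconditional estimates (illustrative sketch).

    Assumed rule: (1 + w) * eps_cond - w * eps_uncond, applied elementwise.
    With w = 0 the conditional estimate is returned unchanged; larger w moves
    the result further away from the unconditional estimate.
    """
    if len(eps_cond) != len(eps_uncond):
        raise ValueError("estimates must have the same length")
    return [(1.0 + w) * c - w * u for c, u in zip(eps_cond, eps_uncond)]


if __name__ == "__main__":
    cond = [1.0, 2.0]     # hypothetical conditional estimate
    uncond = [0.5, 1.0]   # hypothetical unconditional estimate
    print(guided_estimate(cond, uncond, 0.0))
    print(guided_estimate(cond, uncond, 1.0))
```

In this sketch `w` is the single knob for the quality/diversity trade-off described in the abstract. In practice both estimates would come from one jointly trained network, evaluated with and without the conditioning signal.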

Image Super-Resolution via Iterative Refinement

Saharia, Ho, Chan et al.

2021

Preprint

130

Factors affecting the behavioral intention to adopt mobile banking: An international comparison

Lee

et al. 2020

Technology in Society

127

Video Diffusion Models

Ho, Salimans, Gritsenko et al.

2022

Preprint

Generating temporally coherent high fidelity video is an important milestone in generative modeling research. We make progress towards this milestone by proposing a diffusion model for video generation that shows very promising initial results. Our model is a natural extension of the standard image diffusion architecture, and it enables jointly training from image and video data, which we find to reduce the variance of minibatch gradients and speed up optimization. To generate long and higher resolution videos we introduce a new conditional sampling technique for spatial and temporal video extension that performs better than previously proposed methods. We present the first results on a large text-conditioned video generation task, as well as state-of-the-art results on an established unconditional video generation benchmark. Supplementary material is available at https://video-diffusion.github.io/.

Palette: Image-to-Image Diffusion Models

Saharia, Chan, Chang et al.

2021

Preprint

We introduce Palette, a simple and general framework for image-to-image translation using conditional diffusion models. On four challenging image-to-image translation tasks (colorization, inpainting, uncropping, and JPEG decompression), Palette outperforms strong GAN and regression baselines, and establishes a new state of the art. This is accomplished without task-specific hyper-parameter tuning, architecture customization, or any auxiliary loss, demonstrating a desirable degree of generality and flexibility. We uncover the impact of using L2 vs. L1 loss in the denoising diffusion objective on sample diversity, and demonstrate the importance of self-attention through empirical architecture studies. Importantly, we advocate a unified evaluation protocol based on ImageNet, and report several sample quality scores including FID, Inception Score, Classification Accuracy of a pre-trained ResNet-50, and Perceptual Distance against reference images for various baselines. We expect this standardized evaluation protocol to play a critical role in advancing image-to-image translation research. Finally, we show that a single generalist Palette model trained on 3 tasks (colorization, inpainting, JPEG decompression) performs as well as or better than task-specific specialist counterparts. Check out https://bit.ly/palette-diffusion for more details.
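To make the L2 vs. L1 choice concrete, the sketch below computes both losses on the residual between a predicted and a true noise vector. It only illustrates the two norms and is not Palette's training code; the function name, the mean reduction, and the sample values are assumptions.

```python
def lp_denoising_loss(predicted_noise, true_noise, p):
    """Mean of |predicted - true| ** p over all elements, for p in {1, 2}."""
    if p not in (1, 2):
        raise ValueError("p must be 1 (L1) or 2 (L2)")
    if not predicted_noise or len(predicted_noise) != len(true_noise):
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [abs(a - b) for a, b in zip(predicted_noise, true_noise)]
    return sum(r ** p for r in residuals) / len(residuals)


if __name__ == "__main__":
    predicted = [0.0, 3.0]   # hypothetical network output
    target = [0.0, 1.0]      # hypothetical sampled noise
    print(lp_denoising_loss(predicted, target, 1))  # L1 loss
    print(lp_denoising_loss(predicted, target, 2))  # L2 loss
```

The L2 loss grows quadratically with each residual while L1 grows linearly, so the two objectives weight large errors differently.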


Jonathan Ho

Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
