OpenAI Article · 19 January 2017

PixelCNN++: Improving the PixelCNN with discretized logistic mixture likelihood and other modifications

Read paper

Article details
AI maker: OpenAI · Type: Article · Published: 19 January 2017

Abstract

PixelCNNs are a recently proposed class of powerful generative models with tractable likelihood. Here we discuss our implementation of PixelCNNs, which we make available at this https URL. Our implementation contains a number of modifications to the original model that both simplify its structure and improve its performance. 1) We use a discretized logistic mixture likelihood on the pixels, rather than a 256-way softmax, which we find speeds up training. 2) We condition on whole pixels, rather than R/G/B sub-pixels, simplifying the model structure. 3) We use downsampling to efficiently capture structure at multiple resolutions. 4) We introduce additional short-cut connections to further speed up optimization. 5) We regularize the model using dropout. Finally, we present state-of-the-art log likelihood results on CIFAR-10 to demonstrate the usefulness of these modifications.
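The first modification replaces the 256-way softmax with a discretized logistic mixture: each pixel's probability is the mass a mixture of logistic distributions assigns to that pixel's intensity bin, computed as a difference of CDFs at the bin edges, with the edge bins (0 and 255) absorbing the tails. The sketch below is a minimal NumPy illustration of that idea, not the paper's implementation; the function name, the parameterization over [-1, 1], and the argument shapes are assumptions for clarity.

```python
import numpy as np

def sigmoid(z):
    # Logistic CDF.
    return 1.0 / (1.0 + np.exp(-z))

def discretized_logistic_mixture_prob(x, log_pi, mu, log_s):
    """Probability of an integer pixel value x in {0, ..., 255} under a
    K-component mixture of discretized logistic distributions.

    This is an illustrative sketch (hypothetical helper, not the paper's code).
    x      : integer pixel intensity in {0, ..., 255}
    log_pi : unnormalized log mixture weights, shape (K,)
    mu     : component means in [-1, 1], shape (K,)
    log_s  : component log scales, shape (K,)
    """
    # Rescale the pixel to [-1, 1]; one intensity step is 2/255 wide there.
    x_scaled = x / 127.5 - 1.0
    half_bin = 1.0 / 255.0

    inv_s = np.exp(-log_s)
    # Logistic CDF evaluated at the upper and lower edges of the pixel's bin.
    cdf_plus = sigmoid((x_scaled + half_bin - mu) * inv_s)
    cdf_minus = sigmoid((x_scaled - half_bin - mu) * inv_s)

    # Edge bins absorb all probability mass below 0 and above 255.
    if x == 0:
        probs = cdf_plus
    elif x == 255:
        probs = 1.0 - cdf_minus
    else:
        probs = cdf_plus - cdf_minus

    # Softmax over the mixture weights, then mix the per-component masses.
    pi = np.exp(log_pi - np.logaddexp.reduce(log_pi))
    return float(np.sum(pi * probs))
```

Because the bin probabilities are telescoping CDF differences and the edge bins capture the tails, the masses over all 256 intensities sum to one for any parameter setting, which is what makes this a valid likelihood to train against.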


Authors

Tim Salimans, Andrej Karpathy, Xi Chen, Durk Kingma

Related articles

View all

Hierarchical text-conditional image generation with CLIP latents Publication Apr 13, 2022

DALL·E: Creating images from text Milestone Jan 5, 2021

Image GPT Publication Jun 17, 2020


More from OpenAI

All updates
