OpenAI Article · 19 January 2017

PixelCNN++: Improving the PixelCNN with discretized logistic mixture likelihood and other modifications

Read paper

Article details
AI maker: OpenAI · Type: Article · Published: 19 January 2017

Abstract

PixelCNNs are a recently proposed class of powerful generative models with tractable likelihood. Here we discuss our implementation of PixelCNNs, which we make available at this https URL. Our implementation contains a number of modifications to the original model that both simplify its structure and improve its performance. 1) We use a discretized logistic mixture likelihood on the pixels, rather than a 256-way softmax, which we find speeds up training. 2) We condition on whole pixels, rather than R/G/B sub-pixels, simplifying the model structure. 3) We use downsampling to efficiently capture structure at multiple resolutions. 4) We introduce additional short-cut connections to further speed up optimization. 5) We regularize the model using dropout. Finally, we present state-of-the-art log likelihood results on CIFAR-10 to demonstrate the usefulness of these modifications.
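The first modification replaces the 256-way softmax with a discretized logistic mixture: each pixel's probability is the mass a mixture of logistic distributions assigns to that pixel's intensity bin, computed as a difference of CDFs at the bin edges, with the edge bins (0 and 255) absorbing the tails. The sketch below is a minimal NumPy illustration of that idea, not the paper's implementation; the function name, the parameterization over [-1, 1], and the argument shapes are assumptions for clarity.

```python
import numpy as np

def sigmoid(z):
    # Logistic CDF.
    return 1.0 / (1.0 + np.exp(-z))

def discretized_logistic_mixture_prob(x, log_pi, mu, log_s):
    """Probability of an integer pixel value x in {0, ..., 255} under a
    K-component mixture of discretized logistic distributions.

    This is an illustrative sketch (hypothetical helper, not the paper's code).
    x      : integer pixel intensity in {0, ..., 255}
    log_pi : unnormalized log mixture weights, shape (K,)
    mu     : component means in [-1, 1], shape (K,)
    log_s  : component log scales, shape (K,)
    """
    # Rescale the pixel to [-1, 1]; one intensity step is 2/255 wide there.
    x_scaled = x / 127.5 - 1.0
    half_bin = 1.0 / 255.0

    inv_s = np.exp(-log_s)
    # Logistic CDF evaluated at the upper and lower edges of the pixel's bin.
    cdf_plus = sigmoid((x_scaled + half_bin - mu) * inv_s)
    cdf_minus = sigmoid((x_scaled - half_bin - mu) * inv_s)

    # Edge bins absorb all probability mass below 0 and above 255.
    if x == 0:
        probs = cdf_plus
    elif x == 255:
        probs = 1.0 - cdf_minus
    else:
        probs = cdf_plus - cdf_minus

    # Softmax over the mixture weights, then mix the per-component masses.
    pi = np.exp(log_pi - np.logaddexp.reduce(log_pi))
    return float(np.sum(pi * probs))
```

Because the bin probabilities are telescoping CDF differences and the edge bins capture the tails, the masses over all 256 intensities sum to one for any parameter setting, which is what makes this a valid likelihood to train against.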


Authors

Tim Salimans, Andrej Karpathy, Xi Chen, Durk Kingma

Related articles

View all

Hierarchical text-conditional image generation with CLIP latents Publication Apr 13, 2022

DALL·E: Creating images from text Milestone Jan 5, 2021

Image GPT Publication Jun 17, 2020


More from OpenAI

All updates
