OpenAI ARTICLE 25 February 2016

Weight normalization: A simple reparameterization to accelerate training of deep neural networks

Read paper

Article details
AI maker: OpenAI · Type: Article · Published: 25 February 2016

Abstract

We present weight normalization: a reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction. By reparameterizing the weights in this way we improve the conditioning of the optimization problem and we speed up convergence of stochastic gradient descent. Our reparameterization is inspired by batch normalization but does not introduce any dependencies between the examples in a minibatch. This means that our method can also be applied successfully to recurrent models such as LSTMs and to noise-sensitive applications such as deep reinforcement learning or generative models, for which batch normalization is less well suited. Although our method is much simpler, it still provides much of the speed-up of full batch normalization. In addition, the computational overhead of our method is lower, permitting more optimization steps to be taken in the same amount of time. We demonstrate the usefulness of our method on applications in supervised image recognition, generative modelling, and deep reinforcement learning.
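
For a concrete picture of the reparameterization, the sketch below implements it for a single fully connected layer in NumPy: each weight vector w is expressed as w = (g / ‖v‖) v, so the scalar g carries the length of w while v carries only its direction, and gradient descent is then run on g and v instead of w. This is a minimal illustration written for this page (the class name WeightNormDense and the initialization constants are our own, not the authors' reference code); in particular, the data-dependent initialization of g and b described in the paper is omitted.

```python
import numpy as np

class WeightNormDense:
    """Dense layer with weight normalization: w = (g / ||v||) * v per output unit.

    Minimal sketch only; the paper additionally proposes a data-dependent
    initialization of g and b, which is omitted here.
    """

    def __init__(self, in_dim, out_dim, rng=np.random):
        # v holds the direction parameters (one row per output unit),
        # g the per-unit scale. These init constants are illustrative.
        self.v = rng.standard_normal((out_dim, in_dim)) * 0.05
        self.g = np.ones(out_dim)
        self.b = np.zeros(out_dim)

    def __call__(self, x):
        # Effective weights: each row of v rescaled to have length g,
        # so ||w_i|| = g_i exactly, decoupling length from direction.
        norm = np.linalg.norm(self.v, axis=1, keepdims=True)  # (out_dim, 1)
        w = self.g[:, None] * self.v / norm
        return x @ w.T + self.b

layer = WeightNormDense(4, 3, rng=np.random.default_rng(0))
y = layer(np.ones((2, 4)))
print(y.shape)  # (2, 3)
```

Because ‖w‖ equals g exactly, rescaling v leaves the layer's output unchanged; only the direction of v matters, which is what improves the conditioning of the optimization problem relative to training w directly.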


Authors

Tim Salimans, Durk Kingma

Related articles

View all

Embedding AI into developer software Mar 21, 2024

Building a data-driven, efficient culture with AI Mar 18, 2024

Reimagining the email experience with AI Mar 18, 2024
