The power of continuous learning

The power of continuous learning | OpenAI

Lilian Weng works on Applied AI Research at OpenAI.

Listen to article

Artificial general intelligence (AGI) should outperform humans at most economically valuable work. I’m looking forward to seeing AGI help human society in these ways:

1. Fully automate or significantly reduce human efforts on tasks that are repetitive and non-innovative. In other words, AGI should drastically boost human productivity.

2. Greatly expedite the discovery of new scientific breakthroughs, including but not limited to facilitating human decision making process by providing additional analyses and information.

3. Understand and interact with the physical world effectively, efficiently and safely.

During my first 2.5 years at OpenAI, I worked on the Robotics team on a moonshot idea: we wanted to teach a single, human-like robot hand to solve Rubik’s cube. It was a tremendously exciting, challenging, and emotional experience. We solved⁠ the challenge with deep reinforcement learning (RL), crazy amounts of domain randomization, and no real-world training data. More importantly, we conquered the challenge as a team.

From simulation and RL training to vision perception and hardware firmware, we collaborated so closely and cohesively. It was an amazing experiment and during that time, I often thought of Steve Jobs’ reality distortion field⁠(opens in a new window): when you believe in something so strongly and keep on pushing it so persistently, somehow you can make the impossible possible.

Since the beginning of 2021, I started leading the Applied AI Research team. Managing a team presents a different set of challenges and requires working style changes. I’m most proud of several projects related to language model safety within Applied AI:

1. We designed and constructed a set of evaluation data and tasks to assess the tendency of pre-trained language models to generate hateful, sexual, or violent content.

2. We created a detailed taxonomy and built a strong classifier to detect unwanted content⁠ as well as the reason why the content is inappropriate.

3. We are working on various techniques to make the model less likely to generate unsafe outputs.

As the Applied AI team is practicing the best way to deploy cutting-edge AI techniques, such as large pre-trained language models, we see how powerful and useful they are for real-world tasks. We are also aware of the importance of safely deploying the techniques, as emphasized in our Charter⁠.

Photo: Jake Stangel

Current deep learning models are not perfect. They are trained with a gigantic amount of data created by humans (e.g., on the Internet, curated, and literature) and unavoidably absorb a lot of flaws and biases that long exist in our society. For example, when DALL·E was asked to portray a nurse, it would only generate female characters, or for a professor, it would only generate white people. The model captures biases in real world statistics or biases in our training data.

I was motivated to design a method to mitigate this type of social bias and evaluate how efficient the method is. With the team, we designed a pipeline to reduce such bias as well as a workflow to run human-in-the-loop evaluation. Reducing social bias is not an easy problem, since it appears in many aspects of our lives and sometimes can be hard to notice. But I’m glad the DALL·E team treats the problem seriously and takes actions at a very early stage. What we have right now is just a start and we will keep making progress. I’m proud to work in this area and glad to see how, step by step, we are making modern AI safer and better.

> “Ideas in different topics or fields can often inspire new ideas and broaden the potential solution space.”

I believe in the power of learning and it is never too late to learn. Maintaining my personal blog is a good way to keep this curiosity going and learn about new progress in the deep learning community regularly. I also encourage my team to keep on learning, whether related or unrelated to their current projects. Ideas in different topics or fields can often inspire new ideas and broaden the potential solution space.

I’m also a strong believer in teamwork. If everyone shines in their best strength, we will get 1+1 > 2. Meanwhile, we might often run into “dirty” work and personally I’m very willing to take on those tasks, because as long as that’s the biggest blocker or that task can add the biggest value into the project, nothing should be considered “dirty” or “trivial.” I encourage people around me to do the same, being a team player and working together to expedite the team productivity.

It all starts as a set of personal learning notes. I didn’t enter the deep learning field super early and still considered myself a “newbie.” Initially as I started digging into so many papers, I was amazed by the concept of not designing an algorithm to solve a problem, but training a model to learn the algorithm to solve a problem. The more I read the more curious I become. Practically it became so difficult to organize all the papers I’ve read and new concepts I’ve learned. So I decided to start a blog to document and organize my learning notes. I also believe that the best way to learn something is to make sure you can teach others the knowledge correctly and clearly. Writing helps me get there.

I was not expecting it to become popular in the ML community, but whenever I got a thank you email or was told in person that they have learned a lot from reading my blog, I feel so honored and grateful. It has been almost 6 years since I started the blog in 2017 and I will keep it going as long as I can.

The AI community has made so much progress in recent years. The advancement in hardware, model architecture and data makes it possible to train gigantic models and, as a result, we keep seeing greater and greater capacities. I believe we are on the right track towards AGI, but scaling is not the only recipe. In my opinion the most urgent challenges right now are alignment and safety. To some extent, they may be the same issue about controllability or steerability.

First, even if we’ve already had an extremely powerful AI system in hand, if we cannot efficiently communicate our goals and make sure the model is aligned with what we want, it would not be possible to create as much value as we need. The current most powerful model learns from a gigantic amount of data and the dataset unavoidably captures imperfect flaws and biases in the real world. On this front, misaligned models carry safety concerns, as they are not aware of what should be avoided.

> “I believe we are on the right track towards AGI, but scaling is not the only recipe. The most urgent challenges right now are alignment and safety.”

This is not a particular piece of advice that someone gave me, but is based on my experience at OpenAI so far. That is, to think big. We are creating something new and we should be ambitious, brave, and take on enough persistence to carry on the efforts.

Books. I usually read books outside of the deep learning field and got inspired by a variety of fields; For example, how critical it is for a writer to be persistent in 50 years, for a surgeon to be perfectly detail-oriented, and for an entrepreneur to have “crazy ideas.”

People around me. I’m honored to work with a large group of extremely talented colleagues at OpenAI. Everyone has something sparkling, inspiring, or respectful and I enjoy learning from them.

The power of continuous learning

More from ChatGPT

New usage analytics and updated spend controls for enterprises

Improving health intelligence in ChatGPT

Using AI to help physicians diagnose rare genetic diseases affecting children

A near-autonomous AI chemist improves a challenging reaction in medicinal chemistry

Comments

The power of continuous learning

More from ChatGPT

New usage analytics and updated spend controls for enterprises

Improving health intelligence in ChatGPT

Using AI to help physicians diagnose rare genetic diseases affecting children

A near-autonomous AI chemist improves a challenging reaction in medicinal chemistry

Comments

The Next Input keeps optional media off until you say yes.