OpenAI updates
Browse every published OpenAI update in a calm card overview with images, dates, and direct access to each article.
OpenAI update
Dota 2 with large scale deep reinforcement learning
Read paper
OpenAI update
Deep double descent
Read paper
OpenAI update
Procgen Benchmark
Procgen Benchmark consists of 16 unique environments designed to measure both sample efficiency and generalization in reinforcement learning. This benchmark is ideal for evaluating generalization since distinct training and test sets can be generated in each environment. This benchmark is also well-suited to evaluate sample efficiency, since all environments pose diverse and compelling challenges for RL agents. The environments’ intrinsic diversity demands that agents learn robust policies; overfitting to narrow regions in state space will not suffice. Put differently, the ability to generalize becomes an integral component of success when agents are faced with ever-changing levels.
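The idea of generating distinct training and test sets per environment can be illustrated with a minimal sketch. It assumes levels are produced deterministically from integer seeds. The names generate_level, split_level_seeds, and generalization_gap are hypothetical helpers for illustration only and are not part of the Procgen API.

```python
import random


def generate_level(seed, size=8):
    """Hypothetical procedural generator: the same seed always yields the same level."""
    rng = random.Random(seed)
    return tuple(rng.randint(0, 3) for _ in range(size))


def split_level_seeds(num_train, num_test, start_seed=0):
    """Return two disjoint seed lists: training levels first, held-out test levels after."""
    train = list(range(start_seed, start_seed + num_train))
    test = list(range(start_seed + num_train, start_seed + num_train + num_test))
    return train, test


def generalization_gap(train_score, test_score):
    """Difference between the score on training levels and on unseen test levels."""
    return train_score - test_score


# Assumed sizes, chosen only for illustration.
train_seeds, test_seeds = split_level_seeds(num_train=200, num_test=50)
train_levels = [generate_level(s) for s in train_seeds]
test_levels = [generate_level(s) for s in test_seeds]
```

An agent would be trained only on train_levels and scored on test_levels; a large generalization_gap suggests overfitting to the training levels rather than a robust policy.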
OpenAI update
Safety Gym
To study constrained RL for safe exploration, we developed a new set of environments and tools called Safety Gym. Compared to existing environments for constrained RL, Safety Gym environments are richer and feature a wider range of difficulty and complexity.
OpenAI update
Benchmarking safe exploration in deep reinforcement learning
Read paper
OpenAI update
GPT-2: 1.5B release
OpenAI update
Solving Rubik’s Cube with a robot hand
Read paper, Watch all videos
OpenAI update
OpenAI Scholars 2020: Applications open
We are now accepting applications for our third class of OpenAI Scholars.
OpenAI update
Fine-tuning GPT-2 from human preferences
OpenAI update
Emergent tool use from multi-agent interaction
OpenAI update
Testing robustness against unforeseen adversaries
We’ve developed a method to assess whether a neural network classifier can reliably defend against adversarial attacks not seen during training. Our method yields a new metric, UAR (Unforeseen Attack Robustness), which evaluates the robustness of a single model against an unanticipated attack, and highlights the need to measure performance across a more diverse range of unforeseen attacks.
OpenAI update
GPT-2: 6-month follow-up
We’re releasing the 774 million parameter GPT-2 language model. This follows the release of our small 124M model in February, the staged release of our medium 355M model in May, and subsequent research with partners and the AI community into the model’s potential for misuse and societal benefit. We’re also releasing an open-source legal agreement to make it easier for organizations to initiate model-sharing partnerships with each other, and are publishing a technical report about our experience in coordinating with the wider AI research community on publication norms.
Showing 769 to 780 of 919 updates.