OpenAI · Article · 9 September 2025

SafetyKit scales risk agents with OpenAI’s most capable models

From prototyping with early vision model previews to scaling with GPT‑5, SafetyKit’s multimodal agents expand into new domains and boost accuracy.



Company size: Startup

Region: North America

Industry: Technology, Services

Products: API

Results

* 95%+: Accuracy reviewing 100% of customer content (SafetyKit evals)
* 16B: Tokens processed daily, up from 200M six months ago
* 10+: Point performance gains on hardest vision tasks with GPT‑5

SafetyKit builds multimodal AI agents to help marketplaces, payment platforms, and fintechs detect and act on fraud and prohibited activity across text, images, financial transactions, product listings, and more. Recent breakthroughs in model reasoning and multimodal understanding now make this more effective, setting a new bar for risk, compliance, and safety operations.

SafetyKit’s agents leverage GPT‑5, GPT‑4.1, deep research, and Computer Using Agent (CUA) to review 100% of customer content with over 95% accuracy, based on SafetyKit’s evals. They help platforms protect users, prevent fraud, avoid regulatory fines, and enforce complex policies that legacy systems may miss, such as region-specific rules, embedded phone numbers in scam images, or explicit content. Automation can also shield human moderators from exposure to offensive material, freeing them to handle nuanced policy decisions.

> “OpenAI gives us access to the most advanced reasoning and multimodal models on the market. It lets us adapt quickly, ship new agents faster, and handle content types other solutions can’t even parse.”

David Graunke, Founder and CEO of SafetyKit

Design agents for what the task demands, then choose the right model

SafetyKit’s agents are each built to handle a specific risk category, from scams to illegal products. Every piece of content is routed to the agent best suited for that violation, using the optimal OpenAI model:


* GPT‑5 applies multimodal reasoning across text, images, and UI to surface hidden risks and support layered, precise decision-making

* GPT‑4.1 reliably follows detailed content-policy instructions and efficiently manages high-volume moderation workflows

This model-matching approach lets SafetyKit scale content review across modalities with more nuance and accuracy than legacy solutions can.
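The routing described above can be sketched as a simple lookup from risk category to agent and model. This is an illustrative stand-in, not SafetyKit’s actual code; all names and the category-to-model mapping are assumptions based on the descriptions in this article.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    name: str
    model: str  # OpenAI model the agent calls under the hood

# Hypothetical routing table: reasoning-heavy, multimodal categories go to
# GPT-5; high-volume, instruction-following moderation goes to GPT-4.1.
ROUTING_TABLE = {
    "scam": Agent("scam_detection", "gpt-5"),
    "policy_disclosure": Agent("policy_disclosure", "gpt-5"),
    "spam": Agent("bulk_moderation", "gpt-4.1"),
    "profanity": Agent("bulk_moderation", "gpt-4.1"),
}

def route(risk_category: str) -> Agent:
    """Pick the agent (and thus model) for a detected risk category."""
    return ROUTING_TABLE.get(risk_category, Agent("general_review", "gpt-4.1"))
```

In a design like this, adding a new risk domain is a table entry plus an agent definition, which is consistent with how the article describes shipping new agents quickly.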

The Scam Detection agent, for example, goes beyond just scanning text. It analyzes visuals like QR codes or phone numbers embedded in product images. GPT‑4.1 helps it parse the image, understand the layout, and decide whether it is a policy violation.
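As a rough sketch of the final rule check in such a pipeline: once a vision model has extracted text from a product image, prohibited contact details can be flagged with simple pattern matching. This is a simplified, hypothetical stand-in; the patterns and labels are illustrative, not exhaustive, and the real judgment call is made by the model.

```python
import re

# Illustrative patterns for contact details a listing policy might prohibit.
PHONE_RE = re.compile(r"(?:\+?\d[\d\s().-]{7,}\d)")
URL_RE = re.compile(r"https?://\S+|\bwww\.\S+", re.IGNORECASE)

def flag_embedded_contacts(extracted_text: str) -> list[str]:
    """Return violation labels found in text extracted from an image."""
    violations = []
    if PHONE_RE.search(extracted_text):
        violations.append("embedded_phone_number")
    if URL_RE.search(extracted_text):
        violations.append("embedded_url")
    return violations
```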

The Policy Disclosure agent checks listings or landing pages for required language, such as legal disclaimers or region-specific compliance warnings. GPT‑4.1 extracts relevant sections, GPT‑5 evaluates compliance, and the agent flags violations.

“We think of our agents as purpose-built workflows,” says Graunke. “Some tasks require deep reasoning, others need multimodal context. OpenAI is the only stack that delivers reliable performance across both.”

Leverage GPT‑5 to navigate the gray areas and high-stakes decisions

Policy decisions often hinge on subtle distinctions. Take a marketplace requiring sellers to include a disclaimer for wellness products, with requirements varying based on product claims and regional rules. Legacy providers use keyword triggers or rigid rulesets, which can miss the deeper judgment calls these decisions require, leading to under- or over-enforcement.

SafetyKit’s Policy Disclosure agent first references policies from SafetyKit’s internal library; GPT‑5 then evaluates the content: does it mention treatment or prevention? Is it being sold in a region where disclosure is mandatory? And if so, is the required language actually included in the listing? If anything falls short, GPT‑5 returns a structured output the agent uses to flag the issue.
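To make the shape of that structured output concrete, here is a hypothetical sketch. In SafetyKit’s pipeline the three questions are answered by GPT‑5 grounded in real policy text; below, simple keyword rules stand in for the model so the example is self-contained. All names, regions, and disclaimer wording are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class DisclosureResult:
    makes_health_claim: bool      # does the listing mention treatment/prevention?
    disclosure_required: bool     # is disclosure mandatory in this region?
    disclosure_present: bool      # is the required language actually included?

    @property
    def violation(self) -> bool:
        return self.disclosure_required and not self.disclosure_present

MANDATORY_REGIONS = {"EU", "UK"}  # illustrative
DISCLAIMER = "not intended to diagnose, treat, cure, or prevent any disease"

def check_listing(text: str, region: str) -> DisclosureResult:
    # Keyword rules standing in for the model's reasoning step.
    claim = any(w in text.lower() for w in ("treat", "cure", "prevent"))
    required = claim and region in MANDATORY_REGIONS
    present = DISCLAIMER in text.lower()
    return DisclosureResult(claim, required, present)
```

Returning a typed result like this, rather than free text, is what lets the agent flag issues mechanically and keep decisions auditable.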

“The power of GPT‑5 is in how precisely it can reason when grounded in real policy,” notes Graunke. “It lets us make accurate, defensible decisions even in the edge cases where other systems fail.”

Turn every model release into a product win

SafetyKit benchmarks each new OpenAI model against its hardest cases, often deploying top performers the same day. Rigorous internal evaluations allow the team to quickly identify how new models can improve performance and seamlessly integrate into their core infrastructure.

When OpenAI o3 launched, SafetyKit used it to boost edge case performance across key policy areas. GPT‑5 followed, and within days, it was deployed across their most demanding agents, improving benchmark scores by more than 10 points on their toughest vision tasks.

> “OpenAI moves fast, and we’ve designed our system to keep up. Every new release gives us an operational edge, unlocking new capabilities and domains we couldn’t support before, and increasing the coverage and accuracy we deliver to customers.”

David Graunke, Founder and CEO of SafetyKit

SafetyKit also feeds improvements back into the ecosystem, sharing eval results, edge case failures, and policy-specific insights directly with OpenAI to help shape future model performance for safety-critical workloads.

Scale customer and volume growth with the best OpenAI stack

SafetyKit’s architecture enforces policy at scale, delivering speed, precision, and comprehensive risk coverage. Behind the scenes, it now handles over 16 billion tokens daily, up from 200 million six months ago, analyzing more content without sacrificing accuracy.

In that same time, SafetyKit has expanded to payments risk, fraud, anti-child exploitation, anti-money laundering, and new customers with hundreds of millions of end users under SafetyKit’s protection. This foundation empowers customers to respond swiftly and confidently to emerging risks.

“We’ve created a loop where every OpenAI release directly strengthens our capabilities,” says Graunke. “That’s why the system continually improves, always staying ahead of evolving risks.”

Results at a glance

* 95%+ accuracy reviewing 100% of customer content (SafetyKit evals)
* 16B tokens processed daily, up from 200M six months ago
* 10+ point benchmark gains on hardest vision tasks

Join the new era of work

More than 1 million businesses around the world are achieving meaningful results with OpenAI.

Contact sales
