FLUX ARTICLE · 24 February 2026

Capable, Open, and Safe: Combating AI Misuse

Today, Black Forest Labs’ FLUX models are among the most popular AI models for visual generation. We are excited to share early results that help validate our efforts to mitigate emerging risks.


Black Forest Labs is pushing the frontier of visual intelligence. Our team has released some of the most capable and most popular AI models globally. As we grow, we're taking steps to combat misuse, protect the community, and show that high performance, open innovation, and sensible safeguards go hand in hand. Today, we're excited to share early results that help validate our mitigations for emerging risks—including synthetic non-consensual intimate imagery (NCII) and child sexual abuse material (CSAM)—and will help us strengthen these safeguards in the future.

On third-party evaluations, our latest FLUX.2 model family demonstrates >10 times fewer vulnerabilities for serious risks than other leading open-weight image models, including those from large technology firms.

Our targeted post-training mitigations help to reduce vulnerabilities by 77-98% prior to release.

Alongside other safeguards, these mitigations can meaningfully reduce the risk of widespread misuse. For example, industry-standard moderation practices during deployment can eliminate most, if not all, residual vulnerabilities for NCII and CSAM.


Black Forest Labs is committed to open innovation

Black Forest Labs is committed to open research and development as the bedrock for competition, innovation, and security in AI. By sharing our research breakthroughs openly, we can help to accelerate the discovery of new techniques, methods, and architectures. By sharing our models openly, we enable developers around the world to build new tools that we can scarcely imagine today—from kitchen-table startups to the world’s largest enterprises. To date, our founding team has contributed three of the four most popular open-weight AI models on Hugging Face, totaling over 400 million downloads. Today, our FLUX family leads the field of open-weight image models.

We face unique challenges


We take our responsibility to mitigate emerging risks seriously. The properties that make open-weight models useful can also pose a unique challenge for risk management. These models can be deployed independently without oversight from the original developer and without appropriate safeguards, in some cases using consumer hardware. They can be modified or integrated with other systems for unauthorized purposes. If a vulnerability is discovered after release, it is not possible to fully withdraw all copies of the affected model.

We respond with layers of mitigation

While no single safeguard can prevent all misuse, layers of mitigation can help to prevent misuse at scale. Before each release, we evaluate a number of risks, including the production of unlawful content, with a focus on synthetic NCII and CSAM. We implement a series of pre-release mitigations in our models to help prevent misuse, alongside post-release safeguards that address residual vulnerabilities. These mitigations incorporate best practices outlined by nonprofit organizations such as Thorn, as well as agencies such as the U.S. National Institute of Standards and Technology and the UK Office of Communications.


Pre-training. We filter pre-training data for multiple categories of nude and pornographic material and known CSAM. Limiting exposure to this data in the first place can help prevent a user from generating unlawful content, whether by eliciting harmful features in the data or by combining lawful features into an unlawful composite image. We have partnered with the Internet Watch Foundation, an independent nonprofit organization dedicated to preventing online abuse, to filter known CSAM from the training data.
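
As an illustration, filtering for known material is commonly implemented as hash-list matching against vetted databases such as the IWF's. The sketch below is a minimal, hypothetical version using open-source perceptual hashes; the file names and list format are assumptions, and production pipelines use far more robust matching.

```python
# Minimal sketch of hash-list filtering for an image corpus.
# The blocklist format (one hex perceptual hash per line) is
# hypothetical; real deployments use vetted hash lists.
from pathlib import Path

from PIL import Image
import imagehash  # pip install ImageHash


def load_blocklist(path: str) -> set[imagehash.ImageHash]:
    lines = Path(path).read_text().splitlines()
    return {imagehash.hex_to_hash(line.strip()) for line in lines if line.strip()}


def is_blocked(image_path: Path, blocklist: set[imagehash.ImageHash],
               max_distance: int = 4) -> bool:
    # A small Hamming distance still counts as a match, so re-encoded
    # or resized copies of a listed image are caught as well.
    h = imagehash.phash(Image.open(image_path))
    return any(h - b <= max_distance for b in blocklist)


blocklist = load_blocklist("known_bad_hashes.txt")  # hypothetical file
kept = [p for p in Path("corpus").glob("*.jpg") if not is_blocked(p, blocklist)]
```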


Post-training. We undertake multiple rounds of targeted fine-tuning to provide additional mitigation against potential abuse, spanning both text-to-image (T2I) and image-to-image (I2I) attacks. By suppressing certain concepts in the trained model, these techniques can help prevent a user from generating synthetic NCII or CSAM from a text prompt, or from transforming an uploaded image into synthetic NCII or CSAM.
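
We do not detail our exact method here, but published concept-erasure techniques such as ESD give a sense of how suppression can work: a copy of the model is fine-tuned so that, when conditioned on a target concept, its prediction is steered away from that concept. The sketch below paraphrases that published objective with hypothetical stand-ins (`student`, `teacher`, `concept_cond`); it is not our training code.

```python
# Toy sketch of concept erasure via negative guidance, in the spirit
# of published methods such as ESD. `student` and `teacher` are
# hypothetical noise-prediction networks; `concept_cond` encodes the
# concept to suppress. This is not BFL's actual procedure.
import torch
import torch.nn.functional as F


def erasure_loss(student, teacher, x_t, t, concept_cond, eta=1.0):
    with torch.no_grad():
        eps_uncond = teacher(x_t, t, cond=None)           # unconditional
        eps_concept = teacher(x_t, t, cond=concept_cond)  # concept-conditioned
        # Train toward a target shifted *away* from the concept direction.
        target = eps_uncond - eta * (eps_concept - eps_uncond)
    return F.mse_loss(student(x_t, t, cond=concept_cond), target)
```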

Deployment. We release our most capable open-weight models with enforceable licenses that prohibit unlawful or infringing misuse, and require the use of filters during inference. With our open-weight models, we provide filters to help deployers detect violative or infringing activity. On our hosted services, we implement filters for a range of content types—including sexual content, hate, violence and gore, and representations of self-harm—and maintain a reporting relationship with the U.S. National Center for Missing and Exploited Children. Additionally, we apply content provenance metadata on our hosted services to help platforms and viewers identify AI-generated content once it is shared online. We include links to the Coalition for Content Provenance and Authenticity (C2PA) in our open-weight repositories to help other developers implement this metadata.
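
By way of illustration, an inference-time filter typically wraps generation with checks on both the request and the output. The sketch below is a generic, hypothetical wrapper; the classifier callables are placeholders, not the filters we ship.

```python
# Generic sketch of inference-time moderation. `classify_prompt` and
# `classify_image` are hypothetical callables returning (label, score).
from typing import Any, Callable, Tuple

Classifier = Callable[[Any], Tuple[str, float]]


def generate_safely(prompt: str, generate: Callable[[str], Any],
                    classify_prompt: Classifier,
                    classify_image: Classifier) -> Any:
    label, score = classify_prompt(prompt)
    if label != "safe" and score >= 0.8:
        # Refuse before spending compute on generation.
        raise PermissionError(f"prompt blocked: {label}")
    image = generate(prompt)
    label, score = classify_image(image)
    if label != "safe" and score >= 0.5:
        # Stricter threshold on outputs: never return a flagged image.
        raise PermissionError(f"output blocked: {label}")
    return image
```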

After deployment. We subsequently monitor for patterns of violative use in both our hosted services and the open developer community. We issue and escalate takedown requests to websites, services, or businesses that misuse our models. Additionally, we may ban users or developers we detect violating our policies. We provide a dedicated email hotline to solicit feedback from the community, and welcome ongoing engagement with authorities, developers, and researchers to share intelligence about emerging risks and effective mitigations.

Third-party testing informed our release decision

Throughout the development lifecycle, we conduct multiple internal and external evaluations to identify further opportunities for mitigation. For our latest open-weight model family, FLUX.2, we partnered with Cinder to conduct third-party red-teaming prior to each of our five open-weight model releases. These included FLUX.2 [dev]—a 32 billion parameter model based on a rectified flow transformer architecture that enables high-quality image generation and editing—and FLUX.2 [klein], a derivative series of four size-distilled and step-distilled models, ranging from 4 to 9 billion parameters, optimized for local deployment, faster inference, and improved photorealism.

Prior to release, we tasked Cinder with evaluating these models throughout their development lifecycle, including early, intermediate, and final checkpoints. These evaluations focused on identifying CSAM and NCII vulnerabilities across a range of T2I and I2I attacks. By observing the models’ behavior before, during, and after fine-tuning and distillation, we could better refine our mitigation strategy and make a considered release decision. We also instructed Cinder to run the same evaluation on leading open-weight models from other firms to help assess the marginal risk of our releases compared to the baseline.

Attacks included prompts that:

Directly attempt to elicit violative content;

Obscure a violative request in an otherwise benign context;

Obfuscate intent through indirect language, “l33t”, or scrambling;

Attempt to construct a violative image by assembling features that are individually nonviolative; and

Request analogous visual features or substitutes in place of violative terms.

Attacks also included image uploads that:

Undress, simulate, or reimagine an otherwise clothed individual;

De-age an individual;

Splice together multiple individuals to produce a violative composite figure; and

Merge multiple images into a violative scene.
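
To make the obfuscation categories above concrete, here is a deliberately benign sketch of how a red-team harness might expand a single seed phrase into variants. The helper names are hypothetical, not Cinder's tooling, and real evaluations use vetted seeds under strict controls.

```python
# Sketch of a red-team prompt mutator covering several attack
# categories above, applied to a benign placeholder seed.
import random

LEET = str.maketrans({"a": "4", "e": "3", "i": "1", "o": "0", "s": "5"})


def scramble(word: str) -> str:
    # Shuffle a word's interior letters, keeping it human-readable.
    if len(word) <= 3:
        return word
    middle = list(word[1:-1])
    random.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def variants(seed: str) -> list[str]:
    return [
        seed,                                         # direct attempt
        f"a museum placard describing {seed}",        # benign framing
        seed.translate(LEET),                         # "l33t" obfuscation
        " ".join(scramble(w) for w in seed.split()),  # scrambled words
    ]


print(variants("a red fox in the snow"))
```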

Three findings stood out:

Comparative risk. Our five models demonstrated over 10 times fewer vulnerabilities than other popular open-weight models, indicating higher robustness to misuse. These include models recently developed or funded by large technology firms with substantial resources, such as Alibaba, Tencent, and ByteDance.

Progressive mitigation. Our post-training mitigations yielded a 77-98 percent reduction in vulnerabilities compared to earlier checkpoints. Importantly, our most lightweight and efficient [klein] models—those likely to experience the widest adoption for local inference—demonstrated the fewest vulnerabilities.

Residual vulnerabilities. Subsequent evaluations with Cinder suggested that residual vulnerabilities can be nearly eliminated in deployment through the adoption of industry-standard moderation practices.
