Consensus uses GPT-5 and the Responses API to complete weeks of research in minutes

Consensus uses GPT-5 and the Responses API to complete weeks of research in minutes | OpenAI

[](https://openai.com/)

* Research

* Products

* Business

* Developers

* Company

* Foundation(opens in a new window)

Log inTry ChatGPT(opens in a new window)

Try ChatGPT(opens in a new window)Login

OpenAI

Table of contents

* Developers

* Company

* Foundation(opens in a new window)

Try ChatGPT(opens in a new window)Login

October 23, 2025

Consensus uses GPT‑5 and the Responses API to complete weeks of research in minutes

Using GPT‑5 and the Responses API, Consensus designed a multi-agent system that plans, reads, and synthesizes evidence the way researchers do.

Every year, millions of new scientific papers are published—far more than any one person can read.

For scientists, the challenge isn’t access to knowledge but the overwhelming task of finding, interpreting, and connecting it. Breakthroughs happen at the edge of what’s known, yet researchers spend most of their time just finding the edges instead of pushing past them.

Consensus⁠(opens in a new window), a research assistant used by more than 8 million people, was built to change that. Founded by Christian Salem and Eric Olson, the platform searches, reads, and synthesizes peer-reviewed literature across more than 220 million papers. Its newest capability, Scholar Agent, is a multi-agent system built on GPT‑5 and the Responses API. It mirrors how researchers actually work, helping them get from question to conclusion in minutes instead of weeks.

But the goal isn’t just faster research—it’s a faster path to discovery. “Science advances when it’s more accessible,” Salem says. “Our job is to give researchers everywhere the ability to find, trust, and act on evidence.”

From search engine to agentic assistant

The first version of Consensus worked like a vertical search engine for science: it indexed academic papers, retrieved relevant results, and generated summaries grounded in citations. But search alone wasn’t enough.

“Research isn’t just finding papers,” Salem says. “It’s interpreting results, comparing findings, and connecting ideas. The more time scientists spend searching, reading, and interpreting past knowledge for the right study, the less time they have to discover and createdo real research.”

So the team began re-architecting Consensus around a new concept: a multi-agent system called “Scholar Agent” that works the way a human researcher does.

Built on GPT‑5 and the Responses API, the system now runs a coordinated workflow of agents:

From search engine to agentic assistant

Each agent has a narrow scope, which keeps reasoning precise and minimizes hallucinations. The architecture also allows Consensus to decide when _not_ to answer; if no relevant studies meet its quality threshold, the assistant simply says so.

“By dividing the workflow across agents, we reduce error and make the system far more disciplined,” Salem says. “No one agent has too much responsibility, which turns out to be key for reliability.”

This approach is what the team calls context engineering: assembling the right evidence _before_ generation begins. Every answer comes with a “research context pack”—a structured bundle of papers, metadata, and key findings that trace back to original studies.

“We don’t want researchers wasting time double-checking every claim,” Salem says. “If the system can’t ground an answer in real evidence, it won’t make one up.”

Building with Responses API

Consensus migrated from Chat Completions to the Responses API to support its multi-agent routing. The switch improved both reliability and cost efficiency, giving the team finer control over sub-agent calls. With GPT‑5 long-context reasoning and reliable tool-calling, the choice was clear.

Early evaluations confirmed the bet: GPT‑5 outperformed GPT‑4.1, Sonnet 4, and Gemini 2.5 Pro on tool-calling accuracy and planning stability. That allowed the Consensus team to focus less on prompt gymnastics and more on building agent behaviors that map directly to research workflows.

A consumer bet in a world of institutions

From the beginning, Consensus approached the market differently than expected. Rather than selling through institutions, the team focused on the people doing the research itself: students, faculty, and clinicians who needed answers today. That direct-to-researcher focus shaped both the product’s design and its rapid growth.

“Everyone said you can’t go direct-to-consumer in academia, but AI has changed that,” Salem says. “People don’t wait for approval anymore—they use what works.”

That decision shaped the product’s tone and growth curve. Consensus feels more like a modern consumer app than a traditional academic tool: fast onboarding, intuitive design, conversational interface. Adoption spread through word of mouth across campuses and labs.

Graduate students and PhD candidates became the first power users, followed by faculty and private researchers. Then came clinicians, who began using Consensus to surface the latest evidence in their fields.

“We didn’t set out to build for doctors,” Salem says. “But they need the same thing researchers do: fast access to reliable evidence.”

The company recently signed the Mayo Clinic’s medical library and just launched ‘Medical Mode,’ a new feature designed for practitioners searching for clinical evidence.

Scaling with the science

In the past year, Consensus has expanded rapidly, growing to more than 8 million researchers worldwide and increasing revenue by 8x.

That growth hasn’t changed the product’s priorities. Every feature still revolves around verifiable, low-hallucination answers. The team has invested heavily in evaluation pipelines that test accuracy, citation traceability, and stylistic consistency across agents.

Consensus’s architecture is intentionally modular, designed so new agents can slot in as models expand and improve—agents that replicate experiments, generate figures, or run statistical analyses.

“We’re building the assistant researchers actually need in a rapidly changing world,” Salem says. “The models keep getting better, the system grows with them, and science moves faster.”

Join the communityStart building(opens in a new window)

The company recently signed the Mayo Clinic’s medical library and just launched ‘Medical Mode,’ a new feature designed for practitioners searching for clinical evidence.

Scaling with the science

In the past year, Consensus has expanded rapidly, growing to more than 8 million researchers worldwide and increasing revenue by 8x.

“We’re building the assistant researchers actually need in a rapidly changing world,” Salem says. “The models keep getting better, the system grows with them, and science moves faster.”

OpenAI

Join the communityStart building(opens in a new window)

Consensus uses GPT-5 and the Responses API to complete weeks of research in minutes

From search engine to agentic assistant

Graduate students and PhD candidates became the first power users, followed by faculty and private researchers. Then came clinicians, who began using Consensus to surface the latest evidence in their fields.

Scaling with the science

Scaling with the science

OpenAI

More from ChatGPT

New usage analytics and updated spend controls for enterprises

Improving health intelligence in ChatGPT

Using AI to help physicians diagnose rare genetic diseases affecting children

A near-autonomous AI chemist improves a challenging reaction in medicinal chemistry

Comments

Consensus uses GPT-5 and the Responses API to complete weeks of research in minutes

From search engine to agentic assistant

Graduate students and PhD candidates became the first power users, followed by faculty and private researchers. Then came clinicians, who began using Consensus to surface the latest evidence in their fields.

Scaling with the science

Scaling with the science

OpenAI

More from ChatGPT

New usage analytics and updated spend controls for enterprises

Improving health intelligence in ChatGPT

Using AI to help physicians diagnose rare genetic diseases affecting children

A near-autonomous AI chemist improves a challenging reaction in medicinal chemistry

Comments

The Next Input keeps optional media off until you say yes.