The OpenAI Agents SDK is OpenAI's first-party answer to "how do I build a multi-agent app on OpenAI?" It replaced their earlier experimental SDK (Swarm) in early 2025 and ships in both Python (openai-agents) and TypeScript (@openai/agents). The framework's self-stated design principle: "Enough features to be worth using, but few enough primitives to make it quick to learn."
That posture shows up in the API. Where LangGraph gives you a graph builder and ADK gives you four agent types, OpenAI gives you exactly five concepts: agents, runners, tools, handoffs, and guardrails. That's the whole framework.
The mental model: agents that hand off
An Agent is an LLM with instructions, a list of tools, and a list of handoffs (other agents it can transfer the conversation to). A Runner drives the agent loop: the model picks a tool or a handoff, the runtime executes, the result feeds back, and the loop continues until the agent produces a final output.
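That loop is simple enough to sketch in plain Python. The snippet below is an illustration of the control flow only, not the SDK's implementation: `run_loop`, `ToolCall`, `FinalOutput`, and `fake_model` are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ToolCall:       # the model wants a tool executed
    name: str
    args: dict

@dataclass
class FinalOutput:    # the model is done; the loop ends
    text: str

def run_loop(model: Callable, tools: dict[str, Callable], user_input: str) -> str:
    """Drive the agent loop: call the model, execute tools, feed results back."""
    messages = [{"role": "user", "content": user_input}]
    while True:
        step = model(messages)
        if isinstance(step, FinalOutput):
            return step.text
        # step is a ToolCall: execute it and append the result for the next turn
        result = tools[step.name](**step.args)
        messages.append({"role": "tool", "name": step.name, "content": result})

# A scripted fake model: first asks for a tool, then produces a final answer.
def fake_model(messages):
    if messages[-1]["role"] == "tool":
        return FinalOutput(f"Invoice says: {messages[-1]['content']}")
    return ToolCall("lookup_invoice", {"invoice_id": "123"})

print(run_loop(fake_model,
               {"lookup_invoice": lambda invoice_id: f"#{invoice_id}: $42 due"},
               "check invoice 123"))
```

A handoff fits the same loop: it is just another tool call whose "result" is swapping in a different agent's instructions and tools.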
The killer feature is the handoff model. From the docs: "Handoffs are represented as tools to the LLM. So if there's a handoff to an agent named Refund Agent, the tool would be called transfer_to_refund_agent." The LLM doesn't need to know it's "handing off" — it just calls a tool. The framework handles the rest.
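The naming convention from the docs is mechanical enough to reproduce; here is a hypothetical helper (not an SDK function) that mirrors the `Refund Agent` → `transfer_to_refund_agent` example:

```python
def handoff_tool_name(agent_name: str) -> str:
    """Mimic the SDK's convention: 'Refund Agent' -> 'transfer_to_refund_agent'."""
    return "transfer_to_" + agent_name.strip().lower().replace(" ", "_")

print(handoff_tool_name("Refund Agent"))  # transfer_to_refund_agent
```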
Where ADK gives you composition primitives and asks you to wire them up, OpenAI gives you handoffs and asks you to write good agent descriptions. The handoffs find their own way.
The five primitives
- Agent — name, instructions, tools, handoffs, model, optional output_type for structured output.
- Runner — drives the loop. Runner.run(agent, input) is async, Runner.run_sync(...) blocks, Runner.run_streamed(...) yields streaming events.
- function_tool — decorator that turns any Python or TypeScript function into a tool. Name and description are inferred from the function and its docstring.
- handoff() — helper for customizing transfers (override the tool name, intercept the handoff, filter the input passed to the next agent).
- @input_guardrail / @output_guardrail — decorators that gate agent input or output. A tripwire raises and stops the run.
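The tripwire semantics are worth seeing concretely. Here is a plain-Python model of the idea, using made-up names (`GuardrailOutput`, `TripwireError`, `run_with_guardrails`), not the SDK's types: a guardrail returns a verdict, and if the tripwire is set, the run stops with an exception before the agent ever sees the input.

```python
from dataclasses import dataclass

@dataclass
class GuardrailOutput:
    tripwire_triggered: bool
    info: str = ""

class TripwireError(Exception):
    """Raised when a guardrail trips; the run stops before the model is called."""

def run_with_guardrails(guardrails, agent_fn, user_input: str) -> str:
    for guard in guardrails:
        verdict = guard(user_input)
        if verdict.tripwire_triggered:
            raise TripwireError(verdict.info)
    return agent_fn(user_input)

def is_support_request(text: str) -> GuardrailOutput:
    on_topic = any(w in text.lower() for w in ("invoice", "refund", "error"))
    return GuardrailOutput(tripwire_triggered=not on_topic, info="off-topic request")

# On-topic input reaches the agent; off-topic input raises TripwireError.
print(run_with_guardrails([is_support_request], lambda t: "routing: " + t,
                          "refund invoice 123"))
```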
Tools: yours and OpenAI's
Custom tools are simple — decorate a function with @function_tool and pass it via tools=[...]. The model gets a tool with the function's name, the docstring as description, and a JSON schema derived from the type signature.
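The schema inference is easy to sketch with the standard library. This simplified version (my own, covering only str/int/float/bool parameters, not the SDK's implementation) shows where each piece of the tool spec comes from:

```python
import inspect
from typing import get_type_hints

_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def tool_schema(fn) -> dict:
    """Derive a tool spec from a function: name, docstring description, JSON schema."""
    hints = get_type_hints(fn)
    params = {
        name: {"type": _JSON_TYPES[hints[name]]}
        for name in inspect.signature(fn).parameters
    }
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "parameters": {"type": "object", "properties": params, "required": list(params)},
    }

def lookup_invoice(invoice_id: str) -> str:
    """Return the invoice details for the given ID."""
    ...

print(tool_schema(lookup_invoice))
```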
Where the SDK gets distinctive is the hosted tools — tools OpenAI runs in their infrastructure that you pass to the agent like any other:
- WebSearchTool() — web search
- FileSearchTool(vector_store_ids=[...]) — RAG over OpenAI Vector Stores
- CodeInterpreterTool — sandboxed Python
- ComputerTool — computer use (browser/desktop control)
- ImageGenerationTool — image generation
- HostedMCPTool — remote MCP servers
These require the Responses API. You don't implement them, you just enable them. For teams committed to OpenAI's stack, this is the highest-leverage feature — no building your own browser-control tool, no managing your own vector store.
A worked example: a triage agent with three handoffs
The canonical multi-agent shape: a triage agent that reads an incoming request and hands off to a specialist. Below: a customer-support setup with three specialists (billing, refund, technical) and a guardrail that blocks off-topic requests at the front door.
The code

```python
from agents import Agent, GuardrailFunctionOutput, Runner, function_tool, input_guardrail

@function_tool
def lookup_invoice(invoice_id: str) -> str:
    """Return the invoice details for the given ID."""
    ...

@function_tool
def issue_refund(invoice_id: str) -> str:
    """Issue a refund for the given invoice."""
    ...

@function_tool
def search_docs(query: str) -> str:
    """Search the technical documentation."""
    ...

@input_guardrail
async def is_support_request(ctx, agent, input):
    # ...classify the input; trip the wire on off-topic requests
    return GuardrailFunctionOutput(output_info=None, tripwire_triggered=False)

billing = Agent(name="Billing agent", tools=[lookup_invoice])
refund = Agent(name="Refund agent", tools=[issue_refund])
tech = Agent(name="Technical agent", tools=[search_docs])

triage = Agent(
    name="Triage agent",
    instructions="Route the customer to the right specialist.",
    handoffs=[billing, refund, tech],
    input_guardrails=[is_support_request],
)

# Inside an async function:
result = await Runner.run(triage, "I want a refund for invoice 123")
```

That's a complete multi-agent customer-support application. The guardrail rejects off-topic requests before they reach the model. The triage agent routes via handoff. The chosen specialist runs its tool. The result comes back. Tracing happens automatically — open platform.openai.com/traces and the entire run is there, agent-by-agent, tool-by-tool.
Tracing: the unfair advantage
Most agent frameworks bolt observability on as a separate product. OpenAI's SDK ships with tracing on by default, flowing to OpenAI's Traces dashboard, with no setup. Custom spans via trace() and custom_span() let you wrap business operations and see them inline with model calls.
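The nesting is the useful part: a custom span wrapping a business operation shows up inside the same tree as the model and tool calls it triggers. A plain-Python sketch of that structure (my own context manager, not the SDK's trace()/custom_span() API):

```python
import time
from contextlib import contextmanager

SPANS = []   # collected (name, depth, duration_ms) records
_depth = 0

@contextmanager
def span(name: str):
    """Record a named, nested span, mimicking how trace()/custom_span() nest."""
    global _depth
    start, _depth = time.perf_counter(), _depth + 1
    try:
        yield
    finally:
        _depth -= 1
        SPANS.append((name, _depth, (time.perf_counter() - start) * 1000))

with span("workflow: support run"):
    with span("agent: Triage agent"):
        with span("tool: lookup_invoice"):
            pass  # business logic would run here

# Spans complete innermost-first; reverse to print the tree top-down.
for name, depth, ms in reversed(SPANS):
    print("  " * depth + f"{name} ({ms:.1f} ms)")
```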
The dashboard shows every agent, every handoff, every tool call, every token, every guardrail decision, with timings. For an agency client, this is the difference between "we think the agent is doing X" and "here's the screenshot of it doing X." If you're selling agent outcomes, that distinction is the whole job.
Voice agents and the realtime API
The SDK has first-class support for voice via OpenAI's realtime models. You build a voice agent the same way you build a text one — same Agent, same tools, same handoffs — and the runtime handles streaming audio in and out. For voice-first products (kiosks, call centers, accessibility tools), this is meaningful. The SDK is the only major framework with voice as a peer to text.
Where it sits next to the others
- OpenAI Agents SDK — fastest to ship if you're committed to OpenAI's stack. Hosted tools and tracing are real differentiators. Less flexible than LangGraph, less framework-y than ADK.
- LangGraph — the right answer when your problem is a graph or you need durable execution.
- Google ADK — the right answer when you need explicit workflow agents (sequential / parallel / loop) as first-class citizens.
- Claude Agent SDK — the right answer when one well-instrumented agent with strong hooks carries the task.
Where on-device fits
OpenAI doesn't ship on-device models. The Agents SDK runs server-side, full stop. The hybrid pattern — on-device UX layer, OpenAI Agents SDK behind it for heavy orchestration — works the same way as it does with the other cloud frameworks. The thing the OpenAI SDK uniquely buys you in this hybrid is the hosted tools (web search, computer use, code interpreter) — capabilities that'd be very hard to ship on-device and that you'd want behind a secure cloud boundary anyway.
What to do with this
If your team is already OpenAI-shaped — using GPT models, Vector Stores, the Responses API — the Agents SDK is the smallest step to a real multi-agent product. The five primitives are learnable in an afternoon, the hosted tools cover most enterprise needs, and tracing is included.
The honest tradeoff: you trade flexibility for speed. If you need durable execution, time-travel debugging, or heterogeneous models, you will outgrow this SDK. That's fine — start here, graduate to LangGraph if and when the constraints bite.
Further reading
- OpenAI Agents SDK — Python docs — the official entry point.
- openai/openai-agents-python on GitHub — Python SDK source.
- openai/openai-agents-js on GitHub — TypeScript SDK source.
- New tools for building agents — OpenAI's announcement post (covers Responses API + Agents SDK + hosted tools).
