
Claude Managed Agents: A Practical Playbook

Anthropic now runs the harness so you don't have to. Here's the concrete path from API key to a production agent.

April 14, 2026 · 8 min read · Guide
TL;DR

Claude Managed Agents ships the sandbox, session state, tool execution, and event stream as a hosted service. You describe the agent; Anthropic runs it. This post covers four concepts (Agent, Environment, Session, Events), a ten-minute quick start, and patterns for outcomes and multiagent delegation.

If you've built agents on the raw Messages API, you already know the tax: your own tool-call loop, your own sandbox, your own retry logic, your own secrets store. Months of plumbing before the agent does anything interesting. Claude Managed Agents deletes that list.

You describe the agent. Anthropic runs it — secure containers, long-lived sessions, built-in code execution, streaming events. This post is the short version of what it is, when to use it, and how to ship your first one today.

Messages API vs Managed Agents

They solve different problems. Pick based on how much control you actually need.

Two deeper reasons to prefer the managed path. First, self-written harnesses bake in assumptions about what the model can't do — those assumptions go stale with every release. Managed Agents updates the harness for you. Second, task horizons are growing fast. Anthropic expects future Claude versions to work for days or weeks on a single task. That demands fault-tolerant infrastructure you probably don't want to maintain.

Four concepts that run the whole system

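Learn these four and the rest follows: the Agent (model, system prompt, and tools), the Environment (container, packages, and networking), the Session (one run of an agent inside an environment), and Events (the messages you send in and the stream you read back).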

Access. Managed Agents is in open beta. All endpoints require the managed-agents-2026-04-01 header (the SDK adds it automatically). Outcomes, Multiagent, and Memory are separate research previews — request them individually.

Your first agent in ten minutes

Install

# CLI
brew install anthropics/tap/ant

# SDK
pip install anthropic          # Python
npm install @anthropic-ai/sdk  # TypeScript

export ANTHROPIC_API_KEY="your-api-key-here"

If you already use Claude Code, there's a faster path: run claude, then /claude-api managed-agents-onboarding. The built-in skill walks you through everything below.

Create the agent

from anthropic import Anthropic

client = Anthropic()

agent = client.beta.agents.create(
    name="Coding Assistant",
    model="claude-sonnet-4-6",
    system="You are a helpful coding assistant.",
    tools=[{"type": "agent_toolset_20260401"}],
)

agent_toolset_20260401 bundles bash, read, write, edit, glob, grep, web_fetch, and web_search. One line, full toolbox.

Create an environment and start a session

env = client.beta.environments.create(
    name="dev-env",
    config={"type": "cloud", "networking": {"type": "unrestricted"}},
)

session = client.beta.sessions.create(
    agent=agent.id,
    environment_id=env.id,
    title="My first session",
)

Send a task and stream the result

with client.beta.sessions.events.stream(session.id) as stream:
    client.beta.sessions.events.send(
        session.id,
        events=[{
            "type": "user.message",
            "content": [{"type": "text",
                         "text": "Generate the first 20 Fibonacci "
                                 "numbers and save to fibonacci.txt"}],
        }],
    )
    for event in stream:
        # The session goes idle once the agent has finished the task.
        if event.type == "session.status_idle":
            break

Under the hood: a container spins up, Claude picks tools, the tool calls execute in the sandbox, and results stream back. No Docker config, no tool dispatcher, no recovery logic on your side.
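The streaming loop above discards every event except the idle signal. If you want to keep what the agent emitted, one option is to pull the loop body into a small helper that you can dry-run without a live session. This is a minimal sketch: `collect_until_idle` is a hypothetical helper name, and the `agent.message` type on the stand-in events is a placeholder rather than a type confirmed in this post. Only `session.status_idle` comes from the snippet above.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def collect_until_idle(events: Iterable[Any]) -> list[Any]:
    """Return every event seen before the first session.status_idle event."""
    seen = []
    for event in events:
        if event.type == "session.status_idle":
            break
        seen.append(event)
    return seen


# Stand-in events for a dry run. "agent.message" is a placeholder type name.
fake_stream = [
    SimpleNamespace(type="agent.message"),
    SimpleNamespace(type="agent.message"),
    SimpleNamespace(type="session.status_idle"),
    SimpleNamespace(type="agent.message"),  # never reached
]
print(len(collect_until_idle(fake_stream)))  # prints 2
```

In real use you would pass the `stream` object from the `with` block in place of `fake_stream`.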

Built-in tools and how to scope them

The toolset gives you eight capabilities by default. You'll usually want to narrow it down for production agents.

Disable specific tools:

{
  "type": "agent_toolset_20260401",
  "configs": [
    {"name": "web_fetch", "enabled": false},
    {"name": "web_search", "enabled": false}
  ]
}

Or flip the default and enable only what you need:

{
  "type": "agent_toolset_20260401",
  "default_config": {"enabled": false},
  "configs": [
    {"name": "bash", "enabled": true},
    {"name": "read", "enabled": true},
    {"name": "write", "enabled": true}
  ]
}
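If you build these configs in Python rather than writing JSON by hand, a small helper can produce the allowlist form and catch misspelled tool names early. This is a sketch, not part of the SDK: `allowlist_toolset` is a hypothetical name, and the set of valid names is copied from the eight built-ins listed earlier in this post.

```python
from typing import Any

TOOLSET_TYPE = "agent_toolset_20260401"
# The eight built-in tools named earlier in this post.
BUILTIN_TOOLS = {
    "bash", "read", "write", "edit", "glob", "grep", "web_fetch", "web_search",
}


def allowlist_toolset(enabled: list[str]) -> dict[str, Any]:
    """Build a toolset config that disables everything except `enabled`."""
    unknown = set(enabled) - BUILTIN_TOOLS
    if unknown:
        raise ValueError(f"unknown built-in tools: {sorted(unknown)}")
    return {
        "type": TOOLSET_TYPE,
        "default_config": {"enabled": False},
        "configs": [{"name": name, "enabled": True} for name in enabled],
    }


toolset = allowlist_toolset(["bash", "read", "write"])
```

The resulting `toolset` dict matches the second JSON example above and can go straight into the `tools=[...]` list when creating the agent.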

Custom tools slot in alongside the built-ins. Four rules that matter more than anything else:

Permissions: always_allow vs always_ask

Two modes, and you can mix them per tool.

A common shape: reads and web search set to always_allow, bash set to always_ask. This is one of the things that makes Managed Agents more production-ready than LangGraph, CrewAI, or AutoGen, none of which ships permissions out of the box.
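The "reads and web search auto, bash gated" shape could be expressed as per-tool entries on the toolset. Treat the following as a sketch under assumptions: the `permission_policy` key and its `{"type": ...}` shape are not shown anywhere in this post, so check the API reference for the real field name before using it. `permission_configs` is a hypothetical helper; only the two mode names come from the text.

```python
from typing import Any

VALID_POLICIES = {"always_allow", "always_ask"}


def permission_configs(policy_by_tool: dict[str, str]) -> list[dict[str, Any]]:
    """Build per-tool config entries, rejecting unknown policy names."""
    configs = []
    for name, policy in policy_by_tool.items():
        if policy not in VALID_POLICIES:
            raise ValueError(f"{name}: unknown policy {policy!r}")
        # ASSUMPTION: "permission_policy" is an illustrative field name,
        # not one confirmed by this post.
        configs.append({"name": name, "permission_policy": {"type": policy}})
    return configs


toolset = {
    "type": "agent_toolset_20260401",
    "configs": permission_configs(
        {"read": "always_allow", "web_search": "always_allow", "bash": "always_ask"}
    ),
}
```

Validating the mode names locally means a typo such as `"allways_ask"` fails in your deploy script rather than silently leaving a tool ungated.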

Usage patterns to steal

Deploy pattern that works. Keep agent templates as YAML in git. Apply them with the CLI from your deploy pipeline. Run sessions with the SDK at runtime. Config lives in version control; runtime stays lightweight.

Outcomes: turn conversations into jobs

Outcomes (research preview) promote a session from chat to job. You supply a rubric. The system spins up a separate grader in its own context — unbiased by the main agent's decisions — and scores each iteration. The agent keeps working until it passes or hits max_iterations.

Rubric rule: concrete and checkable wins.

client.beta.sessions.events.send(
    session_id=session.id,
    events=[{
        "type": "user.define_outcome",
        "description": "Build a DCF model for Costco in .xlsx",
        "rubric": {"type": "text", "content": RUBRIC},
        "max_iterations": 5,
    }],
)

Outputs land in /mnt/session/outputs/. Pull them via the Files API once the session reports satisfied.
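The snippet above references a `RUBRIC` constant without defining it. Here is one hypothetical shape for it, written so that each line is something a grader could check mechanically. The rubric items are illustrative only, and `define_outcome_event` is a made-up helper that assembles the same event dict shown above; the `max_iterations` check is a local sanity guard, not a documented API constraint.

```python
from typing import Any

# Hypothetical rubric: each line is a concrete, checkable condition.
RUBRIC = """\
- The deliverable is a single .xlsx file saved under /mnt/session/outputs/.
- The workbook has separate sheets for assumptions, projections, and valuation.
- Every projected figure is a formula that references the assumptions sheet.
- The discount rate and terminal growth rate appear as labeled input cells.
"""


def define_outcome_event(
    description: str, rubric: str, max_iterations: int
) -> dict[str, Any]:
    """Assemble a user.define_outcome event in the shape shown above."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    return {
        "type": "user.define_outcome",
        "description": description,
        "rubric": {"type": "text", "content": rubric},
        "max_iterations": max_iterations,
    }


event = define_outcome_event("Build a DCF model for Costco in .xlsx", RUBRIC, 5)
```

Compare a vague criterion such as "the model should be high quality": a grader has nothing to test, so the agent has nothing specific to iterate toward.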

Multiagent delegation

A coordinator agent can call other agents listed in its callable_agents. Each runs in its own thread with isolated context, but they share one container and filesystem. Useful for code review with read-only tools, test generation in a sandboxed agent, or research agents with web access.

One limit: delegation is one level deep. The coordinator calls sub-agents; sub-agents can't chain further. Design accordingly.
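If you keep your delegation plan in code or config, you can enforce the one-level rule before anything is deployed. The sketch below uses a plain dict mapping each agent name to the agents it may call. That dict is an illustration for planning purposes, not the schema of `callable_agents`, and `check_one_level` and the agent names are hypothetical.

```python
def check_one_level(
    coordinator: str, callable_agents: dict[str, list[str]]
) -> list[str]:
    """Return the coordinator's sub-agents, or raise if any delegates further."""
    sub_agents = callable_agents.get(coordinator, [])
    for sub in sub_agents:
        nested = callable_agents.get(sub, [])
        if nested:
            raise ValueError(
                f"{sub} cannot delegate to {nested}: delegation is one level deep"
            )
    return sub_agents


# Hypothetical plan: one coordinator, three leaf agents.
plan = {
    "coordinator": ["reviewer", "test_writer", "researcher"],
    "reviewer": [],
}
print(check_one_level("coordinator", plan))
# prints ['reviewer', 'test_writer', 'researcher']
```

If a design seems to need deeper chaining, the usual fix is to flatten it: let the coordinator call every specialist directly and pass results between them itself.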

Architecture in one paragraph

Anthropic deliberately split the system into three replaceable pieces: the brain (Claude plus harness), the hands (sandboxes and tools), and the session (an event log). Each is an interface with minimal assumptions about the others. A failure in one doesn't kill the system, execution is isolated from context, and you can plug in new harnesses as they evolve. Prompt caching, compaction, and automatic recovery are all built in.

Cost and limits

Standard API token rates plus $0.08 per active session hour. At that rate, a ten-minute coding session adds about 1.3 cents of runtime on top of tokens. API rate limits: 60 writes/min for resource creation, 600 reads/min for gets, lists, and streaming. Your org and plan limits stack on top.
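The runtime charge is simple to estimate. This sketch uses only the $0.08 per active session hour figure quoted above and ignores token costs, which depend on the model and workload; `session_runtime_cost` is a hypothetical helper name.

```python
SESSION_HOUR_RATE_USD = 0.08  # per active session hour, as quoted above


def session_runtime_cost(active_minutes: float) -> float:
    """Runtime charge in USD for a session, excluding token costs."""
    return SESSION_HOUR_RATE_USD * active_minutes / 60


print(f"${session_runtime_cost(10):.4f}")  # prints $0.0133
```

Token spend will usually dominate the bill, so treat this figure as the floor, not the estimate.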

Launch checklist

  1. Get an API key at console.anthropic.com
  2. Install the CLI or SDK
  3. Export ANTHROPIC_API_KEY
  4. Create an agent (model + prompt + tools)
  5. Create an environment (container + packages + networking)
  6. Start a session and send a task
  7. Handle the event stream in your app

The bottom line

Managed Agents is the shortest path from "we want an agent" to "it's running in production." If your infrastructure is the blocker, switch. If you need a custom harness and total control, stay on Messages. Either way, hand-building a sandbox and event log is no longer the sensible default.

For the full command reference, see the Claude Code cheatsheet, or browse other deep dives on the blog index.
