SERIES: ANTHROPIC CLAUDE | 07 of 10
Autonomous AI That Plans, Acts & Collaborates
Building intelligent systems with Claude's agentic capabilities
An AI agent is a system that uses an LLM to autonomously decide which actions to take, execute those actions, observe the results, and iterate toward a goal.
| Capability | Chatbot | Agent |
|---|---|---|
| Tool usage | None or limited | Extensive, dynamic tool selection |
| Planning | Single response | Multi-step plan generation |
| Iteration | Stateless per turn | Loops until goal is achieved |
| Error recovery | User must re-prompt | Self-correcting behavior |
Every agent follows a fundamental observe → think → act → iterate cycle until the task is complete or a stopping condition is met.
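The loop above can be sketched in plain Python. Here `llm_decide` is a stand-in for a real model call (it returns the next action name and its arguments given the history so far), and `tools` is a dict of plain callables; both names are illustrative, not SDK API.

```python
# Minimal sketch of the observe -> think -> act -> iterate loop.
def run_agent_loop(goal, llm_decide, tools, max_steps=10):
    """Iterate until the model signals completion or the step budget runs out."""
    history = [("goal", goal)]
    for _ in range(max_steps):
        action, args = llm_decide(history)       # think
        if action == "finish":                   # stopping condition met
            return args
        observation = tools[action](**args)      # act
        history.append((action, observation))    # observe, then iterate
    return None                                  # budget exhausted
```

A stub `llm_decide` that calls one tool and then finishes is enough to exercise the whole cycle.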
Claude is purpose-built for agentic workloads — its architecture and training make it one of the strongest foundation models for autonomous task execution.
Claude naturally breaks down complex tasks into sub-tasks, validates intermediate results, backtracks when errors are detected, and provides reasoning transparency through extended thinking.
The agents.md (or AGENTS.md) file tells Claude who it is, what it can do, and how it should behave within a project. It lives at the root of your repository.
# AGENTS.md
## Identity
You are a senior backend engineer assistant
specializing in Python microservices.
## Capabilities
- Read and modify files in src/
- Run tests with pytest
- Access the staging database (read-only)
- Create pull requests via GitHub CLI
## Guidelines
- Always run tests before committing
- Follow the project's PEP 8 style guide
- Never modify production configs
- Ask before deleting files
## Project Context
- Python 3.12 + FastAPI
- PostgreSQL with SQLAlchemy ORM
- Docker Compose for local dev
AGENTS.md is a markdown file that gives Claude persistent context about your project and the agent's role, organized into Identity, Capabilities, and Guidelines sections.
Claude Code automatically reads AGENTS.md, .claude/agents.md, and any AGENTS.md files in parent directories, merging them hierarchically.
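The hierarchical merge can be sketched as follows; `collect_agents_md` is a hypothetical helper illustrating the precedence behavior (nearer files override further ones), not Claude Code's actual implementation.

```python
from pathlib import Path

def collect_agents_md(start_dir: Path) -> str:
    # Files nearer the working directory are appended last, so their
    # instructions take precedence when the merged text is read top-down.
    chain = [start_dir, *start_dir.parents]             # cwd up to the filesystem root
    candidates = [p / "AGENTS.md" for p in reversed(chain)]
    return "\n\n".join(f.read_text() for f in candidates if f.is_file())
```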
The Anthropic Agent SDK provides a Python framework for building production-grade agents with Claude, including tool management, orchestration, and guardrails.
# Install the SDK
pip install anthropic-agents
# Or with extras
pip install anthropic-agents[all]
# Verify installation
python -c "import agents; print('OK')"
Core SDK primitives:
- Agent — core agent class
- Runner — execution engine
- Tool — tool definitions
- Guardrail — safety checks
- Handoff — agent delegation

from agents import Agent, Runner, tool
@tool
def read_file(path: str) -> str:
"""Read contents of a file."""
with open(path, 'r') as f:
return f.read()
@tool
def write_file(path: str, content: str) -> str:
"""Write content to a file."""
with open(path, 'w') as f:
f.write(content)
return f"Wrote {len(content)} characters to {path}"
@tool
def run_tests() -> str:
"""Run the project test suite."""
import subprocess
result = subprocess.run(
["pytest", "--tb=short", "-q"],
capture_output=True, text=True
)
return result.stdout + result.stderr
# Define the agent
coding_agent = Agent(
name="CodingAssistant",
instructions="""You are a Python developer.
Read files, make changes, then run tests
to verify your work.""",
tools=[read_file, write_file, run_tests],
model="claude-sonnet-4-20250514",
)
# Run the agent
result = Runner.run_sync(
coding_agent,
"Fix the bug in src/parser.py where "
"empty input causes a crash"
)
print(result.final_output)
The building blocks: the @tool decorator, an Agent with instructions, and Runner.run_sync(). The agent enters its loop: reads the file, identifies the bug, writes a fix, runs tests, and iterates until tests pass.
Tools give agents the ability to interact with the real world. Each tool has a name, description, input schema, and an execution function.
# Simple tool with the @tool decorator
@tool
def search_codebase(
query: str,
file_type: str = "py"
) -> str:
"""Search the codebase for a pattern.
Args:
query: Regex pattern to search for
file_type: File extension filter
"""
import subprocess
result = subprocess.run(
["rg", query, "--type", file_type,
"--json"],
capture_output=True, text=True
)
return result.stdout
# Tool with complex schema
@tool
def create_github_issue(
title: str,
body: str,
labels: list[str] | None = None,
assignees: list[str] | None = None  # None instead of [] avoids shared mutable defaults
) -> str:
"""Create a GitHub issue."""
# Implementation here
return f"Created issue: {title}"
Use the @tool decorator for simplicity; the SDK derives a JSON schema like this from the signature:
{
"name": "search_codebase",
"description": "Search the codebase...",
"parameters": {
"query": {"type": "string"},
"file_type": {
"type": "string",
"default": "py"
}
}
}
Agent decides to call tool → SDK validates inputs → function executes → result returned to agent → agent continues reasoning
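That validate-then-execute cycle can be sketched without the SDK. `dispatch_tool_call` is a hypothetical helper; it binds arguments against the Python signature rather than a full JSON schema, which real SDKs would use.

```python
import inspect

def dispatch_tool_call(tools: dict, name: str, arguments: dict) -> dict:
    # Sketch of: agent requests a call -> inputs validated -> function
    # executes -> result (or error) is returned to the agent.
    if name not in tools:
        return {"error": f"unknown tool: {name}"}
    try:
        inspect.signature(tools[name]).bind(**arguments)  # reject missing/extra params
    except TypeError as exc:
        return {"error": str(exc)}                        # fed back so the agent can retry
    return {"result": tools[name](**arguments)}
```

Returning errors as data (rather than raising) lets the agent observe the failure and correct its next call.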
Complex tasks benefit from multiple specialized agents working together. Three primary patterns dominate multi-agent design.
- Orchestrator-worker: a central agent delegates subtasks to specialized workers
- Pipeline: each agent processes output from the previous stage sequentially
- Peer-to-peer: agents communicate directly, with no central coordinator
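The pipeline pattern, for instance, reduces to a loop that threads each agent's output into the next stage. Here `run` stands in for the SDK's runner call; the helper name is illustrative.

```python
import asyncio

async def run_pipeline(agents, initial_input, run):
    # Each stage consumes the previous stage's output sequentially.
    output = initial_input
    for agent in agents:
        output = await run(agent, output)
    return output
```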
Multi-agent systems need well-defined communication protocols to share information, delegate work, and coordinate effectively.
result = await runner.handoff(
target=review_agent,
message="Review this PR",
context={"diff": diff}
)
ctx = SharedContext()
ctx.set("plan", plan_output)
# Other agents can read:
plan = ctx.get("plan")
agent = Agent(
name="Router",
handoffs=[
billing_agent,
support_agent,
technical_agent,
]
)
Claude Code is itself a fully-featured AI agent — it uses the agent loop pattern with a rich set of built-in tools to accomplish software engineering tasks.
| Tool | Purpose |
|---|---|
| Read | Read files from the filesystem |
| Edit | Make precise edits to files |
| Write | Create new files |
| Bash | Execute shell commands |
| Glob | Find files by pattern |
| Grep | Search file contents |
Claude Code can spawn sub-agents for specialized tasks — a dedicated agent for exploration, another for planning, and task-specific workers. This enables parallel investigation and division of labor within a single session.
Sub-agents are specialized, scoped instances that handle specific aspects of a larger task. They inherit context but have focused instructions and limited tool access.
- Explorer: Read, Glob, Grep
- Planner: Read (for validation)
- Worker: Read, Edit, Write, Bash
explorer = Agent(name="Explorer", tools=[Read, Glob, Grep],
instructions="Find all usages of the deprecated API...")
planner = Agent(name="Planner", tools=[Read],
instructions="Create a migration plan based on findings...")
worker = Agent(name="Worker", tools=[Read, Edit, Write, Bash],
instructions="Execute step N of the migration plan...")
# Orchestrator delegates sequentially
findings = await Runner.run(explorer, "Map deprecated API usage")
plan = await Runner.run(planner, f"Plan migration: {findings.output}")
result = await Runner.run(worker, f"Execute: {plan.output}")
Orchestration manages the lifecycle, coordination, and execution of multiple agents working toward a shared objective.
from agents import Agent, Runner
orchestrator = Agent(
name="ProjectManager",
instructions="""You coordinate a team:
- frontend_agent: React components
- backend_agent: API endpoints
- test_agent: Integration tests
Delegate tasks appropriately.""",
handoffs=[
frontend_agent,
backend_agent,
test_agent,
]
)
# The orchestrator decides who to call
result = await Runner.run(
orchestrator,
"Add a user profile page with API"
)
import asyncio
tasks = [
Runner.run(lint_agent, code),
Runner.run(test_agent, code),
Runner.run(docs_agent, code),
]
results = await asyncio.gather(*tasks)
Autonomous agents require robust safety measures. Guardrails prevent harmful actions, enforce policies, and maintain human oversight.
import re

from agents import Agent, guardrail, GuardrailResult

@guardrail
async def no_secrets_guardrail(ctx, agent, input_text) -> GuardrailResult:
    """Prevent the agent from outputting secrets or credentials."""
    patterns = [r"sk-[a-zA-Z0-9]{48}", r"AKIA[0-9A-Z]{16}", r"ghp_[a-zA-Z0-9]{36}"]
    for pattern in patterns:
        if re.search(pattern, input_text):
            return GuardrailResult(should_block=True, message="Blocked: potential secret detected")
    return GuardrailResult(should_block=False)

agent = Agent(name="SafeAgent", guardrails=[no_secrets_guardrail])
Effective agents maintain context across interactions. Memory systems range from simple conversation history to persistent knowledge bases.
Claude Code stores persistent memory under ~/.claude/memory.
# Working memory (within session)
agent.context["current_task"] = task
agent.context["findings"] = []
# Persistent memory (across sessions)
memory_store = VectorMemory(
path="./agent_memory",
embedding_model="voyage-3"
)
memory_store.add("User prefers tabs")
relevant = memory_store.search(query)
- Code review agent: Tools: GitHub API, Read, Grep; Model: Claude Sonnet (cost-effective)
- Data & reporting agent: Tools: SQL, Python, Slack API; Model: Claude Haiku (low-latency)
- Customer support agent: Tools: KB Search, CRM, Email; Model: Claude Sonnet + handoffs
All three follow the same pattern: trigger event → gather context → reason about action → execute with tools → verify & report. The difference is in the tools available and the domain-specific instructions.
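That shared shape can be written as one generic handler with pluggable stages; every callable below is a placeholder for real tool and model calls, and the helper name is illustrative.

```python
def handle_event(event, gather, reason, execute, verify, report):
    # Generic shape: trigger event -> gather context -> reason about
    # action -> execute with tools -> verify & report.
    context = gather(event)
    action = reason(event, context)
    outcome = execute(action)
    ok = verify(outcome)
    report(event, outcome, ok)
    return ok
```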
Agents are harder to debug than traditional software because behavior emerges from LLM reasoning. Observability is the key to understanding agent decisions.
from agents import Agent, Runner
from agents.tracing import TracingConfig
# Enable detailed tracing
config = TracingConfig(
log_level="DEBUG",
trace_tools=True,
trace_reasoning=True,
output_dir="./traces"
)
result = await Runner.run(
agent, prompt,
tracing=config
)
# Inspect the trace
for step in result.trace.steps:
print(f"[{step.type}] {step.summary}")
if step.type == "tool_call":
print(f" Tool: {step.tool_name}")
print(f" Input: {step.input}")
print(f" Output: {step.output[:200]}")
elif step.type == "reasoning":
print(f" Thought: {step.text[:200]}")
Agent systems involve tradeoffs between latency, cost, and accuracy. Understanding these tradeoffs is critical for production deployments.
| Strategy | Latency Impact | Cost Impact | Accuracy Impact |
|---|---|---|---|
| Model routing (Haiku/Sonnet/Opus) | -40% latency | -60% cost | -10% accuracy |
| Parallel tool execution | -50% latency | Neutral | Neutral |
| Context summarization | -20% latency | -30% cost | -5% accuracy |
| Verification loop | +30% latency | +25% cost | +15% accuracy |
| Extended thinking | +20% latency | +15% cost | +20% accuracy |
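Model routing, the first row above, can be as simple as a heuristic function. This is a sketch: the thresholds are made up, and the Haiku/Opus model ids are assumptions patterned on the Sonnet id used earlier, not official guidance.

```python
def pick_model(task: dict) -> str:
    # Illustrative routing heuristics; tune thresholds for your workload.
    if task.get("needs_deep_reasoning"):
        return "claude-opus-4-20250514"       # deepest reasoning, highest cost
    if len(task.get("prompt", "")) < 200 and not task.get("uses_tools"):
        return "claude-3-5-haiku-latest"      # short, tool-free work: cheap and fast
    return "claude-sonnet-4-20250514"         # sensible default
```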
Moving agents from prototype to production requires attention to scaling, monitoring, and compliance.
# Agent deployment architecture
services:
agent-gateway:
image: agent-gateway:latest
replicas: 3
environment:
- ANTHROPIC_API_KEY=${API_KEY}
- MAX_CONCURRENT_AGENTS=50
agent-worker:
image: agent-worker:latest
replicas: 10
resources:
limits: { memory: "2Gi", cpu: "1" }
redis:
image: redis:7-alpine # Task queue
Agent capabilities are expanding rapidly. The next wave brings more autonomy, richer modalities, and collaborative agent ecosystems.
- More autonomy: already emerging with Claude Code
- Richer modalities: Claude's vision capabilities enable this today
- Collaborative agent ecosystems: research frontier, early experiments underway
- Longer autonomy windows — agents running for hours, not minutes.
- Better tool ecosystems — standardized tool interfaces (MCP, now under the Linux Foundation's Agentic AI Foundation).
- Agent-to-agent protocols — standardized communication.
- Formal verification — proving agent safety properties mathematically.
Key takeaways: AGENTS.md defines agent identity & behavior; keep agents.md alongside your code.
Next: 08 — Model Context Protocol (MCP)