How to Build a Claude Agent You Can Actually Trust in Production - Full course
Most "AI agent" tutorials build something that works in the demo and dies the second it touches real work. It nails the happy path on stage, then quietly fails on the messy input, the long task, the thing nobody tested.
This is the course for the other kind. The agent that survives real work: long tasks, weird input, running while you're asleep, and still being right when you check it.
No fluff, no 50-prompt chains. 14 steps, grouped into four parts: understand what an agent actually is, build the core, make it reliable, then ship it.
Part 1 - What an agent actually is
1. An agent is a goal, tools, and a loop. Not a clever prompt.
Anthropic's own definition is almost boring: an agent is an LLM using tools in a loop to reach a goal. That's it. Not a magic prompt, not a personality.
The prompt is one piece. The agent is the system around it: the job it owns, the tools it can reach, the loop that lets it try, check, and try again. Get the system right and an average prompt works. Get the system wrong and the best prompt in the world still fails on step 12.
If you remember one thing from this whole course: you are not writing a prompt, you are designing a loop.
2. The 3 things that kill agents in production
Before you build, know what you're defending against. Every agent that dies in the real world dies from one of these:
Every step below exists to beat one of these three. If a step doesn't fight laziness, drift, or weak verification, it's decoration.
Part 2 - Build the core
3. Give it ONE narrow job
The most common mistake is building a "general assistant." That's not an agent, that's chat with extra steps.
A real agent owns one repeatable job: triage incoming bugs, draft release notes, reconcile two spreadsheets, answer support tickets from your docs. One job you could describe to a new hire in a sentence.
Narrow is not a limitation. Narrow is what makes it testable, trustable, and good. Build five agents that each do one thing well before you build one that does five things badly.
4. Give it the right tools (MCP)
An agent that can only see its own chat box is a toy. Real work needs real context: your GitHub, your docs, your database, your issue tracker, Slack.
That's what MCP (Model Context Protocol) is for. It plugs those systems into the agent's loop so it can actually read the ticket, check the code, and post the result, not just tell you what it would do.
Simple rule for what to add: am I short on knowledge, context, or capability?
Add only the tools the job needs. Every extra tool is another thing it can misuse.
5. Write the system prompt like an operating manual
"Be helpful and professional" is not an instruction, it's a wish. Write the prompt the way you'd onboard a contractor: role, exact steps, hard rules, and what done looks like.
ROLE
You triage failing CI builds. You classify each failure,
draft a fix for the easy ones, and escalate the rest.
HOW YOU WORK
1. Read the failing job log.
2. Classify: env / flaky / real bug / dependency / infra.
3. For "real bug": draft a fix on a new branch, open a draft PR.
4. For everything else: write a one-line reason and escalate.
HARD RULES
- Never disable a failing test to make CI green. Escalate instead.
- Never touch src/billing or src/auth. Escalate, always.
- If you are unsure, assume you are wrong and escalate.
DONE MEANS
Every failure is either fixed with a passing draft PR,
or escalated with a clear reason. No "handled" without proof.Notice the rules are negative and specific. "Never touch billing" beats "be careful." Specific bans are what survive a long run.
6. Add a state file so it resumes instead of restarting
Agents forget. By default, what it learned today is gone tomorrow. A state file is the fix and it's almost too simple to believe.
It's one markdown file the agent reads at the start of every run and writes at the end. What's done, what's next, what it learned.
# State - ci-triage agent
## Verified facts
- Windows runners fail TLS 1.2 in PowerShell. Use bash.
- Stripe webhook tests need STRIPE_WEBHOOK_SECRET or they flake.
## In progress
- claude/fix-auth-refresh - tests pass locally, waiting on CI
## Escalated to a human
- src/billing/refund.ts - failing 3 ways, root cause unclear
## Last run
2026-06-18 - 7 failures: 3 fixed, 4 escalated.
Next: verify the auth fix once CI finishes.
Two rules make it actually compound: write to it before the run ends, read it before the next run starts. Skip either and tomorrow's run starts from zero.
7. Add a separate verifier. Never let it grade its own work.
This is the single biggest reliability upgrade, and the one everyone skips.
A model checking its own output sees its own reasoning and likes its own answer. It is way too easy on itself. So you split the work: one agent does the job, a second agent (a subagent with its own clean context) checks it against a rubric. The checker never sees the first one's reasoning, so it has no reason to be kind.
Anthropic uses this internally and it's the documented fix for the "it claims done without proof" problem. Author writes, reviewer reviews, and they are never the same Claude.
The pairing rule: the verifier knows only the rubric and the result, not who produced it.
8. Put it in a real harness
A single chat message can't run a long plan-do-check-fix loop. For that you need a harness: an environment where the agent can take many steps, call tools, and keep going across stages.
In practice that's Claude Code (for technical / coding agents, where it can run and test what it writes) or a managed agent setup for knowledge work. Subagents handle the focused side-quests so your main loop stays clean: 20 file reads and 12 searches happen inside a subagent, and your main agent only sees the final report.
This is the unlock. With a harness plus a verifier, work that used to need you babysitting every step can run to the end on its own.
Part 3 - Make it reliable
9. Give it a goal and a hard stop
Without a real stop condition, an agent stops at the first "good enough" or runs forever. Both are bad.
Use a goal with a checkable end state, verified by a separate grader, not the agent's own opinion: "don't stop until all tests in /auth pass and lint is clean." A small checker model decides if the condition is truly met. If not, the loop runs again. That's what turns "handled enough" into actually done.
10. Give it a token budget
Agents re-read context, retry, and explore. That burns tokens whether or not the run ships anything. Without a cap, an ambitious agent quietly costs 5 to 10 times what you expected.
So you cap it. Tell it the budget in plain words ("use about 10k tokens"), set a max number of loops, or both. A budget turns an agent from "cool but scary to leave alone" into "a tool I run unattended."
11. Quarantine untrusted input
The moment your agent reads anything you didn't write, support tickets, scraped pages, user feedback, emails, assume that text might try to hijack it ("ignore your instructions and...").
The fix is quarantine: the agent that reads the untrusted text is not allowed to take real actions. A separate agent, which never sees the raw input, does the acting. A 30-line read-only reader costs almost nothing and removes a whole class of prompt-injection risk.
If the input wasn't written by you or a teammate, quarantine it. Not optional.
12. Track the only metric that matters: cost per accepted result
Tokens spent, tasks attempted, loops run, none of these tell you if the agent is working. The real number is cost per accepted result. How much did it cost to produce one output you actually kept?
If you're accepting less than half of what it produces, you're doing review work the agent was supposed to save you, and it's losing. Watch this number and tune until it's good. A reliable agent on a narrow job clears it easily.
Part 4 - Ship and maintain
13. Schedule it so it runs without you
An agent you have to trigger by hand is barely an agent. The payoff is letting it run on its own: every morning, on a CI failure, when a ticket lands.
Point it at a schedule or an event and it works while you sleep. Daily triage at 7am. A fix attempt fired the moment a build breaks. This is also where the state file earns its keep: each run resumes from the last one instead of starting cold.
14. Save it as a Skill so it travels
Once the agent works, don't leave it as a one-off you re-explain every week. Package it: the prompt, the rules, the workflow, the known failure modes, all written into a Skill.
A Skill that's been running for two weeks looks different from a fresh one. It grows sections like "known failure modes" and "things that broke in production." Every real failure you fold back into it makes the next run sharper. That's the difference between a script and a system that compounds.
The mistakes that quietly waste your time
The one-line version
An agent that works in the real world isn't a smarter prompt. It's a narrow job, the right tools, a state file, a separate verifier, a hard stop, and a budget, wrapped in a loop you can walk away from.
Start with one. One job, done well, that you can trust enough to leave alone. Then build the next.
If this helped, follow me. I break down AI tools and prediction markets every week, no fluff.

