Everyone's asking "WTF is a loop?" Here's the question nobody's asking: what runs the loop?

The AI discourse has converged on loops as a core primitive of agentic systems. Matt Van Horn (@mvanhorn) traced the progression from ReAct to tool-use to orchestration loops to loops supervising loops. Addy Osmani (@addyosmani) broke down the building blocks inside loops: automations, worktrees, skills, connectors, sub-agents. Van Horn landed on durability, arguing that loops which can't survive a restart aren't loops. Osmani's key thread was orchestration: design the system that prompts the agent instead of you.
I want to take their points further. Durability isn't just a property of the loop. It's the entire execution layer underneath it. Durable orchestration is the foundation of an agent loop architecture, not an add-on to it. Let's break down that architecture.
# Where loops break
The /loop and /goal patterns handle single-agent, single-session work well. An agent loops until a task is done. That covers a lot of ground. But the next stage (Stage 5 in Van Horn's framing) is where it falls apart:
• Loops supervising other loops
• Loops running on schedules, not just triggered by a human
• Loops that survive process restarts, deploys, and crashes
• Loops that spawn sub-agents and wait for results (sometimes hours later)
• Loops that need to be observable after the fact
That's not a prompting problem. That's an infrastructure problem.
Van Horn cites @runes_leo: "The costliest thing in AI coding is no longer writing code, it's managing the agent loop." A while True in a terminal doesn't give you any of this. Neither does a long-running process on a VM or sandbox.
Think about what happens when you run an agent loop on a server. The process will die or restart. A deploy, an OOM, a spot instance reclamation. The loop comes back up. But what was it doing? Which step was it on? Did it already send that Slack message? Did it already invoke the sub-agent?
You don't know. It starts over. Re-fetches data it already had. Re-calls the LLM for decisions it already made. Sends a duplicate notification. Spawns a duplicate sub-agent. You wake up to three identical Slack messages and a confused team.
The fix isn't "better error handling." It's an execution model where each step is checkpointed, each decision is persisted, and recovery means resuming from the last successful step.
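To make that execution model concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `CheckpointedRun` class, the JSON file used as a checkpoint store, and the `fetch` / `decide` / `notify` step names are all hypothetical, and a production system would more likely rely on a durable execution engine than a local file. The point is only to show the shape: each step's result is persisted as soon as it completes, and a restarted run replays saved results instead of redoing the work.

```python
import json
import tempfile
from pathlib import Path


class CheckpointedRun:
    """Persists each step's result so a restarted run skips completed steps."""

    def __init__(self, path):
        self.path = Path(path)
        # Load results left behind by a previous (possibly crashed) run.
        self.results = json.loads(self.path.read_text()) if self.path.exists() else {}

    def step(self, name, fn):
        # A completed step returns its saved result; fn is not executed again.
        if name in self.results:
            return self.results[name]
        result = fn()
        self.results[name] = result
        # Write to a temp file and rename, so a crash mid-write
        # cannot leave a half-written checkpoint behind.
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(self.results))
        tmp.replace(self.path)
        return result


calls = {}  # how many times each step body actually executed
sent = []   # stands in for a real side effect, e.g. a Slack message


def tracked(name, fn):
    """Wrap fn so that every real execution is counted in `calls`."""
    def wrapper():
        calls[name] = calls.get(name, 0) + 1
        return fn()
    return wrapper


def run_loop(checkpoint_path, crash_before_notify=False):
    run = CheckpointedRun(checkpoint_path)
    data = run.step("fetch", tracked("fetch", lambda: {"open_tickets": 3}))
    decision = run.step(
        "decide",
        tracked("decide", lambda: "notify" if data["open_tickets"] else "skip"),
    )
    if crash_before_notify:
        raise RuntimeError("simulated process death")
    if decision == "notify":
        run.step("notify", lambda: sent.append("3 open tickets") or "sent")
    return decision


if __name__ == "__main__":
    path = Path(tempfile.mkdtemp()) / "run.json"
    try:
        run_loop(path, crash_before_notify=True)  # dies after "decide"
    except RuntimeError:
        pass
    run_loop(path)  # resumes: "fetch" and "decide" are not re-executed
    run_loop(path)  # "notify" is already checkpointed, so nothing is re-sent
    print(calls, sent)  # {'fetch': 1, 'decide': 1} ['3 open tickets']
```

One caveat with this sketch: if the process dies after a side effect runs but before its checkpoint is written, that step will run again on recovery. Steps with external side effects therefore usually also need to be idempotent, for example by attaching a stable key to each outgoing message.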
# The agent loop architecture in three layers
Three layers. Each one maps to a concrete primitive.
## Layer 1: The Loop
A loop is a cron plus a decision-maker. It runs on a schedule (or a trigger), evaluates state, and decides what to do next.
This is Van Horn's definition made concrete: what cron never had is the decision in the middle. The agent decides, not you. The cron is the heartbeat. The LLM is the decision-maker. Steps are the durable units of execution that checkpoint progress.
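As a rough illustration of "a cron plus a decision-maker," the sketch below separates the three roles. All names are hypothetical: `Loop`, `tick`, `rule_based_decide`, and the `process` / `idle` actions are invented for this example, and the decision-maker is a plain function standing in for an LLM call. A simple `for` loop plays the part of the scheduler, and durability is deliberately left out to keep the structure visible.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Loop:
    # The decision-maker: looks at state and names the next action.
    # In a real agent loop this would be an LLM call; here it is a stub.
    decide: Callable[[dict], str]
    # The available steps, keyed by the name the decision-maker returns.
    actions: Dict[str, Callable[[dict], dict]]
    # A record of every decision, so the loop can be inspected afterwards.
    log: List[str] = field(default_factory=list)

    def tick(self, state: dict) -> dict:
        """One heartbeat: evaluate state, decide, run the chosen step."""
        choice = self.decide(state)
        # An unknown choice falls back to leaving the state unchanged.
        action = self.actions.get(choice, lambda s: s)
        new_state = action(state)
        self.log.append(choice)
        return new_state


def rule_based_decide(state: dict) -> str:
    # Stand-in for the LLM: process work if there is any, otherwise idle.
    return "process" if state["queue"] else "idle"


def process(state: dict) -> dict:
    # Move the first queued item to "done" without mutating the input state.
    head, rest = state["queue"][0], state["queue"][1:]
    return {"queue": rest, "done": state["done"] + [head]}


ACTIONS = {"process": process, "idle": lambda s: s}


if __name__ == "__main__":
    loop = Loop(decide=rule_based_decide, actions=ACTIONS)
    state = {"queue": ["a", "b"], "done": []}
    # Each iteration stands in for one scheduled invocation (the cron heartbeat).
    for _ in range(3):
        state = loop.tick(state)
    print(loop.log)  # ['process', 'process', 'idle']
    print(state)     # {'queue': [], 'done': ['a', 'b']}
```

Keeping `decide` separate from `actions` is the design choice that matters here: the schedule only triggers `tick`, the decision-maker only picks a name, and each action is a discrete unit that a durable runtime could checkpoint on its own.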
