How to Build a Claude Code Agent That Learns From Each Loop (Exact Setup Inside)

Most looping agents have amnesia. Each cycle starts fresh, so they retry the same failed fix three times because nothing remembers it already didn't work.

A learning agent keeps a journal: after every attempt it writes down what it tried and whether it worked, then reads that journal before the next attempt.

Set it up once and the loop stops repeating itself. Each pass starts from what the last one learned instead of from zero.

Here's the full setup you need👇

Before we dive in, I share daily notes on AI & vibe coding in my Telegram channel: https://t.me/zodchixquant🧠

Why loops repeat themselves

A normal loop runs the same agent on the same problem again and again, but the agent's memory resets between cycles. It doesn't know cycle 2 already tried the thing it's about to try in cycle 4.

So it loops in circles, swapping the same library, reverting, swapping it back, burning tokens rediscovering dead ends. The loop runs, but it doesn't learn.

The fix is memory that survives between cycles: a journal the agent writes to and reads from every pass.

File 1: the journal

This is the memory. A plain file the agent appends to, never overwrites. Create .claude/loop-journal.md:

# Loop journal
Append-only. Each attempt: what was tried, the result, the lesson.

## Task: fix failing checkout test

### Attempt 1
Tried: added await to the fetchCart call.
Result: still failed, same error.
Lesson: the race isn't in fetchCart. Look upstream at the cart state.

### Attempt 2
Tried: memoized the cart selector.
Result: failed differently now, cart is undefined on first render.
Lesson: memoization changed timing. The real issue is initial state.

The journal is dead simple on purpose. Every entry is three lines: what was tried, what happened, and the lesson for next time.

The lesson line is the gold, it's what stops the next cycle from repeating the attempt.---

File 2: the learning loop

This is the orchestrator, and the one rule that makes it learn: read the journal first.

Drop into .claude/commands/learn-loop.md:

---
description: Run a task in a loop that learns from a journal each pass
argument-hint: <task>
allowed-tools: Read, Write, Edit, Glob, Grep, Bash
model: sonnet
---

Task: $ARGUMENTS

Each cycle:
1. Read .claude/loop-journal.md fully. Note what was already
   tried and what was learned. Never repeat a failed attempt.
2. Form a new hypothesis that the journal doesn't rule out.
3. Make the change. Run the check.
4. Append to the journal: what you tried, the result, the lesson.
5. Passed: stop, summarize what worked. Failed: go to step 1.
6. Cap at 6 cycles.

The rule: every cycle must try something the journal hasn't.
If you can't think of one, say so and stop. That's not failure,
that's the journal telling you to get a human.

Step 1 is the whole article. By forcing the agent to read its own history before acting, each cycle builds on the last instead of restarting. The loop accumulates knowledge instead of spinning.

File 3: enforce the write

An agent will skip journaling if it's optional, so make it automatic. This hook reminds the agent to log after every edit. Add to .claude/settings.json:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "test -f .claude/loop-journal.md && echo 'Reminder: log this attempt and its result in loop-journal.md before the next cycle.'" }
        ]
      }
    ]
  }
}

It's a nudge, not a wall, but it works. After each change the agent gets reminded to record what it just did, so the journal stays current. A journal written as you go is accurate.

One written from memory is fiction.

What you actually see

You type one line:

/learn-loop the checkout total is wrong when a coupon is applied

Then each cycle builds on the last:

Cycle 1
  journal: empty
  tried: recalculated total after coupon. failed, off by tax.
  logged: coupon applied before tax, should be after.

Cycle 2
  journal: cycle 1 says coupon goes after tax
  tried: moved coupon after tax. failed, now negative on big coupons.
  logged: no floor on discount. cap at item total.

Cycle 3
  journal: 2 lessons, both applied
  tried: capped discount at item total. PASSED.

done in 3 cycles, 0 repeats. journal saved for next time.

Notice cycle 2 didn't retry cycle 1's mistake, the journal ruled it out. And the journal survives, so next time a coupon bug shows up, the agent starts with everything it already learned.

Level up: memory that spans every task

The journal so far learns within one task. The bigger win is memory that carries across all of them, so the agent stops making the same class of mistake on every new feature.

Add a second, permanent file, .claude/lessons.md, for patterns worth keeping forever:

# Lessons (permanent)
Patterns the agent has learned the hard way. Read before any task.

## This codebase
- Dates are always UTC in the DB, local only in the UI layer.
- The auth middleware runs twice in tests, mock it once per file.
- Never trust the cached user object, it lags by one request.

## General reflexes
- A test that passes alone but fails in the suite = shared state.
- "undefined is not a function" after a refactor = a moved export.
- Money math: integer cents, never floats. Every time.

Then add one line to the loop command, near the top:

0. Read .claude/lessons.md before anything. These are
   permanent rules learned from past tasks. Apply them upfront.

Now when a loop finds something that isn't a one-off, the agent promotes it from the task journal to lessons.md.

The next task, even weeks later, starts with that baked in.

The journal is short-term memory, lessons.md is long-term: the difference between an agent that learns a task and one that learns your codebase.

How to write a lesson that actually helps

A vague lesson is worse than none, because it adds noise without preventing a repeat. The format decides whether the journal makes the next loop smarter or just longer.

Bad lesson:

Something was wrong with the dates.

Good lesson:

Tried: compared timestamps directly. Failed, off by timezone.
Lesson: DB stores UTC, the input was local. Convert before
comparing. Symptom to watch for: off-by-hours in date tests.

The good one names three things: what was tried, the actual cause, and the symptom that signals this problem next time.

That third part makes it reusable. When the agent sees "off-by-hours in a date test" three weeks later, it jumps straight to the timezone fix.

A lesson is only worth writing if a future cycle could pattern-match on it.

Common mistakes

Overwriting the journal. If each cycle replaces the file instead of appending, you lose the history and the loop forgets again. Append-only, always. The whole value is the accumulation.

Logging the attempt but not the lesson. "Tried X, failed" tells the next cycle what happened but not what to do differently. The lesson line is what prevents the repeat. Without it, the journal is just a diary, not a guide.

Letting the journal grow forever. After 30 attempts the journal itself bloats the context. Once a task is solved, archive its entries to a separate file and start the active journal clean.

Reading the journal at the end, not the start. Some agents log diligently but only check the journal when they're stuck. The read goes first, every cycle, or the memory does nothing.

Never promoting lessons to permanent. If everything stays in the per-task journal, the agent relearns your codebase's quirks on every new feature. Promote the patterns that repeat to lessons.md, or you're paying to learn the same thing twice.

The 12-minute setup

2 minutes: create .claude/loop-journal.md with the three-line entry format.

2 minutes: create .claude/lessons.md for permanent, cross-task patterns.

3 minutes: create the learning loop at .claude/commands/learn-loop.md, with the read-lessons-first step.

2 minutes: add the journaling reminder hook to settings.json.

3 minutes: run /learn-loop on a real bug and watch each cycle skip what the last one ruled out.

The agent didn't get smarter. It just stopped forgetting, and a loop that remembers beats a loop that repeats every single time.

Thanks for reading!

I share daily notes on AI, finance, and vibe coding in my Telegram channel: https://t.me/zodchixquant