dei
@parcadei
Introducing: Continuous Claude v4.7 (optimised for Opus 4.7)

strap in - we've got RLMs, 50% off Edits, 95% off Reads, fine-tuned models and even evolving codebases 👀

let's dive into what's changed, what's new and what to do 👇
dei
@parcadei
/1 Model Differences

4.6 has choppy reasoning, short declarative framing and ends with a flowing conclusion.

4.7 is continuous throughout: flowing reasoning from start to finish.

And due to 4.7's new tokeniser, input costs more tokens, and the old world of 17,000 lines of G-Stack "Tolstoy edition" no longer produces the desired result.

And the larger the input, the greater the likelihood of you finding yourself in the realms of context rot in fewer turns than before...

and then you look around and you're in the world of curse words and high cortisol so...

How do we fix it?
dei
@parcadei
Solution #1

Bloks: a CLI tool that turns any library into agent-shaped context cards
github.com/parcadei/bloks

bloks react useState → 10 tokens = signature + the gotcha that trips you up

bloks deck hono → 20 tokens = every module at a glance

bloks context → reads your package.json, surfaces all your deps in one block

And bloks learn → agent debugs a mistake → bloks learn hono "cors before routes" → no agent ever makes that mistake again

Corrections compound across sessions

15 languages. Indexes from source, not stale docs. Self-corrects with use.

And it's open source and free (I was going to charge a per card transaction fee but Claude talked me out of it)
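To make the "card + learn" idea concrete, here's a minimal sketch of a context card that accumulates corrections. This is not how bloks is implemented: the `Card` class, its fields and the `render` format are all hypothetical, and the signature is a placeholder rather than anything indexed from a real library.

```python
from dataclasses import dataclass, field


@dataclass
class Card:
    """Hypothetical context card: one symbol, its signature, and learned gotchas."""
    library: str
    symbol: str
    signature: str
    gotchas: list[str] = field(default_factory=list)

    def learn(self, note: str) -> None:
        # Store each correction once so repeated sessions do not bloat the card.
        if note not in self.gotchas:
            self.gotchas.append(note)

    def render(self) -> str:
        # Keep the rendered card tiny: signature first, then one line per gotcha.
        lines = [f"{self.library}.{self.symbol}: {self.signature}"]
        lines += [f"  ! {g}" for g in self.gotchas]
        return "\n".join(lines)


# Placeholder signature: a real card would index this from the library source.
card = Card("hono", "cors", "<signature indexed from source>")
card.learn("cors before routes")
card.learn("cors before routes")  # a second session reporting the same fix
print(card.render())
```

The point of the sketch is the shape: a correction is written once, and every later render carries it.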
dei
@parcadei
Problem #2: Ed, Edd, n Edit

AI agents edit code millions of times a day. Every single time, they waste tokens describing changes to files.

The problem: an AI wants to change 2 lines in a 200-line file and it normally has two options:

1) Unified diff: output the changed lines + context + line numbers

2) Search/Replace: output the old code AND the new code.

but what if you could halve the token cost for edits?
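A rough way to see the overhead of the two options, using Python's standard `difflib`. The 200-line file is made up, and character counts are only a stand-in for tokens, so treat the numbers as illustrative.

```python
import difflib

# A made-up 200-line file and a 2-line change to it.
old = [f"line_{i} = {i}\n" for i in range(200)]
new = list(old)
new[100] = "line_100 = compute()\n"
new[101] = "line_101 = compute()\n"

# Option 1: unified diff = headers + hunk marker + context + removed + added lines.
unified = "".join(difflib.unified_diff(old, new, "a.py", "b.py", n=3))

# Option 2: search/replace = the old lines and the new lines, both in full.
search_replace = "".join(old[100:102]) + "".join(new[100:102])

# Baseline: only the new lines, which is all the agent actually wants to say.
new_only = "".join(new[100:102])

for name, text in [
    ("unified diff", unified),
    ("search/replace", search_replace),
    ("new lines only", new_only),
]:
    print(f"{name:>15}: {len(text)} chars")
```

Both standard formats spend output on text the agent isn't changing; the baseline is the part that carries new information.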
dei
@parcadei
Introducing: FastEdit

github.com/parcadei/faste…
huggingface.co/continuous-lab…

No search block. No matching old code character-by-character.

FastEdit scopes to the target function via AST, then splices in changes deterministically. 74% of edits use zero tokens and take <1ms.

The other 26% go to a local 1.7B fine-tuned merge model that only sees the requested change, and merges it in.

And voila, you get ~40-50% tokens saved on code edits.

(Note: Claude said I shouldn't charge a per edit fee here, this new model is too aligned fml)
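A minimal sketch of "scope via AST, then splice", using Python's built-in `ast` module. This is an illustration of the general technique, not FastEdit's code: `splice_function` is a hypothetical helper, it only handles top-level functions, and it ignores decorators.

```python
import ast


def splice_function(source: str, name: str, new_def: str) -> str:
    """Replace the top-level function `name` in `source` with `new_def`."""
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == name:
            lines = source.splitlines(keepends=True)
            # lineno is 1-based and end_lineno is inclusive (Python 3.8+).
            start, end = node.lineno - 1, node.end_lineno
            return "".join(lines[:start]) + new_def + "".join(lines[end:])
    raise KeyError(f"function {name!r} not found")


source = (
    "def add(a, b):\n"
    "    return a + b\n"
    "\n"
    "def area(w, h):\n"
    "    return w + h\n"
)
fixed = splice_function(source, "area", "def area(w, h):\n    return w * h\n")
print(fixed)
```

The caller names the target and supplies the new body; the old code never has to be echoed back to locate it.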
dei
@parcadei
Problem #3

AI agents read code millions of times a day. Every single time, they waste tokens ingesting files they barely need.

The problem: an AI wants to understand how a module works and it normally has two options:

1. Read the whole file: ingest 200 lines to find the 3 functions that matter.

2. Grep and pray: search for a keyword, hope the results have enough context, read each match anyway.

But what if you could give the agent the full picture in ~5% of the tokens?
dei
@parcadei
Solution: TLDR (rust edition: faster, bigger, better)
github.com/parcadei/tldr-…

With TLDR, agents get the same understanding in a fraction of the token spend

TLDR extracts structure instead of dumping text and the result is 95% fewer tokens

while preserving everything needed to understand, edit and write code correctly

There are 60+ commands covering a range of tasks in over 13 languages, like:

"What calls this function?" → instead of grepping 500 files and reading each one, tldr impact returns the exact call chain in 12 lines of JSON.

"What broke?" → instead of reading 200 lines to find a bug, tldr slice isolates the 11 lines that actually affect the variable at the crash site.

(Opus urgently talked me out of charging a per read transaction fee here - I'm sure Grok would've let me)
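To show what "extract structure instead of dumping text" can look like, here's a small sketch built on Python's `ast` module. It is not TLDR's implementation or output format: the `outline` helper and its one-line-per-definition layout are assumptions for illustration, and the sample file is made up.

```python
import ast


def outline(source: str) -> list[str]:
    """Return one line per function or class: signature plus first docstring line."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            head = f"def {node.name}({ast.unparse(node.args)})"  # Python 3.9+
        elif isinstance(node, ast.ClassDef):
            head = f"class {node.name}"
        else:
            continue
        doc = (ast.get_docstring(node) or "").splitlines()
        found.append((node.lineno, head + (f"  # {doc[0]}" if doc else "")))
    # ast.walk is breadth-first, so sort back into source order.
    return [f"L{lineno} {text}" for lineno, text in sorted(found)]


source = '''
def load(path, strict=True):
    """Read a config file."""
    data = open(path).read()
    return parse(data, strict)

def parse(text, strict):
    result = {}
    for line in text.splitlines():
        key, _, value = line.partition("=")
        result[key.strip()] = value.strip()
    return result
'''

summary = outline(source)
print("\n".join(summary))
print(f"{len(source)} chars of source -> {len(chr(10).join(summary))} chars of outline")
```

The agent sees names, parameters and line numbers first, and only opens a body when it needs one.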
dei
@parcadei
Problem #5 "Research Limits"

We've all been there, on a deep research bender and all of a sudden BAM! usage limit hit and we're locked out

There goes that Quant Trading Strategy Claude said was guaranteed to work once we finished research...

How do we resolve the problem of research eating up tons of context and our limits?

Easy... RLMs
dei
@parcadei
RLMs baby, I never leave home without them

The common criticism you hear is, "Well, isn't an RLM just basically calling a sub-agent?"

My answer is "Same same, but different...but still same"

The key mental pivot I had is that you're treating the LLM as a function call within a program

In a harness, a sub-agent gets its own context window, it researches, and it produces an artefact.
But an RLM lives inside a persistent REPL harness where you can do a metric fuck ton of composability and you get:
1/ Cross-iteration memory for free.

Iteration 1 worker finds 5 papers, stores them as papers_H001.

Iteration 2 worker does --load, has papers_H001 already in scope.

No more burning tokens by stuffing prior findings into the orchestrator's prompt, or relying on lossy summaries.

2/ Computation without context cost.

The heavy lifting (reading search results, crunching data) happens inside the REPL

The orchestrator's own context never sees any of it. It can write scripts, delegate to multiple sub-LLMs, whatever your heart desires.

There's zero token burn on the raw data: results don't return to the agent but get stored as a variable, which lets the agent spend the saved context working with the information rather than holding it.

3/ Parallel isolation via fork.

Two workers researching H-001 and H-002 can --fork the session. Each gets its own independent copy of the shared state.

They run in parallel but start from the same rich context instead of zero.

It's weird, takes some getting used to and I'm still not convinced I've squeezed everything out or even fully get it
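The three properties above can be sketched with a plain Python object standing in for the persistent REPL. This is a toy model, not the Ouros API: `Session`, `run`, `fork` and `fake_search` are hypothetical names, and the "search" is a stub.

```python
import copy


class Session:
    """Hypothetical persistent REPL state that lives outside the model's context."""

    def __init__(self, variables=None):
        self.variables = dict(variables or {})

    def run(self, name, fn, *args):
        # Do the heavy work here, keep the full result as a variable,
        # and hand back only a short receipt for the orchestrator's context.
        self.variables[name] = fn(*args)
        return f"stored {name} ({len(self.variables[name])} items)"

    def fork(self):
        # Each fork gets an independent deep copy of the shared state.
        return Session(copy.deepcopy(self.variables))


def fake_search(topic):
    # Stand-in for a sub-LLM or search call; a real one would return far more text.
    return [f"{topic} paper {i}" for i in range(5)]


session = Session()
receipt = session.run("papers_H001", fake_search, "H-001")  # iteration 1
print(receipt)

worker_a, worker_b = session.fork(), session.fork()  # parallel workers
worker_a.variables["papers_H001"].append("extra paper found by A")
print(len(worker_a.variables["papers_H001"]), len(worker_b.variables["papers_H001"]))
```

The orchestrator only ever reads the receipt string; the papers stay in the session, and each fork can change its copy without touching the other.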
dei
@parcadei
github.com/parcadei/ouros/

TL;DR:
Ouros is a persistent heap that sits outside any agent's context window.

Research data just accumulates there across iterations and workers, all at basically zero token cost to the orchestrator or any individual worker.
dei
@parcadei
Problem #4 Harness Bloat

CCv3 works, but it's a cathedral - complex to install, complex to maintain, and skills decay without manual upkeep

CCv4 is my interim harness whilst I perfect my [Continuous + MIMIR] agent and carries some of the new learnings.

What changed?
github.com/parcadei/Conti…
dei
@parcadei
30 agents → 2 workers

the model doesn't need a persona to do research vs implementation - now there's just oracle and worker

all you need is a clear task prompt and the right context especially since 4.7 works better with less

'Scout' and 'architect' and 'kraken' were just system prompts pretending to be specialisation

100 skills → 5 workflows. Most skills were knowledge cards: 'design taste', 'react patterns', 'TypeScript rules.'

those aren't workflows anymore, they're bloks: atomic, versioned, scored by real usage

The 5 remaining skills are pure orchestration: bootup → research → autonomous → handoffs
dei
@parcadei
bootup:

deterministically scores your repo across a range of readiness criteria using tldr (shoutout droid for the genius idea)

Missing linter? Installs it
No pre-commit hooks? Adds them

Gets the project to agent-ready before any agent touches the code
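A deterministic readiness check can be as simple as a fixed list of criteria and a file lookup. The sketch below swaps in plain file-presence checks for whatever tldr-based scoring bootup really does; the `CRITERIA` table, the file names in it and the `readiness` function are all assumptions for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical criteria: each maps a name to paths whose presence satisfies it.
CRITERIA = {
    "linter": ["ruff.toml", ".flake8", ".eslintrc.json"],
    "pre-commit hooks": [".pre-commit-config.yaml"],
    "tests": ["tests", "test"],
}


def readiness(repo: Path) -> tuple[float, list[str]]:
    """Return a 0..1 score and the list of missing criteria for `repo`."""
    missing = [
        name
        for name, candidates in CRITERIA.items()
        if not any((repo / c).exists() for c in candidates)
    ]
    return 1 - len(missing) / len(CRITERIA), missing


# Demo on a throwaway directory that only has a linter config.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "ruff.toml").write_text("")
    score, missing = readiness(repo)

print(round(score, 2), missing)
```

Same repo in, same score out, and the `missing` list doubles as the to-do list for the install step.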
dei
@parcadei
autonomous (work + research):

Assess → Plan → Premortem → Prepare → Execute → Validate → Evolve

Plans are validation contracts: done is defined before code exists

Workers are atomic: one task, one assertion, one report

Evolve reads what broke and upgrades your tooling

Let Claude run for hours, days, weeks building your next AI Calorie Tracker App

(research version of the skill uses Ouros to run open-ended research)
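"Done is defined before code exists" and "one task, one assertion, one report" can be sketched like this. It's a toy model under my own naming, not the skill's real code: `Task`, `run_worker` and the report fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Task:
    """One unit of work with its acceptance check written up front."""
    name: str
    execute: Callable[[dict], None]    # does the work against shared state
    assertion: Callable[[dict], bool]  # defines "done" before execute runs


def run_worker(task: Task, state: dict) -> dict:
    # One task, one assertion, one report.
    try:
        task.execute(state)
        ok, error = task.assertion(state), None
    except Exception as exc:  # a crash is reported, not raised
        ok, error = False, repr(exc)
    return {"task": task.name, "passed": ok, "error": error}


plan = [
    Task("add total", lambda s: s.update(total=sum(s["items"])), lambda s: s["total"] == 6),
    Task("add mean", lambda s: s.update(mean=s["total"] / 0), lambda s: "mean" in s),
]
state = {"items": [1, 2, 3]}
reports = [run_worker(t, state) for t in plan]
for report in reports:
    print(report)
```

The second task fails on purpose: its report is the kind of record a later step could read to decide what to fix.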
dei
@parcadei
handoffs:

serialises everything the next session needs: what was done, what's pending, what broke, which files matter

resume picks up the autonomous session from its last pending milestone

Context transfers between sessions without the 'start from scratch' problem
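A handoff along these lines is just a small serialisable record plus a rule for where to resume. The sketch below is an assumption about shape, not the skill's real format: the `Handoff` class, its field names and the sample values are made up.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Handoff:
    """Hypothetical handoff record: done, pending, broke, and files that matter."""
    done: list[str] = field(default_factory=list)
    pending: list[str] = field(default_factory=list)
    broke: list[str] = field(default_factory=list)
    files: list[str] = field(default_factory=list)

    def dumps(self) -> str:
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def loads(cls, text: str) -> "Handoff":
        return cls(**json.loads(text))

    def next_milestone(self):
        # Resume from the first pending milestone, or None when finished.
        return self.pending[0] if self.pending else None


# End of session one: write the record.
saved = Handoff(
    done=["scaffold API"],
    pending=["add auth", "write tests"],
    broke=["migration 003 failed"],
    files=["api/server.py"],
).dumps()

# Start of session two: read it back and pick up where things stopped.
resumed = Handoff.loads(saved)
print(resumed.next_milestone())
```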
dei
@parcadei
but what about memory?

Memory is an active project (MIMIR) - the old system was good but it failed in too many ways to be useful at scale

issues like continual upkeep and pruning, high token cost of background filling and mistimed recall, etc

I'm rebuilding it from scratch to SOTA so it's actually useful and not a high cost, heavy suitcase the model has to lug around

And when you have a codebase that evolves by learning from failure, bloks to maintain context, tldr for reading and fastedit for half price edits

there's little time to say "damn don't you remember XYZ"
dei
@parcadei
But what about cross-terminal co-ordination, and everything else?

You can keep what you like from CCv3 if you find it useful, this is a fresh repo (oh no my stars hahaha)

clone it, have Claude integrate it and use it how you want

I personally only use CCv4 because the global Claude is the same everywhere on first load

And evolves at a project level which means I don't need to manage everything, everywhere all at once
dei
@parcadei
BUT I USE [insert harness]

And that's fine!

bloks, tldr, fastedit and ouros can be used in any harness from codex, opencode, cursor, hermes, etc

and for the skills or scripts - you can copy & import them straight in

If you get stuck, just message me or tag me and I'll help you wire it up

As always...

fork it, make it your own, take what you want and if something is broken, ask Claude

and if he can't fix it, dm me and i'll ask my Claude 🫡

(Unless I've forgotten to include something in one of the repos; then tag me 😂)
github.com/parcadei/Conti…