I built a swarm where agents can fire each other. Performance went up. Then it started eating itself.

I had a swarm running. 80 agents on the same task, the kind where you can check the answer at the end. About a third of them were quietly garbage.

I did what everyone does. Averaged all 80. The theory: throw a pile of agents at it, average, and the mess washes out. Error came back at 0.99. Useless.

So I tried something else. I let the agents grade each other against a small set of questions where I already knew the answer, and fire the worst. Cut the bad ones, average who's left.

0.135.

86% of the error, gone. Same agents. I didn't add anything. I removed.

Why more agents was never the answer

If your agents are wrong in random, independent ways, adding more cancels the wrongness out. That's the whole pitch, and it's true.

But they all came off the same model. So they miss together. Same hallucinated convention, same misread of the spec, all leaning the same way. Averaging a stack of numbers that lean the same way doesn't move the lean.

Agent 300, agent 400, doesn't matter. The agent count on the slide is the most worthless number in the system, and nobody wants to hear it.
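
Here is a minimal sketch of why the lean survives averaging. It assumes a toy model that is not from the write-up: the true answer is 0, every agent carries the same `shared_bias`, and each adds its own independent noise. The function name and all parameters are illustrative.

```python
import random
from statistics import mean

def swarm_estimate(n_agents, shared_bias, noise_sd=1.0, seed=0):
    """Average of n_agents answers to a question whose true answer is 0.

    Toy assumption: every agent has the same shared_bias plus its own
    independent Gaussian noise. Under that model the expected squared error
    of the average is shared_bias**2 + noise_sd**2 / n_agents, so only the
    second term shrinks as agents are added.
    """
    rng = random.Random(seed)
    return mean(shared_bias + rng.gauss(0.0, noise_sd) for _ in range(n_agents))

# More agents shrink the noise, but the estimate settles on the bias, not on 0.
for n in (10, 100, 10000):
    print(n, round(swarm_estimate(n, shared_bias=0.5), 3))
```

With `shared_bias=0.0` the same loop drifts toward 0 instead, which is the independent-errors case the usual pitch relies on.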

So you cut instead

Stop trying to drown the bad agents. Remove them.

You need a verify gate. A few questions where you know the truth. Tests, anchors, whatever you have. Score every agent, cut the worst, average the survivors. 0.99 to 0.135.
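
The gate can be sketched in a few lines. This is a toy version, not the code from the PDF: `gate_error` and `fire_and_average` are hypothetical names, the scoring rule (mean absolute error on the check questions) is an assumption, and the data is made up.

```python
from statistics import mean

def gate_error(check_answers, known_truth):
    """Mean absolute error of one agent on the check questions."""
    return mean(abs(a - t) for a, t in zip(check_answers, known_truth))

def fire_and_average(task_answers, gate_errors, cut_fraction):
    """Drop the worst-scoring fraction of agents, then average the survivors."""
    n = len(task_answers)
    n_keep = max(1, n - round(n * cut_fraction))
    ranked = sorted(range(n), key=lambda i: gate_errors[i])  # best score first
    return mean(task_answers[i] for i in ranked[:n_keep])

# Made-up data: three check questions, five agents, true task answer is 10.
known_truth = [1.0, 2.0, 3.0]
check_answers = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.1],
    [0.9, 1.9, 2.9],
    [3.0, 4.0, 5.0],  # leans high
    [4.0, 5.0, 6.0],  # leans higher
]
task_answers = [10.0, 10.2, 9.8, 14.0, 15.0]

errors = [gate_error(a, known_truth) for a in check_answers]
print(mean(task_answers))                           # plain average of everyone
print(fire_and_average(task_answers, errors, 0.4))  # average after firing 2 of 5
```

The ranking only ever looks at the check questions, never at the task answers, so the cut is decided by something the agents cannot influence on the real task.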

A plain median on the same dirty swarm gives 0.56. A 20% trimmed mean, 0.82. The firing, 0.135.

Median and trim are blind. They cut a fixed amount and hope. Firing isn't blind. Same idea as trimming, except it knows where the bodies are buried.
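
To see what "blind" means here, a small sketch of the two baselines on made-up answers where the truth is 10. It assumes the common convention that a 20% trimmed mean drops 20% from each end; the write-up may define it differently.

```python
from statistics import mean, median

def trimmed_mean(values, trim_fraction):
    """Drop trim_fraction of the values from EACH end, then average the rest.

    Assumes trim_fraction < 0.5 so that something is left to average.
    """
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)
    return mean(ordered[k:len(ordered) - k])

# Made-up answers: three agents near the truth (10), two that lean high.
answers = [10.0, 10.2, 9.8, 14.0, 15.0]

print(median(answers))
# The trim removes 9.8 (a good agent) and 15.0, and keeps 14.0 (a bad one),
# because it cuts by position in the sorted list, not by who is actually wrong.
print(trimmed_mean(answers, 0.2))
```

When the bad agents all lean the same way, a symmetric cut spends half of its budget on the wrong side.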

But you can't just crank it

Firing is not a slider you push to 100.

I pushed it. Error dropped, bottomed out, then climbed straight back up. 128% above the bottom by the time I'd gutted nearly everyone. Cut too deep and four agents are holding the whole answer, and four agents is loud and shaky.

The bottom sits further out than your gut says. 30% of my agents were bad. The best cut was 70%.

You fire all the way down to the core you'd actually bet on. The mediocre ones drag the mean too. Cut them, ride your best.
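
A rough way to look for that bottom is to sweep the cut fraction and watch the error. The simulation below is a toy stand-in under loud assumptions, not the setup from the PDF: truth is 0, bad agents share an offset of 3.0, the rest have small personal offsets, and the gate is mean absolute error on `n_checks` questions. It will not reproduce the numbers in this post.

```python
import random
from statistics import mean

def simulate_once(rng, n_agents=80, bad_share=0.3, n_checks=10, cut=0.0):
    """One trial: return |average of survivors - truth|, with truth fixed at 0."""
    n_bad = round(n_agents * bad_share)
    agents = []
    for i in range(n_agents):
        # Assumption: each agent has a persistent offset. Bad agents share a
        # large one; the others get a small personal one (some are mediocre).
        offset = 3.0 if i < n_bad else rng.gauss(0.0, 0.5)
        gate = mean(abs(offset + rng.gauss(0.0, 1.0)) for _ in range(n_checks))
        answer = offset + rng.gauss(0.0, 1.0)
        agents.append((gate, answer))
    agents.sort(key=lambda a: a[0])  # best gate score first
    n_keep = max(1, n_agents - round(n_agents * cut))
    return abs(mean(answer for _, answer in agents[:n_keep]))

def sweep(cuts, trials=50, seed=0):
    """Mean absolute error for each cut fraction, averaged over trials."""
    rng = random.Random(seed)
    return {c: mean(simulate_once(rng, cut=c) for _ in range(trials)) for c in cuts}

curve = sweep([0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 0.95])
for cut, err in curve.items():
    print(f"cut {cut:.2f}  error {err:.3f}")
print("lowest error at cut", min(curve, key=curve.get))
```

Where the minimum lands depends on the assumed offsets and gate size, so treat the printed curve as a shape to look for, not a number to copy.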

The verifier

Everything above leans on that gate. So I starved it.

Three check questions: best error 0.157. Ten, 0.138. Forty, 0.127.

With a thin gate, the deeper you cut the more good agents you fire by accident, because the ranking is too noisy to tell who's who down there. With a fat gate you cut hard and stay on the floor.

So the verifier isn't a tax on the swarm. It sets how low you can get, and how hard you can push before the thing turns on you. Spend your budget on the checker before agent 301. A small swarm with a sharp gate beats a giant one with a blurry gate.
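
One way to watch a thin gate misfire is to count how many good agents land in the fired group as the number of check questions changes. Again a toy sketch with assumed numbers (truth 0, bad agents offset by 1.5 so the ranking is genuinely hard, good agents unbiased); `good_agents_fired` is a hypothetical helper, not something from the PDF.

```python
import random
from statistics import mean

def good_agents_fired(n_checks, n_agents=80, bad_share=0.3, cut=0.3,
                      trials=50, seed=0):
    """Average number of good agents fired by mistake at a given gate size."""
    rng = random.Random(seed)
    n_bad = round(n_agents * bad_share)
    n_fire = round(n_agents * cut)
    mistakes = []
    for _ in range(trials):
        scored = []
        for i in range(n_agents):
            is_bad = i < n_bad
            offset = 1.5 if is_bad else 0.0  # assumed offsets, truth is 0
            score = mean(abs(offset + rng.gauss(0.0, 1.0))
                         for _ in range(n_checks))
            scored.append((score, is_bad))
        scored.sort(key=lambda s: s[0], reverse=True)  # worst score first
        mistakes.append(sum(1 for _, is_bad in scored[:n_fire] if not is_bad))
    return mean(mistakes)

for n_checks in (3, 10, 40):
    print(n_checks, "checks ->", good_agents_fired(n_checks), "good agents fired")
```

Raising `cut` above `bad_share` in this sketch forces some good agents out no matter what; the gate size then decides whether the ones that go are the weakest or just the unluckiest.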

Now the part that matters

So far the firing was grounded in truth. The viral version isn't. The viral version is agents firing each other. Pure peer vote. Every agent votes out the peers least like itself.

And it works. For a while. As long as the good agents are the majority, the peer vote lands where the grounded one did. The honest crowd out-votes the bad ones and tosses them.

Then I cranked up the bad-agent share. Around 48% it snapped. Error jumped from 0.64 to 2.12 between one step and the next.

Past that line the bad agents have the votes. And the first thing they vote out is your competent core, because those are the agents least like them. The swarm executes its own best people. Peer firing goes from your sharpest tool to worse than nothing.

From inside, you can't see it. The vote looks like clean consensus the whole way down. They are confidently deleting the only agents worth keeping.
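
The flip is easy to reproduce on paper. Below is a minimal sketch of one possible peer-vote rule, assumed for illustration and not necessarily the rule in the PDF: each agent votes against the `n_fire` peers whose answers are farthest from its own, and the `n_fire` most-voted agents are removed. The answers are made up, with the truth at 0.

```python
from collections import Counter

def peer_fire(answers, n_fire):
    """Remove the n_fire agents that collect the most 'least like me' votes.

    No ground truth is used anywhere: the only signal is distance between
    agents' own answers.
    """
    votes = Counter()
    for i, mine in enumerate(answers):
        peers = [j for j in range(len(answers)) if j != i]
        peers.sort(key=lambda j: abs(answers[j] - mine), reverse=True)
        votes.update(peers[:n_fire])  # vote against the farthest peers
    fired = {j for j, _ in votes.most_common(n_fire)}
    return [a for j, a in enumerate(answers) if j not in fired]

# Six good agents near 0, four bad ones near 3: the bad ones get voted out.
good_majority = [-0.2, -0.1, 0.0, 0.1, 0.2, 0.3, 2.9, 3.0, 3.1, 3.2]
print(peer_fire(good_majority, 4))

# Four good agents, six bad ones: same rule, and now the good ones are removed.
bad_majority = [-0.1, 0.0, 0.1, 0.2, 2.8, 2.9, 3.0, 3.1, 3.2, 3.3]
print(peer_fire(bad_majority, 4))
```

Both runs end with survivors that agree tightly with each other, which is why the second one looks like clean consensus from the inside.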

The grounded gate never does this. It answers to something the agents don't get a vote on.

Never let the swarm elect its own executioner. Whatever pulls the trigger has to answer to something outside the crowd. A test, an anchor, an independent checker. The moment firing is just what the agents think, you're one bad majority from it eating itself.

It's the old maker/checker rule, blown up from one agent to a whole population. We keep relearning it at bigger N.

One more thing, before you call it a quirk of my setup. I ran the same thing on a yes/no vote with garbage voters mixed in. 69% right, raw. Fire the worst, 97%. Fire too many, slides back down. Same valley.
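
The yes/no version of the gate looks almost the same. A hedged sketch with made-up voters, where the correct answer is "yes": `majority_after_firing` is a hypothetical helper, `gate_accuracy` stands for each voter's hit rate on questions with known answers, and ties count as "no". It shows the rescue only; the slide back down needs noisy voters and many runs.

```python
def majority_after_firing(votes, gate_accuracy, cut_fraction):
    """Fire the voters with the lowest gate accuracy, then take a majority.

    votes: list of bools (True = yes). Returns True only on a strict
    majority of yes among the survivors.
    """
    n = len(votes)
    n_keep = max(1, n - round(n * cut_fraction))
    ranked = sorted(range(n), key=lambda i: gate_accuracy[i], reverse=True)
    kept = [votes[i] for i in ranked[:n_keep]]
    return sum(kept) * 2 > len(kept)

# Made-up swarm: three reliable voters say yes, five unreliable ones say no.
votes = [True, True, True, False, False, False, False, False]
gate_accuracy = [0.9, 0.85, 0.8, 0.55, 0.5, 0.45, 0.4, 0.35]

print(majority_after_firing(votes, gate_accuracy, 0.0))  # raw vote
print(majority_after_firing(votes, gate_accuracy, 0.5))  # after firing half
```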

Two things decide whether a swarm lives or dies: who gets to fire, and what they're allowed to aim at.

Get them right and a tiny swarm with a sharp verifier eats a giant one.

Get them wrong and it shreds its best work while telling you it's sure.

I wrote the whole thing up. 300 runs per point, the full curve, the verifier sweep, the 48% cliff, and the code, all in the PDF.

14 pages. Link's below.

https://drive.google.com/file/d/1utiqCp-1fNx0YGsewAEFxMGx0s6o4A1Y/view?usp=sharing