@alex_prompter: Everyone says “LLMs are black ...
@alex_prompter
Oct 29, 2025
Everyone says “LLMs are black boxes.”
This paper, “How Do LLMs Use Their Depth?”, just opened one and showed how predictions form layer by layer.
They follow a “Guess → Refine” strategy:
• Early layers make statistical guesses using frequent tokens (“the”, “of”, “and”)
• Middle layers pull in context to test those guesses
• Later layers refine them into precise, context-aware predictions
Across GPT-2-XL, Llama-2-7B, Llama-3-8B, and Pythia-6.9B, ~80% of early guesses get replaced before the final layer.
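To make the "replaced guesses" number concrete, here is a minimal sketch of how such a rate could be counted. It assumes you already have, for each token position, the top-1 prediction decoded at every layer. The function name and the toy traces are made up for illustration; they are not the paper's code or data, and the paper's exact metric may differ.

```python
def replaced_fraction(per_layer_top1, early_layer=0):
    """Fraction of positions whose early-layer top-1 guess differs
    from the final-layer top-1 prediction."""
    if not per_layer_top1:
        raise ValueError("need at least one position")
    replaced = sum(
        1 for layers in per_layer_top1 if layers[early_layer] != layers[-1]
    )
    return replaced / len(per_layer_top1)


# Invented traces (hypothetical): one list per position, one entry per layer.
toy_traces = [
    ["the", "the", "Paris", "Paris"],
    ["of", "of", "of", "of"],        # the early guess survives here
    ["and", "the", "blue", "blue"],
    ["the", "a", "a", "seven"],
    ["the", "and", "she", "she"],
]

fraction = replaced_fraction(toy_traces)
print(fraction)  # 4 of the 5 toy positions change their guess
```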
Even cooler: models use depth dynamically. Easy tasks (like punctuation or determiners) finish early, while hard ones (like fact recall or reasoning) go deeper.
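One rough way to picture "finishing early" is the first layer at which the top-1 guess matches the final prediction and stays there. The sketch below assumes per-layer top-1 tokens are already available; `settle_depth` and the example traces are hypothetical, not taken from the paper.

```python
def settle_depth(layer_top1):
    """Earliest layer index from which the top-1 guess equals the final
    prediction and never changes again."""
    final = layer_top1[-1]
    depth = len(layer_top1) - 1
    # Walk backwards while the preceding layer already agrees with the final answer.
    while depth > 0 and layer_top1[depth - 1] == final:
        depth -= 1
    return depth


# Invented traces (hypothetical), one per token category.
toy = {
    "punctuation": [",", ",", ",", ","],
    "determiner": ["of", "the", "the", "the"],
    "fact recall": ["the", "the", "of", "Paris"],
}

depths = {category: settle_depth(trace) for category, trace in toy.items()}
print(depths)
```

In this toy setup a lower number means the prediction settled at a shallower layer.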
In short:
LLMs aren’t just deep networks. They’re layered thinkers: early guessers, late reasoners.
Paper: arxiv.org/abs/2510.18871
LLMs don’t think in one pass.
They guess, test, refine, and decide across their depth.
Each layer isn’t just computation; it’s a thought step.
We’re literally watching models reason in slow motion.
github.com/akshat57/how-d…







