Visualize Thread by @rohanpaul_ai

Rohan Paul

@rohanpaul_ai

Now the 3rd paper comes on this 🤯

"The Illusion of the Illusion of the Illusion of Thinking"

📌 1st paper (the original, from Apple) concludes that large reasoning models hit a complexity threshold beyond which accuracy collapses to zero and the models even spend fewer thinking tokens, which it takes as evidence of hard limits on generalizable reasoning.

📌 2nd paper counters that the apparent collapse is an illusion caused by token limits and impossible puzzles, so the models’ reasoning remains sound once evaluations remove those flaws.

📌 3rd paper synthesizes both sides, agreeing the collapse was an artifact yet stressing that models still falter in very long step-by-step executions, exposing lingering brittleness despite better methodology.

The third paper's author shows that, even after fixing the test design and giving the models enough output space, they still start to lose track of a long step-by-step plan once it stretches into thousands of steps, so a real weakness remains in sustaining very long chains of reasoning.

Read on 👇

Rohan Paul

@rohanpaul_ai

🔎 Agreements

The 3rd paper endorses 3 key fixes raised by the 2nd paper:

Unsolvable River Crossing cases, those with more than 5 actors and a boat of capacity 3, should never have been graded.
Token budgets cap Tower of Hanoi output long before the logic fails.
Calling exponential move count “complexity” confuses output length with search difficulty.
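
To make the last two points concrete, here is a minimal Python sketch. It is not taken from any of the three papers. The figures `TOKENS_PER_MOVE` and `TOKEN_BUDGET` are illustrative assumptions, and `max_disks_within_budget` is a hypothetical helper. The point is that the solving rule stays the same three-line recursion at every size, while the number of moves to print grows as 2^n - 1 and eventually outruns any fixed output budget.

```python
def hanoi_moves(n, src="A", aux="B", dst="C"):
    """Yield the optimal Tower of Hanoi moves as (disk, from_peg, to_peg)."""
    if n == 0:
        return
    # Move the n-1 smaller disks out of the way, move disk n, then restack.
    yield from hanoi_moves(n - 1, src, dst, aux)
    yield (n, src, dst)
    yield from hanoi_moves(n - 1, aux, src, dst)


def move_count(n):
    """Length of the optimal solution for n disks: 2^n - 1 moves."""
    return 2**n - 1


# Assumed numbers for illustration only; real tokenizers and limits differ.
TOKENS_PER_MOVE = 10
TOKEN_BUDGET = 64_000


def max_disks_within_budget(budget=TOKEN_BUDGET, tokens_per_move=TOKENS_PER_MOVE):
    """Largest n whose full move list fits in the assumed output budget."""
    n = 0
    while move_count(n + 1) * tokens_per_move <= budget:
        n += 1
    return n


if __name__ == "__main__":
    for n in (5, 10, 15, 20):
        print(n, "disks:", move_count(n), "moves,",
              move_count(n) * TOKENS_PER_MOVE, "tokens (assumed rate)")
    print("largest n that fits the assumed budget:", max_disks_within_budget())
```

Under these assumed numbers the full move list stops fitting somewhere in the low teens of disks, even though the procedure that generates it never gets harder. That is the sense in which output length and search difficulty are different things.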

Rohan Paul

@rohanpaul_ai

3rd paper - drive.google.com/file/d/1imWKj_…
