elvis

@omarsar0

Hierarchical Reasoning Model

This is one of the most interesting ideas on reasoning I've read in the past couple of months. It uses a recurrent architecture for impressive hierarchical reasoning. Here are my notes:

The paper proposes a novel, brain-inspired architecture that replaces CoT prompting with a recurrent model designed for deep, latent computation.

It moves away from token-level reasoning by using two coupled modules: a slow, high-level planner and a fast, low-level executor. The two recurrent networks operate at different timescales to collaboratively solve tasks. This leads to greater reasoning depth and efficiency with only 27M parameters and no pretraining!
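
A minimal sketch of the two-timescale idea, assuming scalar states and made-up weights (the real modules are recurrent networks; `run_two_timescale`, `n_cycles`, and `t_steps` are hypothetical names, not from the paper). The fast module takes several steps per cycle, and the slow module updates once per cycle from the fast module's final state.

```python
import math


def run_two_timescale(x, n_cycles=3, t_steps=4):
    """Toy two-timescale recurrence with scalar states (illustrative only)."""
    z_high, z_low = 0.0, 0.0
    low_updates = 0
    high_updates = 0
    for _ in range(n_cycles):
        # Fast, low-level module: several steps per cycle,
        # conditioned on the current (fixed) high-level state.
        for _ in range(t_steps):
            z_low = math.tanh(0.5 * z_low + 0.3 * z_high + x)
            low_updates += 1
        # Slow, high-level module: one update per cycle,
        # reading the low-level module's final state.
        z_high = math.tanh(0.8 * z_high + 0.6 * z_low)
        high_updates += 1
    return z_high, low_updates, high_updates


z_high, low_updates, high_updates = run_two_timescale(0.2)
print(low_updates, high_updates)
```

With 3 cycles of 4 steps each, the low-level state is updated 12 times and the high-level state 3 times.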

Despite its small size and minimal training data (~1k examples), HRM solves complex tasks like ARC, Sudoku-Extreme, and 30×30 maze navigation, where CoT-based LLMs fail.

HRM introduces hierarchical convergence, where the low-level module rapidly converges within each cycle, and the high-level module updates only after this local equilibrium is reached. This enables nested computation and avoids premature convergence typical of standard RNNs.

A 1-step gradient approximation sidesteps memory-intensive backpropagation-through-time (BPTT). This enables efficient training using only local gradient updates, grounded in deep equilibrium models.
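
To make the idea concrete, here is a toy sketch on a scalar fixed-point map `z = tanh(w * z + x)`. This is an assumption for illustration, not the paper's model, and all function names are hypothetical. The 1-step estimate differentiates only the last update and treats the incoming state as a constant, while the exact gradient comes from the implicit function theorem used in deep equilibrium models.

```python
import math


def fixed_point(w, x, iters=200):
    """Iterate z <- tanh(w*z + x) to equilibrium (a contraction when |w| < 1)."""
    z = 0.0
    for _ in range(iters):
        z = math.tanh(w * z + x)
    return z


def one_step_grad(w, x):
    """Approximate dz*/dw using only the final update, no stored history."""
    z = fixed_point(w, x)
    # tanh'(u) = 1 - tanh(u)^2, and tanh(u) equals z at the fixed point.
    s = 1.0 - z * z
    return s * z  # df/dw with the incoming z held constant


def implicit_grad(w, x):
    """Exact dz*/dw from the implicit function theorem."""
    z = fixed_point(w, x)
    s = 1.0 - z * z
    return (s * z) / (1.0 - s * w)  # (df/dw) / (1 - df/dz)


print(one_step_grad(0.3, 0.5), implicit_grad(0.3, 0.5))
```

The exact gradient divides by `1 - df/dz`; expanding that factor as a geometric series and keeping only the leading term gives the 1-step estimate, which is why no unrolled history of states is needed.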

HRM implements adaptive computation time using a Q-learning-based halting mechanism, dynamically allocating compute based on task complexity. This allows the model to “think fast or slow” and scale at inference time without retraining.
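
A rough sketch of the halting loop at inference time, assuming a halting head that returns two values, one for "halt" and one for "continue" (the Q-learning training of that head is omitted; `run_with_halting`, `step_fn`, `q_fn`, and the toy functions below are hypothetical):

```python
def run_with_halting(state, step_fn, q_fn, max_segments=8):
    """Run compute segments until the halting head prefers 'halt'
    or the segment budget is exhausted."""
    used = 0
    for used in range(1, max_segments + 1):
        state = step_fn(state)
        q_halt, q_continue = q_fn(state)
        if q_halt > q_continue:
            break
    return state, used


# Toy stand-ins: the state is a residual error that halves per segment,
# and the head prefers halting once the error drops below 0.5.
def halve(err):
    return err / 2.0


def toy_q(err):
    return 1.0 - err, err


print(run_with_halting(0.6, halve, toy_q)[1])   # easy input
print(run_with_halting(8.0, halve, toy_q)[1])   # harder input
```

In this toy setup the easy input stops after 1 segment and the harder one after 5. Raising `max_segments` at inference gives hard inputs more compute without changing any weights.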

Experiments on ARC-AGI, Sudoku-Extreme, and Maze-Hard show that HRM significantly outperforms larger models using CoT or direct prediction, even solving problems that other models fail on entirely (e.g., 74.5% on Maze-Hard vs. 0% for others).

Analysis reveals that HRM learns a dimensionality hierarchy similar to the cortex: the high-level module operates in a higher-dimensional space than the low-level one (participation ratio, PR: 89.95 vs. 30.22). The authors suggest that this is an emergent trait not present in untrained models.

Paper: https://arxiv.org/abs/2506.21734
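
For readers unfamiliar with the PR figures quoted above: assuming PR is the standard participation ratio, it measures effective dimensionality as the squared sum of the covariance eigenvalues divided by the sum of their squares. The sketch below is plain Python and avoids an eigendecomposition by using the trace of the covariance matrix and the sum of its squared entries, which equal those two quantities for a symmetric matrix. The function name and sample data are hypothetical.

```python
def participation_ratio(samples):
    """PR = (sum of eigenvalues)^2 / (sum of squared eigenvalues)
    of the sample covariance matrix of `samples` (rows = observations)."""
    n = len(samples)
    d = len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in samples]
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    trace = sum(cov[i][i] for i in range(d))        # sum of eigenvalues
    frob_sq = sum(c * c for row in cov for c in row)  # sum of squared eigenvalues
    return trace * trace / frob_sq


spread_out = [(1, 0), (-1, 0), (0, 1), (0, -1)]   # variance in two directions
on_a_line = [(1, 1), (-1, -1), (2, 2), (-2, -2)]  # variance in one direction
print(participation_ratio(spread_out), participation_ratio(on_a_line))
```

Activity spread evenly over two directions gives a PR of 2, while activity confined to a line gives 1, so a higher PR means the states occupy more effective dimensions.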