ARC Prize
@arcprize
Announcing ARC-AGI-3

The only unsaturated agentic intelligence benchmark in the world

Humans score 100%, AI <1%

This human-AI gap demonstrates we do not yet have AGI

Most benchmarks test what models already know; ARC-AGI-3 tests how they learn
ARC Prize
@arcprize
We build benchmarks that reveal the gap between what's easy for humans and hard for AI

ARC-AGI has repeatedly identified inflection points in AI progress, including the emergence of reasoning systems and the rise of capable AI agents.

ARC-AGI-3 is the next step in that journey
ARC Prize
@arcprize
We created an in-house game studio and built 135 novel environments from scratch

No instructions, Core Knowledge Priors-only

In order to beat these games, AI must:
• Explore the environment
• Form hypotheses
• Execute a plan
• Learn and adapt
ARC Prize
@arcprize
ARC-AGI-3 is a useful research tool to analyze model behavior

Key failure modes seen in our early testing:
• Thinking it is playing another game
• Holding on to an early hypothesis
• Being unable to forecast into the future

Both AI and human runs have shareable replays

Watch Gemini 3.1 do well on some games, poorly on others:
arcprize.org/replay/34a9614…
arcprize.org/replay/d0e0768…
ARC Prize
@arcprize
ARC-AGI-3 gives us a formal measure to compare human and AI skill acquisition efficiency

Humans don’t brute force - they build mental models, test ideas, and refine quickly

How close is AI to that? (Spoiler: not close)
ARC Prize
@arcprize
Also live today: ARC Prize 2026 - 3 tracks, $2,000,000 in prizes available!

Get involved:
• Play a Game: arcprize.org/tasks/ls20
• Build Agents: docs.arcprize.org
• Win Prizes: arcprize.org/competitions/2…