@arcprize: Announcing ARC-AGI-3The only...

1

Announcing ARC-AGI-3

The only unsaturated agentic intelligence benchmark in the world

Humans score 100%, AI <1%

This human-AI gap demonstrates we do not yet have AGI

Most benchmarks test what models already know; ARC-AGI-3 tests how they learn

2

We build benchmarks that reveal the gap between what's easy for humans but hard for AI

ARC-AGI has repeatedly identified inflection points in AI progress, including the emergence of reasoning systems and the rise of capable AI agents.

ARC-AGI-3 is the next step in that journey

3

We created an in-house game studio and built 135 novel environments from scratch

No instructions, Core Knowledge Priors-only

In order to beat these games, AI must:
• Explore the environment
• Form hypotheses
• Execute a plan
• Learn and adapt
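
The four steps above can be sketched as a simple agent loop. This is a toy illustration only, not the ARC-AGI-3 API: `ToyEnv`, `run_agent`, the action labels, and the visible goal position are all hypothetical names and simplifications. The agent is not told what each action does; it tries each one, records the observed effect as a hypothesis, plans with those hypotheses, and overwrites a hypothesis whenever a new observation disagrees with it.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEnv:
    """Hypothetical 1-D corridor; the effect of each action is hidden from the agent."""
    size: int = 8
    pos: int = 2
    goal: int = 6  # assumption: the goal is visible, to keep the sketch short
    _effects: dict = field(default_factory=lambda: {"A": -1, "B": 1, "C": 2})

    actions = ("A", "B", "C")  # opaque labels, no instructions attached

    def step(self, action):
        # Move, clamped to the corridor, and report whether the goal was reached.
        self.pos = max(0, min(self.size - 1, self.pos + self._effects[action]))
        return self.pos, self.pos == self.goal


def run_agent(env, max_steps=50):
    model = {}  # hypotheses: action -> believed change in position
    pos, steps = env.pos, 0
    done = pos == env.goal
    while not done and steps < max_steps:
        untried = [a for a in env.actions if a not in model]
        if untried:
            action = untried[0]  # explore: try an action with unknown effect
        else:
            # plan: pick the action predicted to land closest to the goal
            action = min(env.actions, key=lambda a: abs(env.goal - (pos + model[a])))
        new_pos, done = env.step(action)
        model[action] = new_pos - pos  # learn and adapt: keep the latest observation
        pos = new_pos
        steps += 1
    return steps, model


steps, model = run_agent(ToyEnv())
print(steps, model)
```

A real environment would hide far more than this (the goal, the rules, even what counts as progress), which is what makes the exploration and hypothesis steps hard.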

4

ARC-AGI-3 is a useful research tool to analyze model behavior

Key failure modes seen in our early testing:
• Thinking it is playing another game
• Holding on to early hypotheses
• Failing to forecast into the future

Both AI + human runs have sharable replays

Watch Gemini 3.1 do well on some games, poorly on others:
arcprize.org/replay/34a9614…
arcprize.org/replay/d0e0768…

5

ARC-AGI-3 gives us a formal measure to compare human and AI skill acquisition efficiency

Humans don’t brute force - they build mental models, test ideas, and refine quickly

How close is AI to that? (Spoiler: not close)
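
To make "skill acquisition efficiency" concrete, here is one way such a comparison could be expressed. The thread does not give the actual formula, so this is an assumption for illustration only: `relative_efficiency` is a hypothetical helper that compares how many actions a human needed to finish a level with how many the AI needed, capped at 1.0, and the numbers in the usage lines are made up.

```python
from typing import Optional


def relative_efficiency(human_actions: int, ai_actions: Optional[int]) -> float:
    """Hypothetical score in [0, 1]: 1.0 means the AI matched or beat the human action count."""
    if human_actions <= 0:
        raise ValueError("human_actions must be positive")
    if ai_actions is None:
        return 0.0  # assumption: an unfinished level scores zero
    if ai_actions <= 0:
        raise ValueError("ai_actions must be positive")
    return min(1.0, human_actions / ai_actions)


# Illustrative numbers only, not benchmark results.
print(relative_efficiency(40, 400))   # AI needed 10x more actions than the human
print(relative_efficiency(40, None))  # AI never finished the level
```

Under this kind of measure, brute-force exploration is penalized directly: every extra action an agent spends lowers its score relative to the human baseline.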

6

Also live today: ARC Prize 2026 - 3 tracks, $2,000,000 in prizes available!

Get involved:
• Play a Game: arcprize.org/tasks/ls20
• Build Agents: docs.arcprize.org
• Win Prizes: arcprize.org/competitions/2…
