@TheTuringPost: Absolute Zero is a new paradig...

@TheTuringPost
May 13, 2025
1
Absolute Zero is a new paradigm from @Tsinghua_Uni that encourages models to learn without human-labeled data.

It's a self-play process, where the model is both a proposer and a solver.

- A model creates its own tasks to learn from.
- It solves these tasks on its own, using feedback from an environmental tool.

Based on this, researchers also built the Absolute Zero Reasoner (AZR) system.

This paradigm shows that you don't need thousands of outside data examples or human guidance to get SOTA results.

Details 🧵
2
1. Roles and rewards in Absolute Zero:

The model plays 2 roles:

- A proposer: It invents a new reasoning task.
- A solver: It tries to solve that task.

An environment tool checks if the task makes sense and provides the right answer. The model then tries to answer the task. If it does well, it gets rewarded.

There are 2 types of feedback:
- One for coming up with a good, learnable task.
- Another for solving it correctly.
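
The two feedback signals above can be sketched in a few lines of Python. This is only an illustration: the thread gives no formulas, so the function names and the exact scoring rule are assumptions. Here the solver gets a binary reward, and the proposer is rewarded for tasks that are neither always solved nor never solved.

```python
from typing import List


def solver_reward(answer: object, reference: object) -> float:
    """Return 1.0 if the solver's answer matches the environment's answer."""
    return 1.0 if answer == reference else 0.0


def proposer_reward(solve_rewards: List[float]) -> float:
    """Score a proposed task by how learnable it looks.

    Assumption (not stated in the thread): learnability is scored as
    1 - average solve rate, and tasks that are trivial (rate 1.0) or
    never solved (rate 0.0) earn nothing.
    """
    if not solve_rewards:
        return 0.0
    rate = sum(solve_rewards) / len(solve_rewards)
    if rate == 0.0 or rate == 1.0:
        return 0.0
    return 1.0 - rate
```

Under this assumed rule, a task solved in 1 of 4 attempts scores higher for the proposer than one solved in 3 of 4, which pushes the proposer toward hard but still solvable tasks.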
3
2. The Absolute Zero Reasoner (AZR) is the first working system that fully uses the Absolute Zero paradigm.

You only need a single very simple "return" program to kickstart AZR's self-training loop.
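
As a rough sketch of what such a seed might look like, assume the "return" program is an identity function stored with one input and its output. The names `SEED_SOURCE`, `run_program`, and `make_seed_buffer` are hypothetical, and the dictionary layout is an assumption rather than AZR's actual format.

```python
from typing import Any, Dict, List

# Assumption: the seed is an identity function that returns its input.
SEED_SOURCE = "def f(x):\n    return x\n"


def run_program(source: str, arg: Any) -> Any:
    """Define f from source and call it. No sandboxing: sketch only."""
    namespace: Dict[str, Any] = {}
    exec(source, namespace)
    return namespace["f"](arg)


def make_seed_buffer(seed_input: Any = "Hello World") -> List[Dict[str, Any]]:
    """Build a task buffer holding one (program, input, output) entry."""
    output = run_program(SEED_SOURCE, seed_input)
    return [{"program": SEED_SOURCE, "input": seed_input, "output": output}]
```

Later proposals could then be conditioned on entries drawn from this buffer, which grows as the model adds tasks that pass validation.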
4
3. AZR uses code problems as its main learning tool.

- It creates and solves a set of coding tasks based on past tasks it already made and solved, and the type of reasoning it wants to practice (deduction, abduction, or induction).
- Python is used to check whether the tasks are valid and then whether the model's answers are correct.
- AZR uses 2 scores for training: one for proposing good tasks and one for solving them.
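
One way to picture the Python checks for the three reasoning types is the sketch below. It assumes each task is built from a program `f`, an input, and an output: deduction predicts the output, abduction predicts an input, and induction predicts the program from input/output examples. All function names are hypothetical, and a real system would run the code in a sandbox.

```python
from typing import Any, Dict, List, Tuple


def run_program(source: str, arg: Any) -> Any:
    """Define f from source and call it. No sandboxing: sketch only."""
    namespace: Dict[str, Any] = {}
    exec(source, namespace)
    return namespace["f"](arg)


def is_valid_task(source: str, arg: Any) -> bool:
    """Valid if the program runs without error and repeats its output."""
    try:
        return run_program(source, arg) == run_program(source, arg)
    except Exception:
        return False


def check_deduction(source: str, arg: Any, predicted_output: Any) -> bool:
    """Given program and input, the model predicts the output."""
    return run_program(source, arg) == predicted_output


def check_abduction(source: str, predicted_input: Any, target: Any) -> bool:
    """Given program and output, any input producing that output counts."""
    try:
        return run_program(source, predicted_input) == target
    except Exception:
        return False


def check_induction(predicted_source: str,
                    examples: List[Tuple[Any, Any]]) -> bool:
    """Given input/output pairs, the model writes a program matching all."""
    try:
        return all(run_program(predicted_source, i) == o for i, o in examples)
    except Exception:
        return False
```

Abduction is checked by re-running the program rather than comparing inputs directly, since several inputs can map to the same output.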
5
4. Even though AZR was trained without any human-written data, it has impressive results:

- It beat the best "zero-data" models by +1.8%
- AZR improved its math score by +15.2%, vs. +0.65% for other top models

Also:
- Bigger models learn more
- Code helps general reasoning (even in math)
- Emergent planning: AZR starts writing step-by-step explanations as comments.