Visualize Thread by @andthatto

andthattoo

@andthatto

Qwen 3.6 is frontier for local.

It also thinks forever.

I tried a dumb inference-time trick: make its <think> block obey a tiny grammar.

Result:
- HumanEval+: 22x fewer think tokens, no accuracy loss
- LiveCodeBench public slice: +14 points pass@1 (50% to 64%), ~5x fewer total tokens

No finetuning.
Just GBNF-constrained decoding.

The constraint is applied only to the reasoning block, not the final answer/code.
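
To make the idea concrete, here is a minimal sketch of what such a grammar could look like. It is not the grammar from the writeup or repo: the field names (GOAL, APPROACH, EDGE) and the helper `build_grammar` are hypothetical, and it assumes the model emits a literal `<think>` ... `</think>` block at the start of its output. The think block gets a fixed shape, while the answer rule accepts any text.

```python
# Hypothetical sketch: NOT the grammar used in the thread or its repo.
# Field names below are illustrative assumptions.
THINK_FIELDS = ("GOAL", "APPROACH", "EDGE")


def build_grammar(fields=THINK_FIELDS) -> str:
    """Return GBNF text: a fixed-shape <think> block, then free-form output."""
    # One rule per field, e.g.  goal ::= "GOAL: " line
    field_rules = [f'{name.lower()} ::= "{name}: " line' for name in fields]
    think_body = " ".join(name.lower() for name in fields)
    rules = [
        "root ::= think answer",
        # Only the reasoning block is forced into a fixed sequence of fields.
        f'think ::= "<think>\\n" {think_body} "</think>\\n"',
        *field_rules,
        # A field value is a single non-empty line.
        'line ::= [^\\n]+ "\\n"',
        # The final answer/code is left unconstrained: any characters at all.
        'answer ::= ([^\\n] | "\\n")*',
    ]
    return "\n".join(rules) + "\n"


GRAMMAR = build_grammar()
print(GRAMMAR)
```

The resulting string could then be handed to a GBNF-aware runtime such as llama.cpp. Depending on the chat template, the opening `<think>` tag may already be part of the prompt, in which case the `think` rule would need to start after it.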

On HumanEval+ with Qwen3.6-35B-A3B:

Free-form thinking:

92.1% pass@1
3087 mean think tokens

Grammar:

92.7% pass@1
138 mean think tokens

Same accuracy band.
~22x fewer thinking tokens.

Then I tried a recent LiveCodeBench v6 LeetCode slice.

Free-form: 50% pass@1 and 11553 mean think tokens
Grammar: 64% pass@1 and 267 mean think tokens
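
As a quick sanity check on the arithmetic, the snippet below recomputes the pass@1 deltas and think-token ratios from the numbers quoted in this thread. The dictionary layout and the `compare` helper are just illustrative; only the figures come from the thread.

```python
# Figures copied from the thread: (pass@1 in percent, mean think tokens).
runs = {
    "HumanEval+": {"free": (92.1, 3087), "grammar": (92.7, 138)},
    "LiveCodeBench v6 slice": {"free": (50.0, 11553), "grammar": (64.0, 267)},
}


def compare(free, grammar):
    """Return (pass@1 change in points, think-token reduction factor)."""
    (free_acc, free_tok), (gram_acc, gram_tok) = free, grammar
    return round(gram_acc - free_acc, 1), round(free_tok / gram_tok, 1)


summary = {name: compare(r["free"], r["grammar"]) for name, r in runs.items()}
for name, (delta, ratio) in summary.items():
    print(f"{name}: {delta:+.1f} points pass@1, {ratio}x fewer think tokens")
```

On the LiveCodeBench slice the think-token ratio comes out far larger than the ~5x total-token saving quoted at the top of the thread, which suggests much of the remaining output sits outside the think block.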

This is not “reasoning disappeared.”

On harder tasks, some reasoning moved into comments / post-think answer text.

The effect is also sensitive to how the grammar is constructed.
I suspect task-specific grammars could be discovered through @DSPyOSS-style prompt optimization.

My insight is that a lot of verbose CoT is scaffolding, not essential computation.

Constrained decoding can force a denser interface to the model’s latent reasoning.

But if the task really needs more deliberation, it leaks somewhere else.
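
One rough way to see that leakage is to split each completion at the closing think tag and count how much text lands in the answer and in its comments. The sketch below is an assumption-heavy proxy, not the measurement used in the writeup: it counts whitespace-separated words instead of real tokens, only recognizes Python-style `#` comments, and the helper names are made up.

```python
import re


def split_completion(text: str, close_tag: str = "</think>"):
    """Split raw model output into (think_part, answer_part)."""
    think, sep, answer = text.partition(close_tag)
    if not sep:  # no closing tag found: treat everything as answer text
        return "", text
    return think.replace("<think>", "", 1), answer


def leak_stats(text: str) -> dict:
    """Crude word counts for the think block, the answer, and answer comments."""
    think, answer = split_completion(text)
    # Naive: any '#' to end of line counts as a comment (ignores string literals).
    comment_words = sum(len(m.split()) for m in re.findall(r"#(.*)", answer))
    return {
        "think_words": len(think.split()),
        "answer_words": len(answer.split()),
        "comment_words": comment_words,
    }


sample = (
    "<think>\nGOAL: sum a list\n</think>\n"
    "def f(xs):\n    # add items one by one\n    return sum(xs)\n"
)
stats = leak_stats(sample)
print(stats)
```

Comparing `comment_words` and `answer_words` between free-form and grammar runs on the same problems would give a first hint of how much deliberation moved out of the think block.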

I think this is a useful middle ground between:

- verbose CoT at inference
- training models to reason in latent space

Just constrain the text interface.

Full writeup + results:

andthattoo.dev/blog/structure…

and repo: github.com/andthattoo/str…
