

@TheAhmadOsman
5 views Jan 22, 2026
INCREDIBLE

Someone on r/LocalLLaMA did an incredibly practical thing

They took a tiny 0.6B model that was trash at its task (Text2SQL)
Created a knowledge distillation agent with a Claude Code skill
And made the 0.6B model behave like a specialist using 100 examples

The problem
> Small Language Models are “generally helpful”
> but specialized tasks are “exact or you die”
> you ask: “Which artists have >1M album sales?”
> the model answers: “check if genre is NULL”

The old way to fix this
> Finetune the model:
> collect + clean data
> build training pipeline
> tune hparams
> rerun when it’s wrong
> accidentally become the unpaid intern of your own experiment

The new way
> Knowledge distillation via a Claude skill
> use a strong teacher (DeepSeek-V3)
> generate synthetic pairs from a small seed set
> train a tiny student to imitate the teacher on your task
> ship it as GGUF / HF / LoRA
> run it locally
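
A rough sketch of the "generate synthetic pairs from a small seed set" step, in Python. This is not the skill's actual code: `few_shot_prompt`, `synthesize`, and `fake_teacher` are made-up names, and the teacher is just any callable that maps a prompt to text, so the real teacher API call is left out.

```python
from typing import Callable, Dict, List

Pair = Dict[str, str]


def few_shot_prompt(seeds: List[Pair], question: str) -> str:
    """Format the seed pairs as few-shot examples, then append the new question."""
    shots = "\n\n".join(f"Q: {s['question']}\nSQL: {s['sql']}" for s in seeds)
    return f"{shots}\n\nQ: {question}\nSQL:"


def synthesize(
    seeds: List[Pair],
    questions: List[str],
    teacher: Callable[[str], str],
) -> List[Pair]:
    """Ask the teacher to answer each new question; keep non-empty answers."""
    pairs: List[Pair] = []
    for question in questions:
        sql = teacher(few_shot_prompt(seeds, question)).strip()
        if sql:  # drop empty teacher outputs instead of training on them
            pairs.append({"question": question, "sql": sql})
    return pairs


def fake_teacher(prompt: str) -> str:
    """Hypothetical stand-in so the sketch runs offline; swap in a real model call."""
    return "SELECT 1;"
```

The student is then trained on the returned pairs, so whatever the teacher writes here is what the student learns to imitate.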

Distillation isn’t “creating skill”
It’s compressing skill

THE REAL HACK: agent-as-interface
> They wrapped the whole distillation loop in an agent “skill”:
> picks task type (QA / classification / tool calling / RAG)
> converts messy inputs into clean JSONL
> runs teacher eval first
> kicks off distillation + monitors progress
> packages weights for you to run locally
This is the quiet unlock
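
What "converts messy inputs into clean JSONL" could look like, as a minimal Python sketch. The field names (`question` / `input`, `answer` / `output`) and the output schema are assumptions for illustration; the thread does not say what format the skill actually expects.

```python
import json
from typing import Dict, Iterable


def to_jsonl(traces: Iterable[Dict[str, str]]) -> str:
    """Normalize loosely structured records into one JSON object per line."""
    lines = []
    for trace in traces:
        # Accept either naming convention (both are assumed, not from the source).
        prompt = (trace.get("question") or trace.get("input") or "").strip()
        answer = (trace.get("answer") or trace.get("output") or "").strip()
        if not prompt or not answer:
            continue  # skip incomplete records rather than guessing at them
        lines.append(json.dumps({"input": prompt, "output": answer}, ensure_ascii=False))
    return "\n".join(lines)
```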

Why “teacher eval first” is elite behavior
> distillation amplifies competence and incompetence
> if the teacher is wrong, the student learns wrong faster
> garbage in -> efficient garbage out
Adult supervision, but for models
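
A minimal sketch of a "teacher eval first" gate in Python. Assumptions, loudly: the run in the thread used LLM-as-judge, while this sketch swaps in normalized exact match as a stand-in judge; the 0.75 threshold and all function names are made up for illustration.

```python
from typing import Callable, Dict, List


def normalize(sql: str) -> str:
    """Lowercase, drop a trailing semicolon, and collapse whitespace."""
    return " ".join(sql.strip().rstrip(";").lower().split())


def exact_match(predicted: str, gold: str) -> bool:
    """Stand-in judge (assumption): the source used an LLM judge instead."""
    return normalize(predicted) == normalize(gold)


def teacher_accuracy(
    examples: List[Dict[str, str]],
    teacher: Callable[[str], str],
    judge: Callable[[str, str], bool] = exact_match,
) -> float:
    """Fraction of seed examples the teacher answers correctly."""
    if not examples:
        return 0.0
    correct = sum(judge(teacher(ex["question"]), ex["sql"]) for ex in examples)
    return correct / len(examples)


def ready_to_distill(accuracy: float, threshold: float = 0.75) -> bool:
    """Only start distillation if the teacher clears the (hypothetical) bar."""
    return accuracy >= threshold
```

If the gate fails, the fix is on the teacher side (better prompt, better seeds, different teacher), not more student training.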

The run breakdown:
> seed: ~100 raw conversation traces
> teacher (LLM-as-judge): ~80%
> base 0.6B: ~36%
> distilled 0.6B: ~74%
> output: ~2.2GB GGUF
> runs locally with llama.cpp
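
One way to read those numbers, as a small Python calculation. The inputs are the approximate scores quoted above; the two metrics (gap recovered, teacher retention) are my framing, not something the original post reports.

```python
def gap_recovered(base: float, distilled: float, teacher: float) -> float:
    """Share of the base-to-teacher gap that distillation closed."""
    return (distilled - base) / (teacher - base)


def teacher_retention(distilled: float, teacher: float) -> float:
    """Distilled score as a fraction of the teacher's score."""
    return distilled / teacher


# Approximate scores from the run breakdown (percent).
BASE, DISTILLED, TEACHER = 36.0, 74.0, 80.0

gap = gap_recovered(BASE, DISTILLED, TEACHER)        # 38 / 44, about 0.86
retention = teacher_retention(DISTILLED, TEACHER)    # 74 / 80 = 0.925
```

So the 0.6B student closes roughly 86% of the gap to the teacher and lands at about 92% of the teacher's score.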

Before vs after (the entire reason you do this)
> before: wrong tables, wrong logic, nonsense SQL
> after: correct JOINs, GROUP BY, HAVING
> aka “this query actually executes and answers the question”
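
"Actually executes and answers the question" is checkable. A minimal sketch using Python's built-in `sqlite3`; the schema, rows, and both queries are invented for illustration and are not from the original run.

```python
import sqlite3
from typing import List, Optional, Tuple

# Hypothetical toy schema and data.
SCHEMA = """
CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT, genre TEXT);
CREATE TABLE albums (id INTEGER PRIMARY KEY, artist_id INTEGER, sales INTEGER);
INSERT INTO artists VALUES (1, 'A', 'rock'), (2, 'B', NULL);
INSERT INTO albums VALUES (1, 1, 700000), (2, 1, 600000), (3, 2, 100000);
"""

# "After"-style query: JOIN + GROUP BY + HAVING.
GOOD = """
SELECT a.name FROM artists a
JOIN albums b ON b.artist_id = a.id
GROUP BY a.id
HAVING SUM(b.sales) > 1000000
"""

# "Before"-style query: wrong table name, irrelevant filter.
BAD = "SELECT name FROM artist WHERE genre IS NULL"


def run_query(sql: str, schema: str = SCHEMA) -> Optional[List[Tuple]]:
    """Run sql against a fresh in-memory database; None means it failed to execute."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        return conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    finally:
        conn.close()
```

Executing is the floor, not the ceiling: a query can run fine and still answer the wrong question, so comparing the returned rows against a reference answer is the stronger check.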

What this really means (bigger than Text2SQL)
You don’t need a giant model for every job

You need tiny specialists that understand your world:
> internal schemas
> service / OS logs
> tool outputs
> company-specific workflows

TL;DR
> “fine-tuning is hard” is mostly “the pipeline is annoying”
> distillation skill turns 10–100 examples into a real specialist
> the agent wrapper turns the whole thing into a conversation
> this is how you get practical local SLMs
> without becoming an MLOps monk

Small & Specialized models
> High-leverage
> Boringly effective
> Exactly where this is going

The future is
Local inference
Lower latency
Fewer secrets leaving the building