Ahmad (@TheAhmadOsman)

2 Unrolled Threads

💻 Tech & Development · 🤖 AI & Machine Learning · 🎨 Design & UX

> You don't pick an inference engine first. You pick a hardware strategy, a workload shape, and a serving model. The engine follows. ...

Jun 21, 2026

Thread Archive

INCREDIBLE. Someone on r/LocalLLaMA did an incredibly practical thing. They took a tiny 0.6B model that was trash at its task (Text2SQL), created a knowledge distillation agent with a Claude Code skill, and made the 0.6B model behave like a specialist using 100 examples. The problem > Small Language Model...

Jan 22, 2026