Let me demonstrate the true power of llama.cpp:
- Running on Mac Studio M2 Ultra (3 years old)
- Gemma 4 26B A4B Q8_0 (8-bit quantization, near-full quality)
- Built-in WebUI (ships with llama.cpp)
- MCP support out of the box (web search, HF, GitHub, etc.)
- Prompt speculative decoding
The result: 300 t/s
(realtime video...
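For anyone wondering how drafting from the prompt can speed things up: a rough sketch of one draft-model-free approach, often called prompt lookup decoding. This is an assumption about what "prompt speculative decoding" refers to here, not llama.cpp's actual code; the function names and parameters (`draft_from_prompt`, `accept_prefix`, `ngram`, `max_draft`) are hypothetical. The idea: find an earlier copy of the last few tokens in the context, propose whatever followed it as a draft, and let the model verify the whole draft in one batch.

```python
from typing import List, Sequence


def draft_from_prompt(tokens: Sequence[int], ngram: int = 3, max_draft: int = 8) -> List[int]:
    """Propose draft tokens by matching the trailing n-gram earlier in the context.

    Hypothetical helper for illustration only, not a llama.cpp API.
    """
    if len(tokens) < ngram + 1:
        return []
    tail = list(tokens[-ngram:])
    # Scan right to left so the most recent earlier match wins.
    # The upper bound guarantees at least one token follows the match.
    for start in range(len(tokens) - ngram - 1, -1, -1):
        if list(tokens[start:start + ngram]) == tail:
            return list(tokens[start + ngram:start + ngram + max_draft])
    return []


def accept_prefix(draft: Sequence[int], verified: Sequence[int]) -> int:
    """Count leading draft tokens that agree with the model's own choices.

    `verified` stands in for the tokens the target model would have produced
    at each draft position (assumed to come from one batched forward pass).
    """
    accepted = 0
    for d, v in zip(draft, verified):
        if d != v:
            break
        accepted += 1
    return accepted


if __name__ == "__main__":
    context = [1, 2, 3, 4, 5, 6, 1, 2, 3]
    draft = draft_from_prompt(context, ngram=3, max_draft=4)
    print("draft:", draft)
    print("accepted:", accept_prefix(draft, [4, 5, 9, 1]))
```

Every accepted draft token is a token the model did not have to generate one step at a time, so the gain depends on how often the output repeats text already in the context (code edits, quoting, summaries).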
Apr 05, 2026