Georgi Gerganov
@ggerganov
Let me demonstrate the true power of llama.cpp:

- Running on Mac Studio M2 Ultra (3 years old)
- Gemma 4 26B A4B Q8_0 (full quality)
- Built-in WebUI (ships with llama.cpp)
- MCP support out of the box (web-search, HF, github, etc.)
- Prompt speculative decoding

The result: 300 t/s

(realtime video)
Georgi Gerganov
@ggerganov
Of course, this is a trivial example of prompt-based speculative decoding, because the model recites sections from what is already in the prompt (so don't get too excited 😉).
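
To make the idea concrete, here is a minimal sketch of prompt-based drafting with greedy verification. It is not llama.cpp's actual implementation: the function names, the n-gram size, and the `next_token` callback are all hypothetical, and a real engine would verify the whole draft in one batched forward pass rather than one call per token.

```python
from typing import Callable, List, Sequence


def draft_from_prompt(tokens: Sequence[int], ngram: int = 3,
                      max_draft: int = 8) -> List[int]:
    """Propose draft tokens by copying what followed the most recent
    earlier occurrence of the last `ngram` tokens (hypothetical helper)."""
    if len(tokens) <= ngram:
        return []
    tail = list(tokens[-ngram:])
    # Scan earlier positions, most recent first, for the same n-gram.
    for start in range(len(tokens) - ngram - 1, -1, -1):
        if list(tokens[start:start + ngram]) == tail:
            follow = start + ngram
            return list(tokens[follow:follow + max_draft])
    return []


def accept_draft(tokens: Sequence[int], draft: Sequence[int],
                 next_token: Callable[[Sequence[int]], int]) -> List[int]:
    """Greedy verification: keep draft tokens while they match the model's
    own choice. `next_token` stands in for the model (an assumption)."""
    ctx = list(tokens)
    accepted: List[int] = []
    for d in draft:
        t = next_token(ctx)      # what the model would actually emit
        accepted.append(t)
        ctx.append(t)
        if t != d:               # first mismatch: discard the rest of the draft
            return accepted
    accepted.append(next_token(ctx))  # whole draft matched: one extra token
    return accepted


# Toy "model" that recites a 6-token cycle already present in the context.
history = [1, 2, 3, 4, 5, 6, 1, 2, 3]
draft = draft_from_prompt(history, ngram=3, max_draft=3)
recite = lambda ctx: ctx[len(ctx) - 6]
new_tokens = accept_draft(history, draft, recite)
```

When the model really is reciting the prompt, every drafted token is accepted and several tokens come out per verification step, which is why the speedup in this kind of demo is so large.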

Still, it's a nice and quick showcase of some of llama.cpp's capabilities.
Georgi Gerganov
@ggerganov
llama.cpp with its integrated WebUI is effectively the most lightweight and self-contained agent that you can run locally.

Here are a few more examples of using @huggingface MCP to search for models