
SemiAnalysis
@SemiAnalysis_
People think AI inference margins are a race to the bottom. Anthropic's gross margins were -94% in 2024. MiniMax was -25%. The narrative made sense. (1/5) 🧵
Then something changed. Zhipu raised prices 30% in February 2026, the first hike in China's AI market. It sold out instantly. ARR went 25x in 10 months. (2/5)
The secret is interactivity: tokens per second per user. It's the dial labs slide between margin and user happiness. Customer requirements depend on the workload, and throughput and costs depend on the hardware. At SemiAnalysis, we think Inference Provider Gross Margins should blend to ~60%. The chart below shows how outcomes vary significantly across hardware. (3/5)
Thread image
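A rough sketch of the arithmetic behind the interactivity dial, assuming a simple cost model: one accelerator rented by the hour, serving a fixed number of concurrent users at a given tokens per second per user. Every number below (`GPU_COST_PER_HOUR`, `PRICE_PER_MILLION`, the two operating points) is hypothetical and chosen only for illustration; none of it comes from SemiAnalysis data, and the function names are made up for this example.

```python
def gross_margin(revenue: float, cost: float) -> float:
    """Gross margin as a fraction of revenue, e.g. -0.94 means -94%."""
    return (revenue - cost) / revenue


def cost_per_million_tokens(gpu_cost_per_hour: float,
                            concurrent_users: int,
                            tokens_per_sec_per_user: float) -> float:
    """Serving cost per one million output tokens on one accelerator."""
    tokens_per_hour = concurrent_users * tokens_per_sec_per_user * 3600
    return gpu_cost_per_hour / tokens_per_hour * 1_000_000


# Hypothetical inputs, for illustration only.
GPU_COST_PER_HOUR = 2.0   # dollars per accelerator-hour (assumed)
PRICE_PER_MILLION = 2.0   # dollars charged per million tokens (assumed)

# (concurrent users, tokens/sec/user): fewer users get a faster stream,
# heavier batching serves more users at lower per-user speed.
operating_points = {
    "high interactivity": (8, 100.0),
    "heavy batching": (64, 25.0),
}

for name, (users, tps) in operating_points.items():
    cost = cost_per_million_tokens(GPU_COST_PER_HOUR, users, tps)
    margin = gross_margin(PRICE_PER_MILLION, cost)
    print(f"{name}: {tps:.0f} tok/s/user, "
          f"cost ${cost:.2f}/M tokens, margin {margin:.0%}")
```

In this toy model, heavier batching lowers cost per token and raises margin at a fixed price, but each user sees a slower stream. The same `gross_margin` function also shows what a -94% margin means: costs of 194 against revenue of 100.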
We know interactivity matters. Moonshot tried aggressive batching to cut costs. Users left. They added a premium tier. DeepSeek lost share serving their own model the same way. (4/5)
AI inference isn't a commodity. It's a managed experience. Labs that understand the interactivity lever operate at 60%+ margins. The rest race to zero. (5/5)