@ArtificialAnlys: Mistral Medium 3 independent e...

@ArtificialAnlys
31 views May 09, 2025
1
Mistral Medium 3 independent evals: Mistral is back amongst the leading non-reasoning models with Medium 3 rivalling Llama 4 Maverick, Gemini 2.0 Flash and Claude 3.7 Sonnet

Key takeaways:
➤ Intelligence: We see substantial intelligence gains across all 7 of our evals compared to @MistralAI Large 2. Medium 3 has made especially large gains in Coding and Mathematical reasoning, where it exceeds Llama 4 Maverick in both our Coding Index (LiveCodeBench, SciCode) and Math Index (AIME2024, MATH-500). The model performs well in our MMLU-Pro and GPQA evaluations but is closer to Llama 4 Scout than Maverick.
➤ Pricing: Alongside the intelligence increase vs. Mistral Large 2, Medium 3 offers a substantial price decrease. Mistral Medium 3 is priced at $0.4/$2 per 1M Input/Output tokens, an 80%/67% decrease in price vs. Mistral Large 2 ($2/$6).
➤ Proprietary: Mistral has not released the model's weights, but its announcement post hinted at releasing “large” open-weights models “over the next few weeks”, noting “we’re excited to ‘open’ up what’s to come”.🕵️‍♂️
➤ Multimodal: The model has vision capabilities, and Mistral claims it is roughly in line with Llama 4 Maverick. We have not verified this independently; we plan to publish vision evals soon.
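
The price comparison in the Pricing takeaway can be checked with a few lines of arithmetic. This is a minimal sketch using only the per-1M-token prices quoted above; the function and variable names are illustrative assumptions, not part of any Artificial Analysis or Mistral tooling.

```python
def percent_decrease(old: float, new: float) -> float:
    """Return the percentage decrease going from `old` to `new`."""
    return (old - new) / old * 100


# Prices in USD per 1M tokens, as quoted in the thread.
large_2 = {"input": 2.0, "output": 6.0}
medium_3 = {"input": 0.4, "output": 2.0}

# Percentage decrease for each token type (hypothetical helper dict).
decreases = {
    kind: percent_decrease(large_2[kind], medium_3[kind]) for kind in large_2
}

for kind, pct in decreases.items():
    print(f"{kind}: {pct:.0f}% decrease")
```

Rounded to whole percentages, this gives the 80% (input) and 67% (output) decreases stated in the thread.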

See below for further analysis, including individual evaluation scores and its Intelligence vs. Price positioning.
2
Intelligence vs. Price positioning: A substantial improvement across both dimensions, lower price and higher intelligence, compared to Mistral's Large 2
3
Token usage and efficiency: Medium 3 uses substantially more tokens than Mistral Large 2 to run our Artificial Analysis Intelligence Index, due to more verbose responses
4
Further analysis on Artificial Analysis:
artificialanalysis.ai/models?model-f…
5
Individual evaluation results (all run independently):