@ArtificialAnlys: Mistral Medium 3 independent e...
@ArtificialAnlys
May 09, 2025
Mistral Medium 3 independent evals: Mistral is back amongst the leading non-reasoning models with Medium 3 rivalling Llama 4 Maverick, Gemini 2.0 Flash and Claude 3.7 Sonnet
Key takeaways:
➤ Intelligence: We see substantial intelligence gains across all 7 of our evals compared to @MistralAI Large 2. Medium 3 has made especially large gains in coding and mathematical reasoning, where it exceeds Llama 4 Maverick in both our Coding Index (LiveCodeBench, SciCode) and Math Index (AIME2024, MATH-500). The model performs well in our MMLU-Pro and GPQA evaluations but is closer to Llama 4 Scout than Maverick.
➤ Pricing: Alongside the intelligence increase vs. Mistral Large 2, Medium 3 offers a substantial price decrease. Mistral Medium 3 is priced at $0.40/$2 per 1M Input/Output tokens, an 80%/67% decrease in price vs. Mistral Large 2 ($2/$6).
➤ Proprietary: Mistral has not released the weights to the model, but its announcement post hinted at releasing “large” open weights models “over the next few weeks”, noting “we’re excited to ‘open’ up what’s to come”. 🕵️‍♂️
➤ Multimodal: The model has vision capabilities, and Mistral claims it is roughly in line with Llama 4 Maverick. We have not verified this independently; we plan to publish vision evals soon.
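The price decrease quoted in the Pricing takeaway can be checked with a quick calculation. The sketch below is illustrative only: the helper name `percent_decrease` and the dictionary layout are assumptions made for this example, and the prices are the per-1M-token figures quoted above.

```python
def percent_decrease(old: float, new: float) -> float:
    """Return the percentage drop going from `old` to `new`."""
    return (old - new) / old * 100


# Prices in USD per 1M tokens, as quoted in the post.
large_2 = {"input": 2.0, "output": 6.0}
medium_3 = {"input": 0.4, "output": 2.0}

# Compute the decrease separately for input and output tokens.
decreases = {kind: percent_decrease(large_2[kind], medium_3[kind]) for kind in large_2}

for kind, pct in decreases.items():
    print(f"{kind}: {pct:.0f}% cheaper than Mistral Large 2")
```

Rounded to whole percentages, this gives 80% for input tokens and 67% for output tokens, matching the figures in the post.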
See below for further analysis, including individual evaluation scores and its Intelligence vs. Price positioning.
Further analysis on Artificial Analysis:
artificialanalysis.ai/models?model-f…



