Artificial Analysis (@ArtificialAnlys)

5 Unrolled Threads
Thread Archive
32
🤖 AI & Machine Learning

Claude Fable 5 cost ~$6.2K to run the Artificial Analysis Intelligence Index benchmarks - the most expensive model we have ever benchmarked 🧵 Key takeaways: ➤ Intelligence Index: 60, ahead of Claude Opus 4.8 (56) and GPT-5.5 (55) ➤ Cost to run the Intelligence Index: $6.2K, 1.7× the next-highest ...

Jun 18, 2026
Thread Archive
28

We’ve added a new pseudonymous video model to our Text to Video and Image to Video Arenas. ‘HappyHorse-1.0’ is currently landing in the #1 spot for Text and Image to Video (No Audio) and the #2 spot for Text and Image to Video (With Audio). Further details coming soon. Example generations below fro...

Apr 10, 2026
Thread Archive
42

DeepSeek V3.2 is the #2 most intelligent open weights model and also ranks ahead of Grok 4 and Claude Sonnet 4.5 (Thinking) - it takes DeepSeek Sparse Attention out of ‘experimental’ status and couples it with a material boost to intelligence. @deepseek_ai V3.2 scores 66 on the Artificial Analysis I...

Dec 03, 2025
Thread Archive
45

Qwen3 model family overview: full benchmarks for all 8 Qwen3 models in both reasoning and non-reasoning modes. Key results: ➤ Qwen3 235B-A22B (Reasoning): The largest Qwen3 model scores 62 on the Artificial Analysis Intelligence Index, becoming the most intelligent open weights model ever. This is v...

May 13, 2025
Thread Archive
31

Mistral Medium 3 independent evals: Mistral is back amongst the leading non-reasoning models with Medium 3 rivalling Llama 4 Maverick, Gemini 2.0 Flash and Claude 3.7 Sonnet. Key takeaways: ➤ Intelligence: We see substantial intelligence gains across all 7 of our evals compared to @MistralAI Large 2...

May 09, 2025