Models
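Comparative table of the 128 frontier models × 31 benchmarks; the current filtered view shows 36 models. Each column is shaded as a heatmap (red = worst of the filtered set, green = best), and a dash marks a benchmark with no listed score. The Frontier Index ranking is on the home page.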
36 models · 31 benchmarks
| Model | MMLU | MMLU-Pro | GPQA-Diamond | BBH | ARC-AGI-2 | Humanity's Last Exam | MMMU | HumanEval | MBPP+ | SWE-bench-Verified | SWE-bench-Pro | CyberGym | LiveCodeBench | Aider-polyglot | Terminal-Bench-Hard | Terminal-Bench-2 | MATH-500 | AIME-2024 | AIME-2025 | GSM8K | FrontierMath | SimpleQA | IFEval | Arena-Hard | MGSM | TAU-bench | OSWorld | BrowseComp | GDPval | Arena-ELO | LiveBench |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GPT-5.5 · OpenAI · 2026-04 | — | — | 93.6 | — | 85.0 | — | — | — | — | — | 58.6 | — | — | — | — | 82.7 | — | — | — | — | 35.4 | — | — | — | — | — | 78.7 | 84.4 | 84.9 | — | — |
| Claude Fable 5 · Anthropic · 2026-06 | — | — | — | — | — | 53.0 | — | — | — | 95.0 | 80.0 | — | — | — | — | 84.3 | — | — | — | — | — | — | — | — | — | — | 85.0 | — | — | — | — |
| Claude Opus 4.8 · Anthropic · 2026-05 | — | — | 93.6 | — | — | 49.8 | — | — | — | 88.6 | 69.2 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 83.4 | — | — | — | — |
| Claude Sonnet 4.6 · Anthropic · 2026-02 | — | — | 89.9 | — | 60.4 | — | — | — | — | 79.6 | — | — | — | — | — | — | — | — | 95.6 | — | — | — | — | — | — | — | 72.5 | — | — | — | — |
| Gemini 3.1 Pro · Google DeepMind · 2026-02 | — | — | 94.3 | — | 77.1 | 44.4 | — | — | — | 80.6 | — | — | — | — | — | 68.5 | — | — | — | — | — | — | — | — | — | — | — | 85.9 | — | — | — |
| Grok 4.3 · xAI · 2026-04 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| Grok 4.20 · xAI · 2026-03 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| Muse Spark · Meta · 2026-04 | — | — | — | — | — | 58.0 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| Mistral Medium 3.5 · Mistral AI · 2026-04 | — | — | — | — | — | — | — | — | — | 77.6 | — | — | — | — | — | — | — | — | 86.3 | — | — | — | — | — | — | — | — | — | — | — | — |
| Command A+ · Cohere · 2026-05 | — | — | — | — | — | — | 75.1 | — | — | — | — | — | — | — | 25.0 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| Reka Flash 3.1 (open) · Reka · 2025-07 | — | — | — | — | — | — | — | — | — | — | — | — | 53.5 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| Jamba 1.7 Large (open) · AI21 Labs · 2025-07 | — | 57.7 | 39.0 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| DeepSeek V4 Pro (open) · DeepSeek · 2026-04 | — | 87.5 | 90.1 | — | — | 37.7 | — | 76.8 | — | 80.6 | 55.4 | — | 93.5 | — | — | — | — | — | — | 92.6 | — | — | — | — | — | — | — | — | — | — | — |
| Qwen3.7-Max · Alibaba · 2026-05 | — | 89.6 | 92.4 | — | — | 41.4 | — | — | — | 80.4 | 60.6 | — | 91.6 | — | — | 69.7 | — | — | — | — | — | — | 94.3 | — | — | — | — | — | — | — | — |
| GLM-5.2 (open) · Zhipu AI · 2026-06 | — | — | 91.2 | — | — | 40.5 | — | — | — | — | 62.1 | — | — | — | — | 81.0 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| GLM-5 · Zhipu AI · 2026-02 | — | — | 86.0 | — | — | 30.5 | — | — | — | 77.8 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| GLM-5.1 (open) · Zhipu AI · 2026-03 | — | — | 86.2 | — | — | 31.0 | — | — | — | 77.8 | 58.4 | — | — | — | — | 63.5 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| ERNIE 5.1 · Baidu · 2026-05 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| ERNIE 5.0 · Baidu · 2026-01 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| Doubao Seed 2.0 Pro · ByteDance · 2026-02 | — | 87.0 | 88.9 | — | — | — | 85.4 | — | — | 76.5 | — | — | 87.8 | — | — | — | — | — | 93.3 | — | — | — | — | — | — | — | — | — | — | — | — |
| MiMo V2.5 Pro · Xiaomi · 2026-04 | — | — | — | — | — | — | — | — | — | 78.9 | 57.2 | — | — | — | — | 68.4 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| MiMo V2.5 (open) · Xiaomi · 2026-04 | — | — | — | — | — | — | — | — | — | — | 56.1 | — | — | — | — | 65.8 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| MiniMax M3 (open) · MiniMax · 2026-06 | — | — | — | — | — | — | — | — | — | — | 59.0 | — | — | — | — | 66.0 | — | — | — | — | — | — | — | — | — | — | 70.1 | 83.5 | — | — | — |
| Nemotron 3 Ultra 550B-A55B · Nvidia · 2026-06 | — | 86.8 | 87.0 | — | — | 26.7 | — | — | — | 71.9 | — | — | 89.0 | — | — | 54.0 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| Nemotron 3 Super (open) · Nvidia · 2026-03 | — | 83.7 | 79.2 | — | — | 18.3 | — | — | — | 60.5 | — | — | 81.2 | — | — | — | — | — | 90.2 | — | — | — | — | 73.9 | — | — | — | — | — | — | — |
| AFM Server · Apple · 2025-07 | 80.0 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 89.1 | — | — | — | — | — | — | — | — |
| Amazon Nova 2 Omni · Amazon · 2025-12 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| Nova 2 Pro · Amazon · 2025-12 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| Samsung Gauss 2.3 · Samsung · 2025-09 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| Kimi K2.7-Code (open) · Moonshot AI · 2026-06 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| Kimi K2.6 (open) · Moonshot AI · 2026-04 | — | — | 90.5 | — | — | 34.7 | — | — | — | 80.2 | 58.6 | — | 89.6 | — | — | 66.7 | — | — | — | — | — | — | — | — | — | — | 73.1 | 83.2 | — | — | — |
| EXAONE 4.5 33B (open) · LG AI Research · 2026-04 | — | 83.3 | 80.5 | — | — | — | — | — | — | — | — | — | 81.4 | — | — | — | — | — | 92.9 | — | — | — | — | — | — | — | — | — | — | — | — |
| K-EXAONE 236B-A23B (open) · LG AI Research · 2026-01 | — | 83.8 | 79.1 | — | — | 13.6 | — | — | — | 49.4 | — | — | 80.7 | — | — | — | — | — | 92.8 | — | — | — | — | — | — | — | — | — | — | — | — |
| Hunyuan Hy3-preview (open) · Tencent · 2026-04 | — | — | 87.2 | — | — | — | — | — | — | 74.4 | — | — | — | — | — | 54.4 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| Ring-2.6-1T (open) · Ant Group · 2026-05 | — | — | 88.3 | — | 66.2 | — | — | — | — | 74.0 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
| Ling-2.6-1T (open) · Ant Group · 2026-04 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
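
The per-column heatmap described above the table (red = worst of the filtered set, green = best) can be approximated with a min-max normalization over the scores present in one column. The sketch below is only an illustration under stated assumptions: the function names `heat_levels` and `to_rgb`, the linear red-to-green blend, and the handling of ties are hypothetical and are not taken from the page's actual implementation. It also assumes higher scores are better for the column being shaded.

```python
from typing import Optional


def heat_levels(scores: list[Optional[float]]) -> list[Optional[float]]:
    """Map one column of scores to heat levels in [0, 1].

    0.0 is the worst score among the rows shown, 1.0 is the best.
    Missing scores (None, i.e. a dash in the table) stay None.
    """
    present = [s for s in scores if s is not None]
    if not present:
        return [None] * len(scores)
    lo, hi = min(present), max(present)
    if hi == lo:
        # Assumption: with no spread, every present score counts as "best".
        return [None if s is None else 1.0 for s in scores]
    return [None if s is None else (s - lo) / (hi - lo) for s in scores]


def to_rgb(level: float) -> tuple[int, int, int]:
    """Linear blend from red (level 0.0) to green (level 1.0)."""
    return (round(255 * (1 - level)), round(255 * level), 0)


# OSWorld scores for four rows of the table; None marks a missing score.
osworld = {
    "GPT-5.5": 78.7,
    "Claude Fable 5": 85.0,
    "Claude Opus 4.8": 83.4,
    "Grok 4.3": None,
}

levels = heat_levels(list(osworld.values()))
for model, level in zip(osworld, levels):
    colour = "no score" if level is None else to_rgb(level)
    print(f"{model}: {colour}")
```

Because the normalization uses only the rows passed in, filtering the table changes the minimum and maximum and therefore the colours, which matches the "of filtered set" wording.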