| # | Model | Provider | Type | Score | |
|---|---|---|---|---|---|
| 1 | Gemini 3.1 Pro | Google | Closed | 94.3% | |
| 2 | Claude Opus 4.7 (Adaptive) | Anthropic | Closed | 94.2% | |
| 3 | Claude Opus 4.8 | Anthropic | Closed | 93.6% | |
| 4 | GPT-5.5 | OpenAI | Closed | 93.6% | |
| 5 | GPT-5.4 | OpenAI | Closed | 92.8% | |
| 6 | Gemini 3.5 Flash | Google | Closed | 92.7% | |
| 7 | Qwen3.7 Max | Alibaba | Closed | 92.4% | |
| 8 | Kimi K2.6 | Moonshot AI | Open | 90.5% | |
| 9 | DeepSeek V4 Pro (Max) | DeepSeek | Open | 90.1% | |
| 10 | Interfaze Beta | Interfaze | Closed | 89.9% | |
| 11 | Muse Spark | Meta | Closed | 89.5% | |
| 12 | Claude Opus 4.6 | Anthropic | Closed | 89.2% | |
| 13 | DeepSeek V4 Pro (High) | DeepSeek | Open | 89.1% | |
| 14 | Grok 4.20 | xAI | Closed | 88.5% | |
| 15 | DeepSeek V4 Flash (Max) | DeepSeek | Open | 88.1% | |
| 16 | Kimi K2.5 | Moonshot AI | Open | 87.6% | |
| 17 | DeepSeek V4 Flash (High) | DeepSeek | Open | 87.4% | |
| 18 | Hy3 Preview | Tencent | Open | 87.2% | |
| 19 | MiniMax M2.7 | MiniMax | Open | 87.0% | |
| 20 | GLM-5.1 | Z.AI | Open | 86.2% | |
| 21 | GLM-5 | Z.AI | Open | 86.0% | |
| 22 | Trinity-Large-Thinking | Arcee AI | Open | 76.3% | |
| 23 | DeepSeek V4 Pro | DeepSeek | Open | 72.9% | |
| 24 | Nemotron 3 Nano Omni 30B A3B | NVIDIA | Open | 72.2% | |
| 25 | DeepSeek V4 Flash | DeepSeek | Open | 71.2% | |
| 26 | ZAYA1-8B | Zyphra | Open | 71.0% | |
| 27 | Trinity-Large-Preview | Arcee AI | Open | 63.3% | |
| 28 | ZAYA1-74B-Preview | Zyphra | Open | 57.3% | |
| 29 | MiniCPM5-1B | OpenBMB | Open | 26.3% |