| # | Model | Provider | Type | Score | |
|---|---|---|---|---|---|
| 1 | Claude Opus 4.8 | Anthropic | Closed | 87.9% | |
| 2 | GPT-5.4 | OpenAI | Closed | 85.4% | |
| 3 | Gemini 3.1 Pro | Google | Closed | 84.4% | |
| 4 | Muse Spark | Meta | Closed | 84.1% | |
| 5 | Claude Opus 4.6 | Anthropic | Closed | 83.1% | |
| 6 | Gemini 3 Pro | Google | Closed | 72.7% | |
| 7 | Holo2-235B-A22B | H Company | Open | 70.6% | |
| 8 | Qwen3.6 Plus | Alibaba | Closed | 68.2% | |
| 9 | Holo2-30B-A3B | H Company | Open | 66.1% | |
| 10 | Qwen3.5 397B | Alibaba | Open | 65.6% | |
| 11 | Holo2-8B | H Company | Open | 58.9% | |
| 12 | Nemotron 3 Nano Omni 30B A3B | NVIDIA | Open | 57.8% | |
| 13 | Holo2-4B | H Company | Open | 57.2% | |
| 14 | Claude Opus 4.5 | Anthropic | Closed | 45.7% | |