| # | Model | Provider | Type | Score | |
|---|---|---|---|---|---|
| 1 | gpt-5.5[xhigh] | OpenAI | Closed | 70% | |
| 2 | gpt-5.4[xhigh] | OpenAI | Closed | 56% | |
| 3 | claude-opus-4.7[max] | Anthropic | Closed | 54% | |
| 4 | claude-sonnet-4.6[high] | Anthropic | Closed | 32% | |
| 5 | gemini-3.5-flash[medium] | Google | Closed | 28% | |
| 6 | gpt-5.4-mini[xhigh] | OpenAI | Closed | 24% | |
| 7 | kimi-k2.6 | Moonshot AI | Open | 24% | |
| 8 | mimo-v2.5-pro | Xiaomi | Closed | 19% | |
| 9 | glm-5.1 | Z.AI | Open | 18% | |
| 10 | gemini-3.1-pro | Google | Closed | 10% | |
| 11 | deepseek-v4-pro | DeepSeek | Open | 8% | |
| 12 | gemini-3-flash | Google | Closed | 5% | |