| # | Model | Provider | Type | Score |
|---|---|---|---|---|
| 1 | Claude Opus 4.6 | Anthropic | Closed | 65.3% |
| 2 | GLM-5 | Z.AI | Open | 62.8% |
| 3 | GLM-5.1 | Z.AI | Open | 62.7% |
| 4 | DeepSeek V3.2 | DeepSeek | Open | 60.9% |
| 5 | Claude Sonnet 4.6 | Anthropic | Closed | 60.7% |
| 6 | Qwen3.5-27B | Alibaba | Open | 58.9% |
| 7 | GLM-4.7 | Z.AI | Open | 58.7% |
| 8 | Kimi K2.5 | Moonshot AI | Open | 58.5% |
| 9 | GPT-5.3 Codex | OpenAI | Closed | 58.2% |
| 10 | Composer 2 | Cursor | Closed | 58% |
| 11 | Qwen3.5-35B-A3B | Alibaba | Open | 53.7% |
| 12 | MiniMax M2.7 | MiniMax | Open | 51.9% |
| 13 | Gemma 4 31B | Google | Open | 41.6% |