| # | Model | Provider | Type | Score |
|---|---|---|---|---|
| 1 | Claude Mythos Preview | Anthropic | Closed | 83.1% |
| 2 | GPT-5.5 | OpenAI | Closed | 81.8% |
| 3 | GPT-5.4 | OpenAI | Closed | 79.0% |
| 4 | Claude Opus 4.7 (Adaptive) | Anthropic | Closed | 73.1% |
| 5 | GLM-5.1 | Z.AI | Open | 68.7% |
| 6 | Claude Opus 4.6 | Anthropic | Closed | 66.6% |
| 7 | Claude Sonnet 4.6 | Anthropic | Closed | 65.2% |
| 8 | Claude Opus 4.5 | Anthropic | Closed | 50.6% |
| 9 | Muse Spark | Meta | Closed | 43.5% |
| 10 | GLM-5 | Z.AI | Open | 43.2% |