| # | Model | Provider | Type | Score | |
|---|---|---|---|---|---|
| 1 | GPT-5.5 | OpenAI | Closed | 85% | |
| 2 | GPT-5.4 Pro | OpenAI | Closed | 83.3% | |
| 3 | Gemini 3.1 Pro | Closed | 77.1% | ||
| 4 | Claude Opus 4.7 (Adaptive) | Anthropic | Closed | 75.8% | |
| 5 | Gemini 3.5 Flash | Closed | 72.1% | ||
| 6 | Grok 4.20 | xAI | Closed | 53.3% | |
| 7 | GPT-5.2 | OpenAI | Closed | 52.9% | |
| 8 | Gemini 3 Pro Deep Think | Closed | 45.1% | ||
| 9 | Muse Spark | Meta | Closed | 42.5% | |
| 10 | Gemini 3 Pro | Closed | 31.1% | ||
| 11 | Claude Sonnet 4.5 | Anthropic | Closed | 13.6% |