| # | Model | Provider | Type | Score |
|---|---|---|---|---|
| 1 | o1 | OpenAI | Closed | 91.8% |
| 2 | GPT-4.1 | OpenAI | Closed | 90.2% |
| 3 | DeepSeek V4 Pro Base | DeepSeek | Open | 90.1% |
| 4 | DeepSeek V4 Flash Base | DeepSeek | Open | 88.7% |
| 5 | GPT-4.1 mini | OpenAI | Closed | 87.5% |
| 6 | Trinity-Large-Preview | Arcee AI | Open | 87.2% |
| 7 | o3-mini | OpenAI | Closed | 86.9% |
| 8 | GPT-4.1 nano | OpenAI | Closed | 80.1% |