| # | Model | Provider | Type | Score | |
|---|---|---|---|---|---|
| 1 | Step 3.7 Flash | StepFun | Open | 79.2% | |
| 2 | Gemini 3.1 Pro | Google | Closed | 72.4% | |
| 3 | Muse Spark | Meta | Closed | 71.3% | |
| 4 | GPT-5.4 | OpenAI | Closed | 61.1% | |
| 5 | Qwen3.6-35B-A3B | Alibaba | Open | 58.9% | |
| 6 | Grok 4.20 | xAI | Closed | 57.4% | |
| 7 | Qwen3.6-27B | Alibaba | Open | 56.1% | |