context.vn

SWE-Bench Pro

35 models evaluated

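| #  | Model                      | Provider    | Type   | Score |
|----|----------------------------|-------------|--------|-------|
| 1  | Claude Mythos Preview      | Anthropic   | Closed | 77.8% |
| 2  | Claude Opus 4.8            | Anthropic   | Closed | 69.2% |
| 3  | Claude Opus 4.7 (Adaptive) | Anthropic   | Closed | 64.3% |
| 4  | Qwen3.7 Max                | Alibaba     | Closed | 60.6% |
| 5  | MiniMax M3                 | MiniMax     | Open   | 59%   |
| 6  | GPT-5.5                    | OpenAI      | Closed | 58.6% |
| 7  | Kimi K2.6                  | Moonshot AI | Open   | 58.6% |
| 8  | GLM-5.1                    | Z.AI        | Open   | 58.4% |
| 9  | GPT-5.4                    | OpenAI      | Closed | 57.7% |
| 10 | Qwen 3.6 Max (preview)     | Alibaba     | Closed | 57.3% |
| 11 | MiMo-V2.5-Pro              | Xiaomi      | Closed | 57.2% |
| 12 | Claude Opus 4.5            | Anthropic   | Closed | 57.1% |
| 13 | GPT-5.3 Codex              | OpenAI      | Closed | 56.8% |
| 14 | Qwen3.6 Plus               | Alibaba     | Closed | 56.6% |
| 15 | Step 3.7 Flash             | StepFun     | Open   | 56.3% |
| 16 | MiniMax M2.7               | MiniMax     | Open   | 56.2% |
| 17 | MiMo-V2.5                  | Xiaomi      | Closed | 56.1% |
| 18 | GPT-5.2                    | OpenAI      | Closed | 55.6% |
| 19 | DeepSeek V4 Pro (Max)      | DeepSeek    | Open   | 55.4% |
| 20 | Gemini 3.5 Flash           | Google      | Closed | 55.1% |
| 21 | GLM-5                      | Z.AI        | Open   | 55.1% |
| 22 | DeepSeek V4 Pro (High)     | DeepSeek    | Open   | 54.4% |
| 23 | Qwen3.6-27B                | Alibaba     | Open   | 53.5% |
| 24 | Claude Opus 4.6            | Anthropic   | Closed | 53.4% |
| 25 | DeepSeek V4 Flash (Max)    | DeepSeek    | Open   | 52.6% |
| 26 | Muse Spark                 | Meta        | Closed | 52.4% |
| 27 | DeepSeek V4 Flash (High)   | DeepSeek    | Open   | 52.3% |
| 28 | DeepSeek V4 Pro            | DeepSeek    | Open   | 52.1% |
| 29 | Grok 4.20                  | xAI         | Closed | 51.8% |
| 30 | Qwen3.5 397B               | Alibaba     | Open   | 50.9% |
| 31 | Kimi K2.5                  | Moonshot AI | Open   | 50.7% |
| 32 | Qwen3.6-35B-A3B            | Alibaba     | Open   | 49.5% |
| 33 | Laguna M.1                 | Poolside    | Closed | 49.2% |
| 34 | DeepSeek V4 Flash          | DeepSeek    | Open   | 49.1% |
| 35 | Laguna XS.2                | Poolside    | Open   | 46.3% |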