context.vn

MMLU-Pro

36 models evaluated

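| # | Model | Provider | Type | Score |
|---|-------|----------|------|-------|
| 1 | Qwen3.7 Max | Alibaba | Closed | 89.6% |
| 2 | Claude Opus 4.5 | Anthropic | Closed | 89.5% |
| 3 | Qwen3.6 Plus | Alibaba | Closed | 88.5% |
| 4 | Qwen3.5 397B | Alibaba | Open | 87.8% |
| 5 | DeepSeek V4 Pro (Max) | DeepSeek | Open | 87.5% |
| 6 | DeepSeek V4 Pro (High) | DeepSeek | Open | 87.1% |
| 7 | Kimi K2.5 (Reasoning) | Moonshot AI | Closed | 87.1% |
| 8 | Kimi K2.5 | Moonshot AI | Open | 87.1% |
| 9 | Qwen3.5-122B-A10B | Alibaba | Open | 86.7% |
| 10 | DeepSeek V4 Flash (High) | DeepSeek | Open | 86.4% |
| 11 | DeepSeek V4 Flash (Max) | DeepSeek | Open | 86.2% |
| 12 | Qwen3.6-27B | Alibaba | Open | 86.2% |
| 13 | Qwen3.5-27B | Alibaba | Open | 86.1% |
| 14 | GLM-5 | Z.AI | Open | 85.7% |
| 15 | Qwen3.5-35B-A3B | Alibaba | Open | 85.3% |
| 16 | Qwen3.6-35B-A3B | Alibaba | Open | 85.2% |
| 17 | Gemma 4 31B | Google | Open | 85.2% |
| 18 | MiMo-V2-Flash | Xiaomi | Open | 84.9% |
| 19 | GLM-4.7 | Z.AI | Open | 84.3% |
| 20 | DeepSeek V4 Flash | DeepSeek | Open | 83% |
| 21 | Qwen3 235B 2507 | Alibaba | Open | 83% |
| 22 | DeepSeek V4 Pro | DeepSeek | Open | 82.9% |
| 23 | Gemma 4 26B A4B | Google | Open | 82.6% |
| 24 | Claude Opus 4.6 | Anthropic | Closed | 82% |
| 25 | Exaone 4.0 32B | LG AI Research | Open | 81.8% |
| 26 | Claude Sonnet 4.6 | Anthropic | Closed | 79.2% |
| 27 | Nemotron 3 Nano Omni 30B A3B | NVIDIA | Open | 77.3% |
| 28 | DeepSeek V3 | DeepSeek | Open | 75.9% |
| 29 | ZAYA1-8B | Zyphra | Open | 74.2% |
| 30 | DeepSeek V4 Pro Base | DeepSeek | Open | 73.5% |
| 31 | Gemma 4 E4B | Google | Open | 69.4% |
| 32 | DeepSeek V4 Flash Base | DeepSeek | Open | 68.3% |
| 33 | ZAYA1-74B-Preview | Zyphra | Open | 68.1% |
| 34 | Gemma 4 E2B | Google | Open | 60% |
| 35 | MiniCPM5-1B | OpenBMB | Open | 48.9% |
| 36 | LFM2.5-VL-450M | LiquidAI | Open | 19.3% |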