context.vn

aa MMMU-Pro

68 models evaluated

| # | Model | Provider | Type | Score |
|---|-------|----------|------|-------|
| 1 | Gemini 3.5 Flash | Google | Closed | 84.3% |
| 2 | Gemini 3.1 Pro | Google | Closed | 82.4% |
| 3 | Muse Spark | Meta | Closed | 80.5% |
| 4 | Gemini 3 Pro | Google | Closed | 80.2% |
| 5 | GPT-5.5 | OpenAI | Closed | 79.9% |
| 6 | Kimi K2.6 | Moonshot AI | Open | 79.4% |
| 7 | Claude Opus 4.7 (Adaptive) | Anthropic | Closed | 78.8% |
| 8 | Gemini 3 Flash | Google | Closed | 78.6% |
| 9 | GPT-5.3 Codex | OpenAI | Closed | 78.5% |
| 10 | GPT-5.4 | OpenAI | Closed | 78.4% |
| 11 | Grok 4.3 | xAI | Closed | 78.1% |
| 12 | Qwen3.6 Plus | Alibaba | Closed | 78.0% |
| 13 | Qwen3.5 397B (Reasoning) | Alibaba | Open | 77.3% |
| 14 | Claude Opus 4.7 | Anthropic | Closed | 76.4% |
| 15 | GPT-5.2-Codex | OpenAI | Closed | 76.3% |
| 16 | GPT-5.1 | OpenAI | Closed | 75.5% |
| 17 | Gemini 3.1 Flash-Lite | Google | Closed | 75.5% |
| 18 | Kimi K2.5 (Reasoning) | Moonshot AI | Closed | 75.4% |
| 19 | Kimi K2.5 | Moonshot AI | Open | 75.4% |
| 20 | Claude Opus 4.6 (Adaptive) | Anthropic | Closed | 75.4% |
| 21 | Qwen3.6-35B-A3B | Alibaba | Open | 75.0% |
| 22 | Qwen3.5-122B-A10B | Alibaba | Open | 75.0% |
| 23 | Qwen3.5-27B | Alibaba | Open | 75.0% |
| 24 | Gemini 2.5 Pro | Google | Closed | 74.9% |
| 25 | Qwen3.6-27B | Alibaba | Open | 74.6% |
| 26 | GPT-5 (medium) | OpenAI | Closed | 74.3% |
| 27 | GPT-5 (high) | OpenAI | Closed | 74.2% |
| 28 | Claude Opus 4.5 Thinking | Anthropic | Closed | 74.0% |
| 29 | Gemma 4 31B | Google | Open | 73.4% |
| 30 | GPT-5.4 mini | OpenAI | Closed | 73.3% |
| 31 | GLM-5V-Turbo | Z.AI | Closed | 72.8% |
| 32 | Qwen3.5-35B-A3B | Alibaba | Open | 72.7% |
| 33 | Claude Opus 4.6 | Anthropic | Closed | 72.5% |
| 34 | GPT-5.1-Codex-Max | OpenAI | Closed | 72.5% |
| 35 | GPT-5.1-Codex | OpenAI | Closed | 72.5% |
| 36 | Claude Opus 4.5 | Anthropic | Closed | 71.2% |
| 37 | Claude Sonnet 4.6 | Anthropic | Closed | 70.6% |
| 38 | o3 | OpenAI | Closed | 70.1% |
| 39 | MiMo-V2-Omni | Xiaomi | Closed | 69.9% |
| 40 | Gemma 4 26B A4B | Google | Open | 69.2% |
| 41 | Grok 4 | xAI | Closed | 68.8% |
| 42 | Claude 4.1 Opus Thinking | Anthropic | Closed | 67.9% |
| 43 | Gemini 2.5 Flash | Google | Closed | 65.5% |
| 44 | GPT-5.4 nano | OpenAI | Closed | 65.4% |
| 45 | Mistral Medium 3.5 128B | Mistral | Open | 64.9% |
| 46 | Grok 4.1 Fast (Reasoning) | xAI | Closed | 63.3% |
| 47 | Command A+ | Cohere | Open | 63.2% |
| 48 | Claude 4 Sonnet | Anthropic | Closed | 62.4% |
| 49 | Llama 4 Maverick | Meta | Open | 62.1% |
| 50 | Grok 4 Fast (Reasoning) | xAI | Closed | 61.8% |
| 51 | GPT-4.1 | OpenAI | Closed | 61.2% |
| 52 | GPT-4.1 mini | OpenAI | Closed | 58.7% |
| 53 | Mistral Small 4 (Reasoning) | Mistral | Open | 56.8% |
| 54 | Mistral Small 4 | Mistral | Open | 56.8% |
| 55 | Mistral Large 3 | Mistral | Closed | 55.7% |
| 56 | Gemini 1.5 Pro | Google | Closed | 55.0% |
| 57 | Nemotron 3 Nano Omni 30B A3B | NVIDIA | Open | 53.2% |
| 58 | Mistral Medium 3 | Mistral | Closed | 53.0% |
| 59 | Llama 4 Scout | Meta | Open | 52.9% |
| 60 | Qwen3.5 397B | Alibaba | Open | 52.7% |
| 61 | Gemma 4 E4B | Google | Open | 51.4% |
| 62 | Grok 4.1 Fast | xAI | Closed | 48.4% |
| 63 | Gemma 3 27B | Google | Open | 48.0% |
| 64 | Gemma 4 E2B | Google | Open | 44.6% |
| 65 | Nova Pro | Amazon | Closed | 44.3% |
| 66 | GPT-4o mini | OpenAI | Closed | 41.5% |
| 67 | GPT-4.1 nano | OpenAI | Closed | 40.1% |
| 68 | Claude 3 Haiku | Anthropic | Closed | 30.8% |