context.vn

MMMU-Pro

28 models evaluated

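| # | Model | Provider | Type | Score |
|---|-------|----------|------|-------|
| 1 | GPT-5.4 Pro | OpenAI | Closed | 94% |
| 2 | Claude Mythos Preview | Anthropic | Closed | 92.7% |
| 3 | Gemini 3.1 Pro | Google | Closed | 83.9% |
| 4 | Gemini 3.5 Flash | Google | Closed | 83.6% |
| 5 | GPT-5.5 | OpenAI | Closed | 81.2% |
| 6 | GPT-5.4 | OpenAI | Closed | 81.2% |
| 7 | Gemini 3 Pro | Google | Closed | 81% |
| 8 | Muse Spark | Meta | Closed | 80.4% |
| 9 | GPT-5.2 | OpenAI | Closed | 79.5% |
| 10 | Kimi K2.6 | Moonshot AI | Open | 79.4% |
| 11 | Qwen3.5 397B | Alibaba | Open | 79% |
| 12 | Qwen3.6 Plus | Alibaba | Closed | 78.8% |
| 13 | Kimi K2.5 (Reasoning) | Moonshot AI | Closed | 78.5% |
| 14 | Kimi K2.5 | Moonshot AI | Open | 78.5% |
| 15 | MiniMax M3 | MiniMax | Open | 78.1% |
| 16 | Grok 4.3 | xAI | Closed | 78.1% |
| 17 | MiMo-V2.5 | Xiaomi | Closed | 77.9% |
| 18 | Claude Opus 4.6 | Anthropic | Closed | 77.3% |
| 19 | Gemma 4 31B | Google | Open | 76.9% |
| 20 | GPT-5.4 mini | OpenAI | Closed | 76.6% |
| 21 | Qwen3.6-27B | Alibaba | Open | 75.8% |
| 22 | Qwen3.6-35B-A3B | Alibaba | Open | 75.3% |
| 23 | Grok 4.20 | xAI | Closed | 75.2% |
| 24 | Gemma 4 26B A4B | Google | Open | 73.8% |
| 25 | Interfaze Beta | Interfaze | Closed | 71.1% |
| 26 | Claude Opus 4.5 | Anthropic | Closed | 70.6% |
| 27 | GPT-5.4 nano | OpenAI | Closed | 66.1% |
| 28 | Command A+ | Cohere | Open | 63% |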