context.vn

aa Sci Code

122 models evaluated

| # | Model | Provider | Type | Score |
|---|---|---|---|---|
| 1 | Gemini 3.1 Pro | Google | Closed | 58.9% |
| 2 | GPT-5.4 | OpenAI | Closed | 56.6% |
| 3 | GPT-5.5 | OpenAI | Closed | 56.1% |
| 4 | Gemini 3 Pro | Google | Closed | 56.1% |
| 5 | GPT-5.2-Codex | OpenAI | Closed | 54.6% |
| 6 | Claude Opus 4.7 (Adaptive) | Anthropic | Closed | 54.5% |
| 7 | Kimi K2.6 | Moonshot AI | Open | 53.5% |
| 8 | GPT-5.3 Codex | OpenAI | Closed | 53.2% |
| 9 | Gemini 3.5 Flash | Google | Closed | 53.1% |
| 10 | GPT-5.2 | OpenAI | Closed | 52.1% |
| 11 | Claude Opus 4.6 (Adaptive) | Anthropic | Closed | 51.9% |
| 12 | Muse Spark | Meta | Closed | 51.5% |
| 13 | MiMo-V2.5-Pro | Xiaomi | Closed | 50.2% |
| 14 | Claude Opus 4.7 | Anthropic | Closed | 50.1% |
| 15 | DeepSeek V4 Pro (Max) | DeepSeek | Open | 50.0% |
| 16 | Gemini 3 Flash | Google | Closed | 49.9% |
| 17 | GPT-5.4 mini | OpenAI | Closed | 49.9% |
| 18 | Claude Opus 4.5 Thinking | Anthropic | Closed | 49.5% |
| 19 | Kimi K2.5 (Reasoning) | Moonshot AI | Closed | 49.0% |
| 20 | Kimi K2.5 | Moonshot AI | Open | 49.0% |
| 21 | Qwen3.7 Max | Alibaba | Closed | 48.8% |
| 22 | Grok 4.3 | xAI | Closed | 47.3% |
| 23 | Claude Opus 4.5 | Anthropic | Closed | 47.0% |
| 24 | MiniMax M2.7 | MiniMax | Open | 47.0% |
| 25 | Claude Sonnet 4.6 | Anthropic | Closed | 46.9% |
| 26 | Qwen 3.6 Max (preview) | Alibaba | Closed | 46.9% |
| 27 | GPT-5.4 nano | OpenAI | Closed | 46.9% |
| 28 | DeepSeek V4 Pro (High) | DeepSeek | Open | 46.4% |
| 29 | GLM-5 | Z.AI | Open | 46.2% |
| 30 | Claude Opus 4.6 | Anthropic | Closed | 45.7% |
| 31 | Grok 4 | xAI | Closed | 45.7% |
| 32 | GLM-4.7 | Z.AI | Open | 45.1% |
| 33 | DeepSeek V4 Flash (Max) | DeepSeek | Open | 44.9% |
| 34 | Grok 4.1 Fast (Reasoning) | xAI | Closed | 44.2% |
| 35 | Grok 4 Fast (Reasoning) | xAI | Closed | 44.2% |
| 36 | GLM-5.1 | Z.AI | Open | 43.8% |
| 37 | GLM-5-Turbo | Z.AI | Closed | 43.6% |
| 38 | GLM-5V-Turbo | Z.AI | Closed | 43.5% |
| 39 | Gemma 4 31B | Google | Open | 43.4% |
| 40 | GPT-5.1 | OpenAI | Closed | 43.3% |
| 41 | GPT-5 (high) | OpenAI | Closed | 42.9% |
| 42 | Gemini 2.5 Pro | Google | Closed | 42.8% |
| 43 | MiMo-V2-Pro | Xiaomi | Closed | 42.5% |
| 44 | Qwen3.5 397B (Reasoning) | Alibaba | Open | 42.0% |
| 45 | DeepSeek V4 Flash (High) | DeepSeek | Open | 42.0% |
| 46 | Qwen3.5-122B-A10B | Alibaba | Open | 42.0% |
| 47 | Gemini 3.1 Flash-Lite | Google | Closed | 41.9% |
| 48 | Hy3 Preview | Tencent | Open | 41.2% |
| 49 | GPT-5 (medium) | OpenAI | Closed | 41.1% |
| 50 | Qwen3.5 397B | Alibaba | Open | 41.1% |
| 51 | o3 | OpenAI | Closed | 41.0% |
| 52 | Claude 4.1 Opus Thinking | Anthropic | Closed | 40.9% |
| 53 | Qwen3.6 Plus | Alibaba | Closed | 40.7% |
| 54 | GPT-4.1 mini | OpenAI | Closed | 40.4% |
| 55 | DeepSeek-R1 | DeepSeek | Open | 40.3% |
| 56 | GPT-5.1-Codex-Max | OpenAI | Closed | 40.2% |
| 57 | GPT-5.1-Codex | OpenAI | Closed | 40.2% |
| 58 | Gemma 4 26B A4B | Google | Open | 40.0% |
| 59 | o3-mini | OpenAI | Closed | 39.9% |
| 60 | Qwen3.6-27B | Alibaba | Open | 39.8% |
| 61 | Mistral Medium 3.5 128B | Mistral | Open | 39.6% |
| 62 | Qwen3.5-27B | Alibaba | Open | 39.5% |
| 63 | DeepSeek V3.1 (Reasoning) | DeepSeek | Open | 39.1% |
| 64 | GPT-OSS 120B | OpenAI | Open | 38.9% |
| 65 | DeepSeek V3.2 | DeepSeek | Open | 38.7% |
| 66 | Qwen3 Max | Alibaba | Closed | 38.3% |
| 67 | GPT-4.1 | OpenAI | Closed | 38.1% |
| 68 | Mistral Small 4 (Reasoning) | Mistral | Open | 38.0% |
| 69 | Mistral Small 4 | Mistral | Open | 38.0% |
| 70 | Command A+ | Cohere | Open | 37.8% |
| 71 | Qwen3.5-35B-A3B | Alibaba | Open | 37.7% |
| 72 | DeepSeek R1 Distill Qwen 32B | DeepSeek | Open | 37.6% |
| 73 | Claude 4 Sonnet | Anthropic | Closed | 37.3% |
| 74 | DeepSeek V3.1 | DeepSeek | Open | 36.7% |
| 75 | MiMo-V2-Omni | Xiaomi | Closed | 36.7% |
| 76 | Mistral Large 3 | Mistral | Closed | 36.2% |
| 77 | Grok Code Fast 1 | xAI | Closed | 36.2% |
| 78 | Trinity-Large-Preview | Arcee AI | Open | 36.1% |
| 79 | Trinity-Large-Thinking | Arcee AI | Open | 36.1% |
| 80 | Qwen3.6-35B-A3B | Alibaba | Open | 35.8% |
| 81 | o1 | OpenAI | Closed | 35.8% |
| 82 | K-Exaone | LG AI Research | Closed | 35.6% |
| 83 | DeepSeek V3 | DeepSeek | Open | 35.4% |
| 84 | Nemotron Ultra 253B | NVIDIA | Open | 34.7% |
| 85 | Kimi K2 | Moonshot AI | Closed | 34.5% |
| 86 | GPT-OSS 20B | OpenAI | Open | 34.4% |
| 87 | GPT-4o | OpenAI | Closed | 33.3% |
| 88 | Llama 4 Maverick | Meta | Open | 33.1% |
| 89 | Mistral Medium 3 | Mistral | Closed | 33.1% |
| 90 | GLM-4.6 | Z.AI | Open | 33.1% |
| 91 | GPT-4 Turbo | OpenAI | Closed | 31.9% |
| 92 | GLM-4.5-Air | Z.AI | Closed | 30.6% |
| 93 | Llama 3.1 405B | Meta | Open | 29.9% |
| 94 | Grok 4.1 Fast | xAI | Closed | 29.6% |
| 95 | Gemini 1.5 Pro | Google | Closed | 29.5% |
| 96 | Mistral Large 2 | Mistral | Closed | 29.2% |
| 97 | Gemini 2.5 Flash | Google | Closed | 29.1% |
| 98 | Nemotron 3 Nano Omni 30B A3B | NVIDIA | Open | 27.8% |
| 99 | Ling 2.6 Flash | InclusionAI | Open | 27.1% |
| 100 | Qwen2.5 Coder 32B Instruct | Alibaba | Open | 27.1% |
| 101 | Sarvam 105B | Sarvam | Open | 26.4% |
| 102 | Phi-4 | Microsoft | Open | 26.0% |
| 103 | MiMo-V2-Flash | Xiaomi | Open | 25.9% |
| 104 | GPT-4.1 nano | OpenAI | Closed | 25.9% |
| 105 | Exaone 4.0 32B | LG AI Research | Open | 25.2% |
| 106 | Solar Pro 2 | Upstage | Closed | 24.8% |
| 107 | Gemma 4 E4B | Google | Open | 24.4% |
| 108 | Claude 3 Opus | Anthropic | Closed | 23.3% |
| 109 | Nemotron 3 Nano 30B | NVIDIA | Open | 23.0% |
| 110 | GPT-4o mini | OpenAI | Closed | 22.9% |
| 111 | Gemma 3 27B | Google | Open | 21.2% |
| 112 | Gemma 4 E2B | Google | Open | 20.9% |
| 113 | Nova Pro | Amazon | Closed | 20.8% |
| 114 | Sarvam 30B | Sarvam | Open | 19.2% |
| 115 | Claude 3 Haiku | Anthropic | Closed | 18.6% |
| 116 | Llama 4 Scout | Meta | Open | 17.0% |
| 117 | Gemini 1.0 Pro | Google | Closed | 11.7% |
| 118 | Granite-4.0-1B | IBM | Open | 8.7% |
| 119 | Granite-4.0-H-1B | IBM | Open | 8.2% |
| 120 | Exaone 4.0 1.2B | LG AI Research | Open | 7.4% |
| 121 | Granite-4.0-H-350M | IBM | Open | 1.7% |
| 122 | Granite-4.0-350M | IBM | Open | 0.9% |