context.vn

AA HLE

122 models evaluated

# | Model | Provider | Type | Score
1 | Gemini 3.1 Pro | Google | Closed | 44.7%
2 | GPT-5.5 | OpenAI | Closed | 44.3%
3 | GPT-5.4 | OpenAI | Closed | 41.6%
4 | Gemini 3.5 Flash | Google | Closed | 41.0%
5 | GPT-5.3 Codex | OpenAI | Closed | 39.9%
6 | Muse Spark | Meta | Closed | 39.9%
7 | Claude Opus 4.7 (Adaptive) | Anthropic | Closed | 39.6%
8 | Qwen3.7 Max | Alibaba | Closed | 38.1%
9 | Gemini 3 Pro | Google | Closed | 37.2%
10 | Claude Opus 4.6 (Adaptive) | Anthropic | Closed | 36.7%
11 | DeepSeek V4 Pro (Max) | DeepSeek | Open | 35.9%
12 | Kimi K2.6 | Moonshot AI | Open | 35.9%
13 | GPT-5.2 | OpenAI | Closed | 35.4%
14 | Grok 4.3 | xAI | Closed | 35.0%
15 | MiMo-V2.5-Pro | Xiaomi | Closed | 33.8%
16 | DeepSeek V4 Pro (High) | DeepSeek | Open | 33.5%
17 | GPT-5.2-Codex | OpenAI | Closed | 33.5%
18 | DeepSeek V4 Flash (Max) | DeepSeek | Open | 32.1%
19 | Claude Opus 4.7 | Anthropic | Closed | 31.2%
20 | Kimi K2.5 (Reasoning) | Moonshot AI | Closed | 29.4%
21 | Kimi K2.5 | Moonshot AI | Open | 29.4%
22 | Qwen 3.6 Max (preview) | Alibaba | Closed | 28.9%
23 | Claude Opus 4.5 Thinking | Anthropic | Closed | 28.4%
24 | MiMo-V2-Pro | Xiaomi | Closed | 28.3%
25 | MiniMax M2.7 | MiniMax | Open | 28.1%
26 | GLM-5.1 | Z.AI | Open | 28.0%
27 | DeepSeek V4 Flash (High) | DeepSeek | Open | 27.8%
28 | Qwen3.5 397B (Reasoning) | Alibaba | Open | 27.3%
29 | GLM-5 | Z.AI | Open | 27.2%
30 | GPT-5.4 mini | OpenAI | Closed | 26.6%
31 | GPT-5.1 | OpenAI | Closed | 26.5%
32 | GPT-5 (high) | OpenAI | Closed | 26.5%
33 | GPT-5.4 nano | OpenAI | Closed | 26.5%
34 | Qwen3.6 Plus | Alibaba | Closed | 25.7%
35 | Hy3 Preview | Tencent | Open | 25.5%
36 | GLM-5-Turbo | Z.AI | Closed | 25.4%
37 | GLM-4.7 | Z.AI | Open | 25.1%
38 | Grok 4 | xAI | Closed | 23.9%
39 | GPT-5 (medium) | OpenAI | Closed | 23.5%
40 | GPT-5.1-Codex-Max | OpenAI | Closed | 23.4%
41 | Qwen3.5-122B-A10B | Alibaba | Open | 23.4%
42 | GPT-5.1-Codex | OpenAI | Closed | 23.4%
43 | Gemma 4 31B | Google | Open | 22.7%
44 | Qwen3.5-27B | Alibaba | Open | 22.2%
45 | Qwen3.6-27B | Alibaba | Open | 21.6%
46 | Gemini 2.5 Pro | Google | Closed | 21.1%
47 | Qwen3.6-35B-A3B | Alibaba | Open | 20.2%
48 | o3 | OpenAI | Closed | 20.0%
49 | MiMo-V2-Omni | Xiaomi | Closed | 19.9%
50 | Qwen3.5-35B-A3B | Alibaba | Open | 19.7%
51 | Qwen3.5 397B | Alibaba | Open | 18.8%
52 | Claude Opus 4.6 | Anthropic | Closed | 18.6%
53 | GPT-OSS 120B | OpenAI | Open | 18.5%
54 | Gemma 4 26B A4B | Google | Open | 18.3%
55 | Grok 4.1 Fast (Reasoning) | xAI | Closed | 17.6%
56 | Grok 4 Fast (Reasoning) | xAI | Closed | 17.0%
57 | Gemini 3.1 Flash-Lite | Google | Closed | 16.2%
58 | GLM-5V-Turbo | Z.AI | Closed | 15.8%
59 | DeepSeek-R1 | DeepSeek | Open | 14.9%
60 | Trinity-Large-Preview | Arcee AI | Open | 14.7%
61 | Trinity-Large-Thinking | Arcee AI | Open | 14.7%
62 | Gemini 3 Flash | Google | Closed | 14.1%
63 | Claude Sonnet 4.6 | Anthropic | Closed | 13.2%
64 | K-Exaone | LG AI Research | Closed | 13.1%
65 | DeepSeek V3.1 (Reasoning) | DeepSeek | Open | 13.0%
66 | Claude Opus 4.5 | Anthropic | Closed | 12.9%
67 | Mistral Medium 3.5 128B | Mistral | Open | 12.8%
68 | Claude 4.1 Opus Thinking | Anthropic | Closed | 11.9%
69 | Command A+ | Cohere | Open | 11.4%
70 | Qwen3 Max | Alibaba | Closed | 11.1%
71 | DeepSeek V3.2 | DeepSeek | Open | 10.5%
72 | Sarvam 105B | Sarvam | Open | 10.1%
73 | GPT-OSS 20B | OpenAI | Open | 9.8%
74 | Mistral Small 4 (Reasoning) | Mistral | Open | 9.5%
75 | Mistral Small 4 | Mistral | Open | 9.5%
76 | o3-mini | OpenAI | Closed | 8.7%
77 | Nemotron Ultra 253B | NVIDIA | Open | 8.1%
78 | MiMo-V2-Flash | Xiaomi | Open | 8.0%
79 | o1 | OpenAI | Closed | 7.7%
80 | Grok Code Fast 1 | xAI | Closed | 7.5%
81 | Kimi K2 | Moonshot AI | Closed | 7.0%
82 | Sarvam 30B | Sarvam | Open | 7.0%
83 | GLM-4.5-Air | Z.AI | Closed | 6.8%
84 | Granite-4.0-H-350M | IBM | Open | 6.4%
85 | DeepSeek V3.1 | DeepSeek | Open | 6.3%
86 | Ling 2.6 Flash | InclusionAI | Open | 6.2%
87 | Exaone 4.0 1.2B | LG AI Research | Open | 5.8%
88 | Granite-4.0-350M | IBM | Open | 5.7%
89 | DeepSeek R1 Distill Qwen 32B | DeepSeek | Open | 5.5%
90 | Nemotron 3 Nano Omni 30B A3B | NVIDIA | Open | 5.3%
91 | GLM-4.6 | Z.AI | Open | 5.2%
92 | Gemini 2.5 Flash | Google | Closed | 5.1%
93 | Granite-4.0-1B | IBM | Open | 5.1%
94 | Grok 4.1 Fast | xAI | Closed | 5.0%
95 | Granite-4.0-H-1B | IBM | Open | 5.0%
96 | Gemini 1.5 Pro | Google | Closed | 4.9%
97 | Exaone 4.0 32B | LG AI Research | Open | 4.9%
98 | Llama 4 Maverick | Meta | Open | 4.8%
99 | Gemma 4 E2B | Google | Open | 4.8%
100 | Gemma 3 27B | Google | Open | 4.7%
101 | GPT-4.1 | OpenAI | Closed | 4.6%
102 | GPT-4.1 mini | OpenAI | Closed | 4.6%
103 | Nemotron 3 Nano 30B | NVIDIA | Open | 4.6%
104 | Gemini 1.0 Pro | Google | Closed | 4.6%
105 | Llama 4 Scout | Meta | Open | 4.3%
106 | Mistral Medium 3 | Mistral | Closed | 4.3%
107 | Llama 3.1 405B | Meta | Open | 4.2%
108 | Mistral Large 3 | Mistral | Closed | 4.1%
109 | Phi-4 | Microsoft | Open | 4.1%
110 | Claude 4 Sonnet | Anthropic | Closed | 4.0%
111 | GPT-4o mini | OpenAI | Closed | 4.0%
112 | Mistral Large 2 | Mistral | Closed | 4.0%
113 | GPT-4.1 nano | OpenAI | Closed | 3.9%
114 | Claude 3 Haiku | Anthropic | Closed | 3.9%
115 | Qwen2.5 Coder 32B Instruct | Alibaba | Open | 3.8%
116 | Solar Pro 2 | Upstage | Closed | 3.8%
117 | Gemma 4 E4B | Google | Open | 3.7%
118 | DeepSeek V3 | DeepSeek | Open | 3.6%
119 | Nova Pro | Amazon | Closed | 3.4%
120 | GPT-4o | OpenAI | Closed | 3.3%
121 | GPT-4 Turbo | OpenAI | Closed | 3.3%
122 | Claude 3 Opus | Anthropic | Closed | 3.1%