context.vn

CritPt

116 models evaluated

# | Model | Provider | Type | Score
1 | GPT-5.4 Pro | OpenAI | Closed | 30.0%
2 | GPT-5.5 | OpenAI | Closed | 27.1%
3 | Gemini 3 Pro Deep Think | Google | Closed | 25.7%
4 | GPT-5.4 | OpenAI | Closed | 23.4%
5 | Gemini 3.1 Pro | Google | Closed | 17.7%
6 | GPT-5.3 Codex | OpenAI | Closed | 16.9%
7 | Qwen3.7 Max | Alibaba | Closed | 13.4%
8 | Gemini 3.5 Flash | Google | Closed | 13.1%
9 | DeepSeek V4 Pro (Max) | DeepSeek | Open | 12.9%
10 | Claude Opus 4.6 (Adaptive) | Anthropic | Closed | 12.6%
11 | Claude Opus 4.7 (Adaptive) | Anthropic | Closed | 12.0%
12 | GPT-5.2 | OpenAI | Closed | 11.6%
13 | Muse Spark | Meta | Closed | 11.3%
14 | DeepSeek V4 Pro (High) | DeepSeek | Open | 10.0%
15 | GPT-5.4 mini | OpenAI | Closed | 10.0%
16 | GPT-5.4 nano | OpenAI | Closed | 9.3%
17 | Gemini 3 Pro | Google | Closed | 9.1%
18 | GPT-5.2-Codex | OpenAI | Closed | 8.7%
19 | Kimi K2.6 | Moonshot AI | Open | 8.0%
20 | Grok 4.3 | xAI | Closed | 8.0%
21 | DeepSeek V4 Flash (Max) | DeepSeek | Open | 7.1%
22 | GPT-5 (high) | OpenAI | Closed | 5.7%
23 | GPT-5.1-Codex-Max | OpenAI | Closed | 5.7%
24 | GPT-5.1-Codex | OpenAI | Closed | 5.7%
25 | Claude Opus 4.7 | Anthropic | Closed | 5.1%
26 | GPT-5.1 | OpenAI | Closed | 4.9%
27 | GLM-5.1 | Z.AI | Open | 4.6%
28 | Hy3 Preview | Tencent | Open | 4.6%
29 | Claude Opus 4.5 Thinking | Anthropic | Closed | 4.6%
30 | MiMo-V2.5-Pro | Xiaomi | Closed | 4.0%
31 | Qwen 3.6 Max (preview) | Alibaba | Closed | 3.7%
32 | DeepSeek V4 Flash (High) | DeepSeek | Open | 3.4%
33 | Kimi K2.5 (Reasoning) | Moonshot AI | Closed | 3.1%
34 | Kimi K2.5 | Moonshot AI | Open | 3.1%
35 | Qwen3.6 Plus | Alibaba | Closed | 2.9%
36 | Grok 4.1 Fast (Reasoning) | xAI | Closed | 2.9%
37 | Grok 4 Fast (Reasoning) | xAI | Closed | 2.9%
38 | Claude Opus 4.6 | Anthropic | Closed | 2.8%
39 | Gemini 2.5 Pro | Google | Closed | 2.6%
40 | GLM-5 | Z.AI | Open | 2.0%
41 | Grok 4 | xAI | Closed | 2.0%
42 | DeepSeek V3.1 (Reasoning) | DeepSeek | Open | 2.0%
43 | Qwen3.5 397B (Reasoning) | Alibaba | Open | 1.7%
44 | GLM-4.7 | Z.AI | Open | 1.7%
45 | Gemini 3 Flash | Google | Closed | 1.4%
46 | Gemini 2.5 Flash | Google | Closed | 1.4%
47 | DeepSeek-R1 | DeepSeek | Open | 1.4%
48 | GPT-OSS 20B | OpenAI | Open | 1.4%
49 | Gemma 4 31B | Google | Open | 1.4%
50 | Qwen3.6-27B | Alibaba | Open | 1.1%
51 | o3 | OpenAI | Closed | 1.1%
52 | Claude 4 Sonnet | Anthropic | Closed | 1.1%
53 | Gemini 3.1 Flash-Lite | Google | Closed | 1.1%
54 | GPT-OSS 120B | OpenAI | Open | 1.1%
55 | MiMo-V2-Omni | Xiaomi | Closed | 1.1%
56 | K-Exaone | LG AI Research | Closed | 1.1%
57 | Claude Sonnet 4.6 | Anthropic | Closed | 0.9%
58 | Qwen3.5 397B | Alibaba | Open | 0.9%
59 | Qwen3.5-27B | Alibaba | Open | 0.9%
60 | DeepSeek V3.2 | DeepSeek | Open | 0.9%
61 | Qwen3.5-35B-A3B | Alibaba | Open | 0.9%
62 | Trinity-Large-Preview | Arcee AI | Open | 0.9%
63 | Trinity-Large-Thinking | Arcee AI | Open | 0.9%
64 | Qwen3.5-122B-A10B | Alibaba | Open | 0.6%
65 | MiniMax M2.7 | MiniMax | Open | 0.6%
66 | GLM-5V-Turbo | Z.AI | Closed | 0.6%
67 | Gemma 4 E4B | Google | Open | 0.6%
68 | Claude Opus 4.5 | Anthropic | Closed | 0.3%
69 | Qwen3.6-35B-A3B | Alibaba | Open | 0.3%
70 | o1 | OpenAI | Closed | 0.3%
71 | MiMo-V2-Pro | Xiaomi | Closed | 0.3%
72 | Mistral Small 4 (Reasoning) | Mistral | Open | 0.3%
73 | Mistral Small 4 | Mistral | Open | 0.3%
74 | Sarvam 30B | Sarvam | Open | 0.3%
75 | Command A+ | Cohere | Open | 0.3%
76 | GLM-5-Turbo | Z.AI | Closed | 0.3%
77 | GPT-5 (medium) | OpenAI | Closed | 0.0%
78 | Grok 4.1 Fast | xAI | Closed | 0.0%
79 | MiMo-V2-Flash | Xiaomi | Open | 0.0%
80 | GPT-4.1 | OpenAI | Closed | 0.0%
81 | Mistral Large 3 | Mistral | Closed | 0.0%
82 | GPT-4.1 mini | OpenAI | Closed | 0.0%
83 | Claude 4.1 Opus Thinking | Anthropic | Closed | 0.0%
84 | GPT-4o | OpenAI | Closed | 0.0%
85 | Llama 3.1 405B | Meta | Open | 0.0%
86 | Kimi K2 | Moonshot AI | Closed | 0.0%
87 | Grok Code Fast 1 | xAI | Closed | 0.0%
88 | Sarvam 105B | Sarvam | Open | 0.0%
89 | Mistral Large 2 | Mistral | Closed | 0.0%
90 | DeepSeek V3 | DeepSeek | Open | 0.0%
91 | Phi-4 | Microsoft | Open | 0.0%
92 | GPT-4.1 nano | OpenAI | Closed | 0.0%
93 | DeepSeek V3.1 | DeepSeek | Open | 0.0%
94 | Nemotron 3 Nano 30B | NVIDIA | Open | 0.0%
95 | Claude 3 Haiku | Anthropic | Closed | 0.0%
96 | Llama 4 Scout | Meta | Open | 0.0%
97 | Nemotron Ultra 253B | NVIDIA | Open | 0.0%
98 | GLM-4.5-Air | Z.AI | Closed | 0.0%
99 | Gemma 3 27B | Google | Open | 0.0%
100 | Llama 4 Maverick | Meta | Open | 0.0%
101 | Nova Pro | Amazon | Closed | 0.0%
102 | Mistral Medium 3.5 128B | Mistral | Open | 0.0%
103 | Exaone 4.0 32B | LG AI Research | Open | 0.0%
104 | Gemma 4 26B A4B | Google | Open | 0.0%
105 | Nemotron 3 Nano Omni 30B A3B | NVIDIA | Open | 0.0%
106 | Mistral Medium 3 | Mistral | Closed | 0.0%
107 | Ling 2.6 Flash | InclusionAI | Open | 0.0%
108 | Gemma 4 E2B | Google | Open | 0.0%
109 | GLM-4.6 | Z.AI | Open | 0.0%
110 | Qwen3 Max | Alibaba | Closed | 0.0%
111 | Granite-4.0-1B | IBM | Open | 0.0%
112 | Granite-4.0-H-1B | IBM | Open | 0.0%
113 | Solar Pro 2 | Upstage | Closed | 0.0%
114 | Exaone 4.0 1.2B | LG AI Research | Open | 0.0%
115 | Granite-4.0-350M | IBM | Open | 0.0%
116 | Granite-4.0-H-350M | IBM | Open | 0.0%