context.vn

gert Labs

54 models evaluated

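| # | Model | Provider | Type | Score |
|---|-------|----------|------|-------|
| 1 | Claude Opus 4.8 | Anthropic | Closed | 72.97% |
| 2 | GPT-5.5 | OpenAI | Closed | 72.93% |
| 3 | Claude Opus 4.7 | Anthropic | Closed | 65.59% |
| 4 | GPT-5.4 | OpenAI | Closed | 64.89% |
| 5 | Qwen3.7 Max | Alibaba | Closed | 64.27% |
| 6 | Claude Opus 4.5 | Anthropic | Closed | 64.23% |
| 7 | Gemini 3 Pro | Google | Closed | 63.23% |
| 8 | Claude Sonnet 4.6 | Anthropic | Closed | 62.92% |
| 9 | MiMo-V2.5-Pro | Xiaomi | Closed | 62.70% |
| 10 | Claude Opus 4.6 | Anthropic | Closed | 61.85% |
| 11 | Gemini 3.5 Flash | Google | Closed | 61.85% |
| 12 | GLM-5.1 | Z.AI | Open | 60.11% |
| 13 | GPT-5.3 Codex | OpenAI | Closed | 57.47% |
| 14 | Gemini 3.1 Pro | Google | Closed | 56.87% |
| 15 | Kimi K2.6 | Moonshot AI | Open | 56.82% |
| 16 | Gemini 3 Flash | Google | Closed | 56.63% |
| 17 | Qwen3.6-27B | Alibaba | Open | 54.84% |
| 18 | DeepSeek V4 Flash | DeepSeek | Open | 54.35% |
| 19 | GPT-5.2-Codex | OpenAI | Closed | 51.79% |
| 20 | Step 3.7 Flash | StepFun | Open | 51.57% |
| 21 | GLM-5 | Z.AI | Open | 50.99% |
| 22 | Qwen3.6 Plus | Alibaba | Closed | 50.60% |
| 23 | DeepSeek V4 Pro | DeepSeek | Open | 50.28% |
| 24 | GPT-5.1-Codex | OpenAI | Closed | 49.68% |
| 25 | Grok Build 0.1 | xAI | Closed | 49.15% |
| 26 | Claude Sonnet 4.5 | Anthropic | Closed | 48.51% |
| 27 | Grok 4.1 Fast | xAI | Closed | 47.32% |
| 28 | MiMo-V2.5 | Xiaomi | Closed | 46.89% |
| 29 | Qwen3.5 397B | Alibaba | Open | 46.76% |
| 30 | GPT-5.2 | OpenAI | Closed | 46.54% |
| 31 | Kimi K2.5 | Moonshot AI | Open | 45.88% |
| 32 | Grok 4.3 | xAI | Closed | 43.86% |
| 33 | Qwen3 Max | Alibaba | Closed | 43.74% |
| 34 | Qwen3.6-35B-A3B | Alibaba | Open | 42.65% |
| 35 | Grok 4 | xAI | Closed | 42.34% |
| 36 | Gemini 2.5 Pro | Google | Closed | 42.01% |
| 37 | GPT-5.1 | OpenAI | Closed | 41.24% |
| 38 | MiniMax M2.7 | MiniMax | Open | 40.40% |
| 39 | GLM-4.7 | Z.AI | Open | 39.95% |
| 40 | Claude 4 Sonnet | Anthropic | Closed | 39.66% |
| 41 | Qwen3.5-27B | Alibaba | Open | 39.41% |
| 42 | Mistral Medium 3.5 128B | Mistral | Open | 39.10% |
| 43 | Gemini 3.1 Flash-Lite | Google | Closed | 38.46% |
| 44 | Grok 4.20 | xAI | Closed | 38.36% |
| 45 | Hy3 Preview | Tencent | Open | 36.91% |
| 46 | MiMo-V2-Pro | Xiaomi | Closed | 36.68% |
| 47 | Gemma 4 31B | Google | Open | 35.26% |
| 48 | Kimi K2.5 (Reasoning) | Moonshot AI | Closed | 32.58% |
| 49 | Trinity-Large-Thinking | Arcee AI | Open | 32.55% |
| 50 | GLM-5V-Turbo | Z.AI | Closed | 30.76% |
| 51 | GPT-OSS 120B | OpenAI | Open | 29.61% |
| 52 | DeepSeek V3.2 | DeepSeek | Open | 29.57% |
| 53 | Qwen3.5-35B-A3B | Alibaba | Open | 28.96% |
| 54 | GPT-4.1 | OpenAI | Closed | 25.65% |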