context.vn

BrowseComp

24 models evaluated

#  | Model                      | Provider    | Type   | Score
1  | GPT-5.5 Pro                | OpenAI      | Closed | 90.1%
2  | GPT-5.4 Pro                | OpenAI      | Closed | 89.3%
3  | Claude Mythos Preview      | Anthropic   | Closed | 86.9%
4  | GPT-5.5                    | OpenAI      | Closed | 84.4%
5  | Claude Opus 4.8            | Anthropic   | Closed | 84.3%
6  | Claude Opus 4.6            | Anthropic   | Closed | 83.7%
7  | MiniMax M3                 | MiniMax     | Open   | 83.5%
8  | DeepSeek V4 Pro (Max)      | DeepSeek    | Open   | 83.4%
9  | Kimi K2.6                  | Moonshot AI | Open   | 83.2%
10 | GPT-5.4                    | OpenAI      | Closed | 82.7%
11 | DeepSeek V4 Pro (High)     | DeepSeek    | Open   | 80.4%
12 | Claude Opus 4.7 (Adaptive) | Anthropic   | Closed | 79.3%
13 | Step 3.7 Flash             | StepFun     | Open   | 75.8%
14 | DeepSeek V4 Flash (Max)    | DeepSeek    | Open   | 73.2%
15 | GLM-5.1                    | Z.AI        | Open   | 68%
16 | GPT-5.2                    | OpenAI      | Closed | 65.8%
17 | Qwen3.5-122B-A10B          | Alibaba     | Open   | 63.8%
18 | Qwen3.5 397B               | Alibaba     | Open   | 62%
19 | Qwen3.5-27B                | Alibaba     | Open   | 61%
20 | Qwen3.5-35B-A3B            | Alibaba     | Open   | 61%
21 | Kimi K2.5 (Reasoning)      | Moonshot AI | Closed | 60.6%
22 | Kimi K2.5                  | Moonshot AI | Open   | 60.6%
23 | DeepSeek V4 Flash (High)   | DeepSeek    | Open   | 53.5%
24 | GLM-4.7                    | Z.AI        | Open   | 52%