context.vn

lisan Bench

62 models evaluated

| # | Model | Model ID | Provider | Type | Score |
|---|---|---|---|---|---|
| 4 | GPT 5.4 (medium) | openai/gpt-5.4:thinking-medium | OpenAI | Closed | |
| 6 | Gemini 3.1 Pro Preview (high) | google/gemini-3.1-pro-preview:thinking-high | Google | Closed | |
| 7 | Grok 4 (medium) | x-ai/grok-4:thinking-medium | xAI | Closed | |
| 8 | Grok 4.20 Beta (thinking) | x-ai/grok-4.20-beta:thinking | xAI | Closed | |
| 9 | GPT 5 (medium) | openai/gpt-5 | OpenAI | Closed | |
| 11 | O3 (medium) | openai/o3:thinking-medium | OpenAI | Closed | |
| 12 | GPT 5.2 (medium) | openai/gpt-5.2:thinking-medium | OpenAI | Closed | |
| 13 | Gemini 3 Pro Preview (high) | google/gemini-3-pro-preview | Google | Closed | |
| 15 | Deepseek V3.2 (thinking) | deepseek/deepseek-v3.2:thinking | DeepSeek | Open | |
| 17 | Step 3.5 Flash (thinking) | zenmux/step-3.5-flash:thinking | StepFun | Open | |
| 19 | GPT 5 Mini (medium) | openai/gpt-5-mini | OpenAI | Closed | |
| 20 | Kimi K2.5 (thinking) | moonshotai/kimi-k2.5:thinking | Moonshot AI | Closed | |
| 21 | Grok 4.1 Fast (thinking) | x-ai/grok-4.1-fast:thinking | xAI | Closed | |
| 22 | Gemini 3 Flash Preview (high) | google/gemini-3-flash-preview | Google | Closed | |
| 23 | GPT 5 Nano (medium) | openai/gpt-5-nano | OpenAI | Closed | |
| 24 | Kimi K2 (thinking) | moonshotai/kimi-k2-thinking | Moonshot AI | Closed | |
| 25 | GPT 5.4 Mini (medium) | openai/gpt-5.4-mini:thinking-medium | OpenAI | Closed | |
| 27 | GPT 5.4 Nano (medium) | openai/gpt-5.4-nano:thinking-medium | OpenAI | Closed | |
| 28 | O3 Mini (medium) | openai/o3-mini | OpenAI | Closed | |
| 30 | GPT-OSS-120B (medium) | openai/gpt-oss-120b | OpenAI | Open | |
| 31 | Qwen3.5 397B A17B (thinking) | qwen/qwen3.5-397b-a17b:thinking | Alibaba | Open | |
| 32 | GLM 5 (thinking) | z-ai/glm-5:thinking | Z.AI | Open | |
| 33 | O4 Mini (medium) | openai/o4-mini | OpenAI | Closed | |
| 37 | Qwen3 235B A22B 2507 (thinking) | qwen/qwen3-235b-a22b-thinking-2507 | Alibaba | Open | |
| 39 | Minimax M2.5 (thinking) | minimax/minimax-m2.5:thinking | MiniMax | Closed | |
| 40 | Opus 4.1 | anthropic/claude-opus-4.1 | Anthropic | Closed | |
| 41 | Sonnet 4.6 | anthropic/claude-sonnet-4.6 | Anthropic | Closed | |
| 42 | Gemini 2.5 Pro (16k) | google/gemini-2.5-pro:thinking-16k | Google | Closed | |
| 43 | Grok 3 Mini (medium) | x-ai/grok-3-mini:thinking-medium | xAI | Closed | |
| 44 | Grok 3 (thinking) | x-ai/grok-3 | xAI | Closed | |
| 46 | GPT-OSS-20B (medium) | openai/gpt-oss-20b | OpenAI | Open | |
| 47 | Sonnet 4 | anthropic/claude-sonnet-4 | Anthropic | Closed | |
| 49 | Sonnet 3.6 | anthropic/claude-3.5-sonnet | Anthropic | Closed | |
| 50 | Deepseek V3.2 | deepseek/deepseek-v3.2 | DeepSeek | Open | |
| 51 | Sonnet 4.5 | anthropic/claude-sonnet-4.5 | Anthropic | Closed | |
| 54 | Gemini Pro 1.5 | google/gemini-pro-1.5 | Google | Closed | |
| 56 | Qwen3.5 122B A10B (thinking) | qwen/qwen3.5-122b-a10b:thinking | Alibaba | Open | |
| 57 | Deepseek V3 | deepseek/deepseek-chat | DeepSeek | Open | |
| 59 | GLM 4.5 (thinking) | z-ai/glm-4.5 | Z.AI | Closed | |
| 60 | Qwen3.5 35B A3B (thinking) | qwen/qwen3.5-35b-a3b:thinking | Alibaba | Open | |
| 62 | Opus 4.5 | anthropic/claude-opus-4.5 | Anthropic | Closed | |
| 63 | GPT 4o | openai/chatgpt-4o-latest | OpenAI | Closed | |
| 64 | Opus 4.6 | anthropic/claude-opus-4.6 | Anthropic | Closed | |
| 65 | Deepseek R1 0528 (thinking) | deepseek/deepseek-r1-0528 | DeepSeek | Open | |
| 66 | GPT 4 Turbo | openai/gpt-4-turbo | OpenAI | Closed | |
| 67 | Opus 3 | anthropic/claude-3-opus | Anthropic | Closed | |
| 70 | Gemini 2.5 Flash | google/gemini-2.5-flash | Google | Closed | |
| 71 | Minimax M1 (thinking) | minimax/minimax-m1 | MiniMax | Closed | |
| 73 | Haiku 4.5 | anthropic/claude-haiku-4.5 | Anthropic | Closed | |
| 76 | Nova Pro V1 | amazon/nova-pro-v1 | Amazon | Closed | |
| 77 | GLM 4.7 (thinking) | z-ai/glm-4.7 | Z.AI | Open | |
| 79 | GLM 4.5 Air (thinking) | z-ai/glm-4.5-air | Z.AI | Closed | |
| 84 | GLM 4.6 (thinking) | z-ai/glm-4.6 | Z.AI | Open | |
| 86 | Llama 4 Maverick | meta-llama/llama-4-maverick | Meta | Open | |
| 88 | Mistral Medium 3 | mistralai/mistral-medium-3 | Mistral | Closed | |
| 91 | GPT 4.1 | openai/gpt-4.1 | OpenAI | Closed | |
| 99 | Llama 4 Scout | meta-llama/llama-4-scout | Meta | Open | |
| 101 | GPT 4.1 Mini | openai/gpt-4.1-mini | OpenAI | Closed | |
| 102 | Haiku 3 | anthropic/claude-3-haiku | Anthropic | Closed | |
| 106 | Mimo V2 Flash (thinking) | zenmux/mimo-v2-flash:thinking | Xiaomi | Open | |
| 111 | GPT 4o Mini | openai/gpt-4o-mini | OpenAI | Closed | |
| 115 | GPT 4.1 Nano | openai/gpt-4.1-nano | OpenAI | Closed | |