context.vn
BenchLM Benchmarks
165 benchmarks · 12 model scores · Data from Jun 2, 2026
External
1 benchmark
deep Swe
12 models
Rank  Model                     Provider   Score
1     gpt-5.5[xhigh]            OpenAI     70%
2     gpt-5.4[xhigh]            OpenAI     56%
3     claude-opus-4.7[max]      Anthropic  54%
4     claude-sonnet-4.6[high]   Anthropic  32%
5     gemini-3.5-flash[medium]  Google     28%
+7 more