context.vn

BenchLM Benchmarks

165 benchmarks · 12 model scores · Data from Jun 2, 2026

External · 1 benchmark

deep Swe

12 models

1. gpt-5.5 [xhigh] · OpenAI · 70%
2. gpt-5.4 [xhigh] · OpenAI · 56%
3. claude-opus-4.7 [max] · Anthropic · 54%
4. claude-sonnet-4.6 [high] · Anthropic · 32%
5. gemini-3.5-flash [medium] · Google · 28%
+7 more