BenchLM Benchmarks
165 benchmarks · 3361 model scores · Data from Jun 2, 2026
Coding · 23 benchmarks
4 models
49 models
13 models
14 models
8 models
6 models
35 models
20 models
7 models
8 models
8 models
16 models
5 models
9 models
119 models
122 models
115 models
Agentic · 32 benchmarks
24 models
24 models
6 models
113 models
114 models
113 models
20 models
54 models
21 models
10 models
23 models
21 models
116 models
9 models
46 models
25 models
10 models
14 models
5 models
22 models
9 models
4 models
9 models
8 models
6 models
5 models
7 models
Reasoning · 19 benchmarks
3 models
62 models
41 models
10 models
7 models
6 models
11 models
4 models
29 models
115 models
116 models
63 models
Knowledge · 27 benchmarks
8 models
54 models
18 models
36 models
36 models
126 models
122 models
122 models
114 models
114 models
113 models
8 models
6 models
5 models
5 models
16 models
6 models
8 models
4 models
Multimodal · 35 benchmarks
9 models
28 models
68 models
5 models
5 models
3 models
4 models
9 models
4 models
6 models
8 models
4 models
14 models
5 models
7 models
11 models
22 models
3 models
Math · 18 benchmarks
9 models
6 models
13 models
7 models
8 models
18 models
8 models
8 models
6 models
9 models
4 models
3 models
Multilingual · 6 benchmarks
10 models
Instruction Following · 4 benchmarks
19 models
11 models
116 models