context.vn

MMLU

8 models evaluated

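| # | Model | Provider | Type | Score |
|---|-------|----------|------|-------|
| 1 | o1 | OpenAI | Closed | 91.8% |
| 2 | GPT-4.1 | OpenAI | Closed | 90.2% |
| 3 | DeepSeek V4 Pro Base | DeepSeek | Open | 90.1% |
| 4 | DeepSeek V4 Flash Base | DeepSeek | Open | 88.7% |
| 5 | GPT-4.1 mini | OpenAI | Closed | 87.5% |
| 6 | Trinity-Large-Preview | Arcee AI | Open | 87.2% |
| 7 | o3-mini | OpenAI | Closed | 86.9% |
| 8 | GPT-4.1 nano | OpenAI | Closed | 80.1% |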