context.vn

MGSM

2 models evaluated

#  Model                   Provider  Type  Score
1  DeepSeek V4 Flash Base  DeepSeek  Open  85.7%
2  DeepSeek V4 Pro Base    DeepSeek  Open  84.4%