context.vn

HLE With Tools

6 models evaluated

#  Model                     Provider  Type    Score
1  Qwen3.7 Max               Alibaba   Closed  53.5%
2  DeepSeek V4 Pro (Max)     DeepSeek  Open    48.2%
3  Step 3.7 Flash            StepFun   Open    47.2%
4  DeepSeek V4 Flash (Max)   DeepSeek  Open    45.1%
5  DeepSeek V4 Pro (High)    DeepSeek  Open    44.7%
6  DeepSeek V4 Flash (High)  DeepSeek  Open    40.3%