context.vn

CyberGym

10 models evaluated

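| # | Model | Provider | Type | Score |
|---|-------|----------|------|-------|
| 1 | Claude Mythos Preview | Anthropic | Closed | 83.1% |
| 2 | GPT-5.5 | OpenAI | Closed | 81.8% |
| 3 | GPT-5.4 | OpenAI | Closed | 79.0% |
| 4 | Claude Opus 4.7 (Adaptive) | Anthropic | Closed | 73.1% |
| 5 | GLM-5.1 | Z.AI | Open | 68.7% |
| 6 | Claude Opus 4.6 | Anthropic | Closed | 66.6% |
| 7 | Claude Sonnet 4.6 | Anthropic | Closed | 65.2% |
| 8 | Claude Opus 4.5 | Anthropic | Closed | 50.6% |
| 9 | Muse Spark | Meta | Closed | 43.5% |
| 10 | GLM-5 | Z.AI | Open | 43.2% |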