context.vn

Omniscience Hallucination Rate

113 models evaluated

| # | Model | Provider | Type | Score |
|---|-------|----------|------|-------|
| 1 | Command A+ | Cohere | Open | 14.1% |
| 2 | Qwen3.7 Max | Alibaba | Closed | 22.9% |
| 3 | MiMo-V2.5-Pro | Xiaomi | Closed | 24.5% |
| 4 | Grok 4.3 | xAI | Closed | 25.0% |
| 5 | GLM-5.1 | Z.AI | Open | 29.4% |
| 6 | MiMo-V2-Pro | Xiaomi | Closed | 29.9% |
| 7 | Gemma 4 E4B | Google | Open | 31.3% |
| 8 | Qwen3.6 Plus | Alibaba | Closed | 32.0% |
| 9 | Gemma 4 E2B | Google | Open | 32.9% |
| 10 | GLM-5 | Z.AI | Open | 34.0% |
| 11 | MiniMax M2.7 | MiniMax | Open | 34.4% |
| 12 | Claude Opus 4.7 (Adaptive) | Anthropic | Closed | 36.2% |
| 13 | GPT-4o | OpenAI | Closed | 37.9% |
| 14 | Kimi K2.6 | Moonshot AI | Open | 39.3% |
| 15 | Claude 4 Sonnet | Anthropic | Closed | 40.8% |
| 16 | Qwen 3.6 Max (preview) | Alibaba | Closed | 44.2% |
| 17 | MiMo-V2-Omni | Xiaomi | Closed | 44.4% |
| 18 | Qwen3.6-27B | Alibaba | Open | 48.3% |
| 19 | Qwen3.6-35B-A3B | Alibaba | Open | 49.7% |
| 20 | Gemini 3.1 Pro | Google | Closed | 49.9% |
| 21 | Llama 3.1 405B | Meta | Open | 51.0% |
| 22 | GPT-5.1 | OpenAI | Closed | 51.3% |
| 23 | Claude Opus 4.7 | Anthropic | Closed | 51.9% |
| 24 | Claude Opus 4.5 Thinking | Anthropic | Closed | 59.8% |
| 25 | Gemini 3.5 Flash | Google | Closed | 60.7% |
| 26 | Mistral Medium 3 | Mistral | Closed | 60.9% |
| 27 | Claude Opus 4.6 (Adaptive) | Anthropic | Closed | 61.3% |
| 28 | GLM-5-Turbo | Z.AI | Closed | 62.2% |
| 29 | Grok 4 | xAI | Closed | 64.2% |
| 30 | Kimi K2.5 (Reasoning) | Moonshot AI | Closed | 64.6% |
| 31 | Kimi K2.5 | Moonshot AI | Open | 64.6% |
| 32 | Claude Sonnet 4.6 | Anthropic | Closed | 65.9% |
| 33 | Grok 4 Fast (Reasoning) | xAI | Closed | 66.0% |
| 34 | GLM-4.6 | Z.AI | Open | 66.1% |
| 35 | Mistral Small 4 (Reasoning) | Mistral | Open | 66.8% |
| 36 | Mistral Small 4 | Mistral | Open | 66.8% |
| 37 | Mistral Large 2 | Mistral | Closed | 67.8% |
| 38 | GLM-5V-Turbo | Z.AI | Closed | 67.9% |
| 39 | o1 | OpenAI | Closed | 69.3% |
| 40 | Grok 4.1 Fast (Reasoning) | xAI | Closed | 72.4% |
| 41 | GPT-5.2-Codex | OpenAI | Closed | 72.8% |
| 42 | Muse Spark | Meta | Closed | 73.2% |
| 43 | GPT-5.4 nano | OpenAI | Closed | 73.6% |
| 44 | Kimi K2 | Moonshot AI | Closed | 74.2% |
| 45 | GPT-5.1-Codex-Max | OpenAI | Closed | 74.4% |
| 46 | GPT-5.1-Codex | OpenAI | Closed | 74.4% |
| 47 | MiMo-V2-Flash | Xiaomi | Open | 75.1% |
| 48 | Claude Opus 4.5 | Anthropic | Closed | 75.4% |
| 49 | Claude Opus 4.6 | Anthropic | Closed | 76.0% |
| 50 | Granite-4.0-350M | IBM | Open | 77.8% |
| 51 | Nova Pro | Amazon | Closed | 77.9% |
| 52 | Claude 3 Haiku | Anthropic | Closed | 78.2% |
| 53 | Llama 4 Scout | Meta | Open | 78.3% |
| 54 | Grok Code Fast 1 | xAI | Closed | 78.5% |
| 55 | GPT-4.1 | OpenAI | Closed | 79.6% |
| 56 | GPT-5.2 | OpenAI | Closed | 79.7% |
| 57 | Qwen3.5-27B | Alibaba | Open | 79.7% |
| 58 | Qwen3.5 397B | Alibaba | Open | 79.8% |
| 59 | GPT-5 (medium) | OpenAI | Closed | 80.1% |
| 60 | DeepSeek V3.1 (Reasoning) | DeepSeek | Open | 80.3% |
| 61 | GPT-4.1 nano | OpenAI | Closed | 80.4% |
| 62 | Phi-4 | Microsoft | Open | 80.5% |
| 63 | Gemma 4 26B A4B | Google | Open | 80.9% |
| 64 | Exaone 4.0 32B | LG AI Research | Open | 81.0% |
| 65 | Gemini 3.1 Flash-Lite | Google | Closed | 81.6% |
| 66 | Gemma 4 31B | Google | Open | 81.6% |
| 67 | Nemotron Ultra 253B | NVIDIA | Open | 81.7% |
| 68 | Grok 4.1 Fast | xAI | Closed | 81.8% |
| 69 | GPT-4.1 mini | OpenAI | Closed | 82.0% |
| 70 | Mistral Medium 3.5 128B | Mistral | Open | 82.0% |
| 71 | GPT-5 (high) | OpenAI | Closed | 82.1% |
| 72 | Nemotron 3 Nano Omni 30B A3B | NVIDIA | Open | 83.1% |
| 73 | Granite-4.0-H-1B | IBM | Open | 83.4% |
| 74 | DeepSeek V3.1 | DeepSeek | Open | 83.5% |
| 75 | Mistral Large 3 | Mistral | Closed | 83.7% |
| 76 | Qwen3.5-35B-A3B | Alibaba | Open | 84.0% |
| 77 | DeepSeek-R1 | DeepSeek | Open | 84.0% |
| 78 | GPT-5.5 | OpenAI | Closed | 85.5% |
| 79 | Qwen3.5-122B-A10B | Alibaba | Open | 85.5% |
| 80 | Trinity-Large-Preview | Arcee AI | Open | 86.6% |
| 81 | Trinity-Large-Thinking | Arcee AI | Open | 86.6% |
| 82 | GPT-5.3 Codex | OpenAI | Closed | 86.9% |
| 83 | Hy3 Preview | Tencent | Open | 86.9% |
| 84 | o3 | OpenAI | Closed | 87.1% |
| 85 | Llama 4 Maverick | Meta | Open | 87.3% |
| 86 | Gemini 2.5 Pro | Google | Closed | 87.4% |
| 87 | GPT-5.4 | OpenAI | Closed | 88.6% |
| 88 | DeepSeek V4 Pro (High) | DeepSeek | Open | 88.6% |
| 89 | Qwen3.5 397B (Reasoning) | Alibaba | Open | 89.1% |
| 90 | K-Exaone | LG AI Research | Closed | 89.1% |
| 91 | DeepSeek V3 | DeepSeek | Open | 89.4% |
| 92 | Qwen3 Max | Alibaba | Closed | 89.4% |
| 93 | Gemma 3 27B | Google | Open | 89.5% |
| 94 | DeepSeek V4 Flash (High) | DeepSeek | Open | 89.7% |
| 95 | GPT-5.4 mini | OpenAI | Closed | 89.8% |
| 96 | Gemini 3 Flash | Google | Closed | 90.2% |
| 97 | GLM-4.7 | Z.AI | Open | 90.3% |
| 98 | Gemini 3 Pro | Google | Closed | 90.9% |
| 99 | Nemotron 3 Nano 30B | NVIDIA | Open | 90.9% |
| 100 | GPT-OSS 120B | OpenAI | Open | 91.2% |
| 101 | Solar Pro 2 | Upstage | Closed | 91.5% |
| 102 | Exaone 4.0 1.2B | LG AI Research | Open | 91.5% |
| 103 | GLM-4.5-Air | Z.AI | Closed | 92.3% |
| 104 | Gemini 2.5 Flash | Google | Closed | 93.3% |
| 105 | DeepSeek V3.2 | DeepSeek | Open | 93.5% |
| 106 | Sarvam 105B | Sarvam | Open | 93.5% |
| 107 | Granite-4.0-1B | IBM | Open | 93.5% |
| 108 | DeepSeek V4 Pro (Max) | DeepSeek | Open | 94.0% |
| 109 | GPT-OSS 20B | OpenAI | Open | 94.1% |
| 110 | Granite-4.0-H-350M | IBM | Open | 94.4% |
| 111 | DeepSeek V4 Flash (Max) | DeepSeek | Open | 95.8% |
| 112 | Ling 2.6 Flash | InclusionAI | Open | 95.8% |
| 113 | Sarvam 30B | Sarvam | Open | 97.0% |