Model comparisons
190 matchups across 51 models.
Claude Opus 4.6 (Anthropic) 0-10 Claude Mythos Preview (Anthropic), 10 benchmarks
Claude Opus 4.6 (Anthropic) 4-4 GPT-5.4 (OpenAI), 9 benchmarks
Claude Opus 4.6 (Anthropic) 6-3 Gemini 3.1 Pro (Google), 9 benchmarks
Claude Opus 4.6 (Anthropic) 7-2 Kimi K2.5 (Moonshot AI), 9 benchmarks
GPT-5.4 (OpenAI) 9-0 Kimi K2.5 (Moonshot AI), 9 benchmarks
Gemini 3.1 Pro (Google) 9-0 Kimi K2.5 (Moonshot AI), 9 benchmarks
Kimi K2.5 (Moonshot AI) 8-1 Qwen3.5 27B (Alibaba), 9 benchmarks
Claude Opus 4.6 (Anthropic) 6-2 Qwen3.5 27B (Alibaba), 8 benchmarks
GPT-5.4 (OpenAI) 3-5 Gemini 3.1 Pro (Google), 8 benchmarks
GPT-5.4 (OpenAI) 0-8 Claude Mythos Preview (Anthropic), 8 benchmarks
Claude Opus 4.6 (Anthropic) 7-0 GLM-5 (Zhipu AI), 7 benchmarks
Claude Opus 4.6 (Anthropic) 5-2 Step-3.5-Flash (StepFun), 7 benchmarks
Claude Opus 4.6 (Anthropic) 5-2 Qwen 3.6 Plus (Alibaba), 7 benchmarks
Claude Opus 4.6 (Anthropic) 7-0 Claude Sonnet 4.6 (Anthropic), 7 benchmarks
Gemini 3.1 Pro (Google) 7-0 Qwen3.5 27B (Alibaba), 7 benchmarks
Gemini 3.1 Pro (Google) 5-1 Muse Spark (Meta), 7 benchmarks
Gemini 3.1 Pro (Google) 7-0 GLM-5 (Zhipu AI), 7 benchmarks
Gemini 3.1 Pro (Google) 7-0 Gemma 4 31B (Google), 7 benchmarks
Kimi K2.5 (Moonshot AI) 0-7 Claude Mythos Preview (Anthropic), 7 benchmarks
Kimi K2.5 (Moonshot AI) 3-4 Step-3.5-Flash (StepFun), 7 benchmarks
Kimi K2.5 (Moonshot AI) 1-6 Qwen 3.6 Plus (Alibaba), 7 benchmarks
Kimi K2.5 (Moonshot AI) 2-5 Claude Sonnet 4.6 (Anthropic), 7 benchmarks
Kimi K2.5 (Moonshot AI) 3-4 Qwen 3.5 397B (Alibaba), 7 benchmarks
Qwen3.5 27B (Alibaba) 0-7 Qwen 3.5 397B (Alibaba), 7 benchmarks
Claude Opus 4.6 (Anthropic) 4-2 Gemma 4 31B (Google), 6 benchmarks
Claude Opus 4.6 (Anthropic) 5-1 GLM-5.1 (Zhipu AI), 6 benchmarks
Claude Opus 4.6 (Anthropic) 4-2 Qwen 3.5 397B (Alibaba), 6 benchmarks
Claude Opus 4.6 (Anthropic) 4-2 Grok 4 (xAI), 6 benchmarks
Claude Opus 4.6 (Anthropic) 6-0 GLM-4.7-Flash (Zhipu AI), 6 benchmarks
Claude Opus 4.6 (Anthropic) 6-0 Sarvam 105B (Sarvam AI), 6 benchmarks
GPT-5.4 (OpenAI) 6-0 Qwen3.5 27B (Alibaba), 6 benchmarks
GPT-5.4 (OpenAI) 4-2 Muse Spark (Meta), 6 benchmarks
Gemini 3.1 Pro (Google) 0-6 Claude Mythos Preview (Anthropic), 6 benchmarks
Gemini 3.1 Pro (Google) 6-0 Step-3.5-Flash (StepFun), 6 benchmarks
Gemini 3.1 Pro (Google) 5-1 Qwen 3.6 Plus (Alibaba), 6 benchmarks
Gemini 3.1 Pro (Google) 5-1 GLM-5.1 (Zhipu AI), 6 benchmarks
Gemini 3.1 Pro (Google) 6-0 Claude Sonnet 4.6 (Anthropic), 6 benchmarks
Gemini 3.1 Pro (Google) 6-0 Qwen 3.5 397B (Alibaba), 6 benchmarks
Kimi K2.5 (Moonshot AI) 2-4 GLM-5 (Zhipu AI), 6 benchmarks
Kimi K2.5 (Moonshot AI) 6-0 Gemma 4 31B (Google), 6 benchmarks
Kimi K2.5 (Moonshot AI) 2-4 GLM-5.1 (Zhipu AI), 6 benchmarks
Kimi K2.5 (Moonshot AI) 3-3 Seed 2.0 Pro (ByteDance), 6 benchmarks
Kimi K2.5 (Moonshot AI) 5-1 Seed 2.0 Lite (ByteDance), 6 benchmarks
Kimi K2.5 (Moonshot AI) 6-0 GLM-4.7-Flash (Zhipu AI), 6 benchmarks
Kimi K2.5 (Moonshot AI) 6-0 Sarvam 105B (Sarvam AI), 6 benchmarks
Claude Mythos Preview (Anthropic) 6-0 GLM-5.1 (Zhipu AI), 6 benchmarks
Qwen3.5 27B (Alibaba) 5-1 Gemma 4 31B (Google), 6 benchmarks
Qwen3.5 27B (Alibaba) 2-4 Step-3.5-Flash (StepFun), 6 benchmarks
Qwen3.5 27B (Alibaba) 0-6 Qwen 3.6 Plus (Alibaba), 6 benchmarks
Qwen3.5 27B (Alibaba) 1-5 Claude Sonnet 4.6 (Anthropic), 6 benchmarks
Qwen3.5 27B (Alibaba) 5-1 GLM-4.7-Flash (Zhipu AI), 6 benchmarks
Qwen3.5 27B (Alibaba) 6-0 Sarvam 105B (Sarvam AI), 6 benchmarks
Step-3.5-Flash (StepFun) 1-5 Qwen 3.6 Plus (Alibaba), 6 benchmarks
Step-3.5-Flash (StepFun) 2-4 Claude Sonnet 4.6 (Anthropic), 6 benchmarks
Step-3.5-Flash (StepFun) 2-4 Qwen 3.5 397B (Alibaba), 6 benchmarks
Step-3.5-Flash (StepFun) 6-0 Sarvam 105B (Sarvam AI), 6 benchmarks
Qwen 3.5 397B (Alibaba) 6-0 Sarvam 105B (Sarvam AI), 6 benchmarks
Seed 2.0 Pro (ByteDance) 5-1 Seed 2.0 Lite (ByteDance), 6 benchmarks
Claude Opus 4.6 (Anthropic) 3-2 Seed 2.0 Pro (ByteDance), 5 benchmarks
Claude Opus 4.6 (Anthropic) 3-2 Arcee Trinity (Arcee AI), 5 benchmarks
Claude Opus 4.6 (Anthropic) 3-2 Seed 2.0 Lite (ByteDance), 5 benchmarks
GPT-5.4 (OpenAI) 4-1 GLM-5 (Zhipu AI), 5 benchmarks
GPT-5.4 (OpenAI) 3-2 GLM-5.1 (Zhipu AI), 5 benchmarks
GPT-5.4 (OpenAI) 4-1 Claude Sonnet 4.6 (Anthropic), 5 benchmarks
Gemini 3.1 Pro (Google) 4-1 Seed 2.0 Pro (ByteDance), 5 benchmarks
Gemini 3.1 Pro (Google) 4-1 Arcee Trinity (Arcee AI), 5 benchmarks
Gemini 3.1 Pro (Google) 5-0 Grok 4 (xAI), 5 benchmarks
Gemini 3.1 Pro (Google) 5-0 Seed 2.0 Lite (ByteDance), 5 benchmarks
Gemini 3.1 Pro (Google) 5-0 GLM-4.7-Flash (Zhipu AI), 5 benchmarks
Gemini 3.1 Pro (Google) 5-0 Sarvam 105B (Sarvam AI), 5 benchmarks
Kimi K2.5 (Moonshot AI) 0-5 Muse Spark (Meta), 5 benchmarks
Kimi K2.5 (Moonshot AI) 3-2 Arcee Trinity (Arcee AI), 5 benchmarks
Kimi K2.5 (Moonshot AI) 4-1 Grok 4 (xAI), 5 benchmarks
Claude Mythos Preview (Anthropic) 5-0 Qwen3.5 27B (Alibaba), 5 benchmarks
Claude Mythos Preview (Anthropic) 5-0 GLM-5 (Zhipu AI), 5 benchmarks
Claude Mythos Preview (Anthropic) 5-0 Claude Sonnet 4.6 (Anthropic), 5 benchmarks
Qwen3.5 27B (Alibaba) 0-5 GLM-5 (Zhipu AI), 5 benchmarks
Qwen3.5 27B (Alibaba) 0-5 Seed 2.0 Pro (ByteDance), 5 benchmarks
Qwen3.5 27B (Alibaba) 3-2 Arcee Trinity (Arcee AI), 5 benchmarks
Qwen3.5 27B (Alibaba) 2-3 Grok 4 (xAI), 5 benchmarks
Qwen3.5 27B (Alibaba) 1-4 Seed 2.0 Lite (ByteDance), 5 benchmarks
GLM-5 (Zhipu AI) 4-1 Step-3.5-Flash (StepFun), 5 benchmarks
GLM-5 (Zhipu AI) 0-5 Qwen 3.6 Plus (Alibaba), 5 benchmarks
GLM-5 (Zhipu AI) 1-3 GLM-5.1 (Zhipu AI), 5 benchmarks
GLM-5 (Zhipu AI) 1-4 Claude Sonnet 4.6 (Anthropic), 5 benchmarks
GLM-5 (Zhipu AI) 5-0 GLM-4.7-Flash (Zhipu AI), 5 benchmarks
Gemma 4 31B (Google) 0-5 Qwen 3.6 Plus (Alibaba), 5 benchmarks
Gemma 4 31B (Google) 0-5 Qwen 3.5 397B (Alibaba), 5 benchmarks
Gemma 4 31B (Google) 2-3 Arcee Trinity (Arcee AI), 5 benchmarks
Step-3.5-Flash (StepFun) 2-3 GLM-5.1 (Zhipu AI), 5 benchmarks
Step-3.5-Flash (StepFun) 0-5 Seed 2.0 Pro (ByteDance), 5 benchmarks
Step-3.5-Flash (StepFun) 4-1 Arcee Trinity (Arcee AI), 5 benchmarks
Step-3.5-Flash (StepFun) 3-2 Seed 2.0 Lite (ByteDance), 5 benchmarks
Step-3.5-Flash (StepFun) 5-0 GLM-4.7-Flash (Zhipu AI), 5 benchmarks
Qwen 3.6 Plus (Alibaba) 4-1 Claude Sonnet 4.6 (Anthropic), 5 benchmarks
Qwen 3.6 Plus (Alibaba) 5-0 Qwen 3.5 397B (Alibaba), 5 benchmarks
Qwen 3.6 Plus (Alibaba) 3-2 Seed 2.0 Pro (ByteDance), 5 benchmarks
Qwen 3.6 Plus (Alibaba) 3-2 Arcee Trinity (Arcee AI), 5 benchmarks
Qwen 3.6 Plus (Alibaba) 5-0 Grok 4 (xAI), 5 benchmarks
Qwen 3.6 Plus (Alibaba) 5-0 Seed 2.0 Lite (ByteDance), 5 benchmarks