Who's winning
Composite score: the average rank percentile across benchmarks, plus a breadth bonus. Only models evaluated on 3 or more benchmarks are included. A higher score means the model consistently ranks near the top across more benchmarks.
Rank  Organization  Benchmarks  Score
1     Anthropic     13          92.2
2     OpenAI        5           80.2
3     Alibaba       7           78.0
4     OpenAI        5           77.9
5     Google        13          77.6
6     Google        3           76.9
7     OpenAI        4           74.3
8     OpenAI        6           71.2
9     H Company     4           69.8
10    Alibaba       7           69.5
11    ByteDance     6           69.4
12    DeepSeek      3           66.3
13    Zhipu AI      7           65.3
14    OpenAI        16          63.7
15    Anthropic     18          60.8
16    Anthropic     7           60.2
17    Zhipu AI      5           58.2
18    OpenAI        4           58.1
19    Moonshot AI   14          57.8
20    StepFun       7           57.3
21    Arcee AI      6           56.3
22    Zhipu AI      9           54.2
23    OpenAI        4           52.2
24    ByteDance     6           52.0
25    Google        3           48.9
26    Meta          9           48.8
27    DeepSeek      5           48.6
28    xAI           6           48.4
29    OpenAI        4           47.3
30    Alibaba       11          47.1
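The page describes the composite score only in words (average rank percentile plus a breadth bonus, with a 3-benchmark minimum), so the following is a minimal sketch of how such a score could be computed, not the site's actual formula. The exact percentile definition, the handling of ties, and the size of the breadth bonus are not stated; `BONUS_PER_EXTRA`, `MAX_BONUS`, the function names, and the sample data are all hypothetical and chosen for illustration.

```python
from collections import defaultdict

MIN_BENCHMARKS = 3      # from the page: only models with 3+ benchmarks are ranked
BONUS_PER_EXTRA = 0.5   # ASSUMPTION: points added per benchmark beyond the minimum
MAX_BONUS = 5.0         # ASSUMPTION: cap on the breadth bonus


def rank_percentiles(scores):
    """Map each model to a rank percentile on one benchmark (100 = best, 0 = worst).

    Assumes higher raw scores are better and ignores ties for simplicity.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    n = len(ordered)
    if n == 1:
        return {ordered[0]: 100.0}
    return {model: 100.0 * (n - 1 - i) / (n - 1) for i, model in enumerate(ordered)}


def composite_scores(results):
    """Compute composite scores from {benchmark: {model: raw_score}}."""
    per_model = defaultdict(list)
    for scores in results.values():
        for model, pct in rank_percentiles(scores).items():
            per_model[model].append(pct)

    composite = {}
    for model, pcts in per_model.items():
        if len(pcts) < MIN_BENCHMARKS:
            continue  # too few benchmarks to be ranked
        bonus = min(MAX_BONUS, BONUS_PER_EXTRA * (len(pcts) - MIN_BENCHMARKS))
        composite[model] = sum(pcts) / len(pcts) + bonus
    return composite


# Hypothetical data: three models, four benchmarks (m3 is missing from bench_c).
results = {
    "bench_a": {"m1": 90, "m2": 80, "m3": 70},
    "bench_b": {"m1": 60, "m2": 65, "m3": 50},
    "bench_c": {"m1": 0.9, "m2": 0.7},
    "bench_d": {"m1": 10, "m2": 5, "m3": 1},
}

for model, score in sorted(composite_scores(results).items(), key=lambda kv: -kv[1]):
    print(f"{model}: {score:.1f}")
```

Under these assumptions, a model covered by many benchmarks can still place below a narrower one if its average percentile is lower, which matches the table: the 18-benchmark entry sits at rank 15, while several entries with 3 to 5 benchmarks rank above it.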