Every major AI model, ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency. Rankings are updated continuously from published evaluation data.
| Tracked models | Providers | Benchmarked | Avg. index |
|---|---|---|---|
| 296 | 27 | 253 | 13.4 |

| Rank | Model | Model ID | Tags | Provider | Score (Programming) | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 181 | Grok-4.1 Fast Non-Reasoning | `grok-4-1-fast-non-reasoning` | multimodal, vision, multi-input reasoning | xAI | 0.0 | 0.0 | 68.8 | 0.0 | 0.0 | 67.2 | $0.2 in / $0.5 out |
| 182 | Grok-4.1 Fast Reasoning | `grok-4-1-fast-reasoning` | multimodal, vision, multi-input reasoning | xAI | 0.0 | 0.0 | 68.8 | 0.0 | 0.0 | 67.2 | $0.2 in / $0.5 out |
| 183 | Grok-4.1 Thinking | `grok-4.1-thinking-2025-11-17` | multimodal, vision, multi-input reasoning | xAI | 0.0 | 0.0 | 48.5 | 0.0 | 0.0 | 17.8 | $3 in / $15 out |
| 184 | Grok-4.20 Beta Non-Reasoning | `grok-4.20-beta-0309-non-reasoning` | multimodal, vision, multi-input reasoning | xAI | 0.0 | 0.0 | 97.2 | 0.0 | 0.0 | 27.7 | $2 in / $6 out |
| 185 | Grok-4.20 Beta Reasoning | `grok-4.20-beta-0309-reasoning` | multimodal, vision, multi-input reasoning | xAI | 0.0 | 0.0 | 97.2 | 0.0 | 0.0 | 27.7 | $2 in / $6 out |
| 186 | Grok-4.20 Multi-Agent Beta | `grok-4.20-multi-agent-beta-0309` | multimodal, vision, multi-input reasoning | xAI | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 187 | Grok 4 Fast | `grok-4-fast` | multimodal, vision, multi-input reasoning | xAI | 0.0 | 57.6 | 68.8 | 14.7 | 0.0 | 67.2 | $0.2 in / $0.5 out |
| 188 | Grok-4 Fast Non-Reasoning | `grok-4-fast-non-reasoning` | multimodal, vision, multi-input reasoning | xAI | 0.0 | 0.0 | 68.8 | 0.0 | 0.0 | 67.2 | $0.2 in / $0.5 out |
| 189 | Grok-4 Fast Reasoning | `grok-4-fast-reasoning` | multimodal, vision, multi-input reasoning | xAI | 0.0 | 0.0 | 68.8 | 0.0 | 0.0 | 67.2 | $0.2 in / $0.5 out |
| 190 | Grok-4 Heavy | `grok-4-heavy` | multimodal, vision, multi-input reasoning | xAI | 0.0 | 72.4 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 191 | Hermes 3 70B | `hermes-3-70b` | text, inference | Nous Research | 0.0 | 30.1 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 192 | Jamba 1.5 Large | `jamba-1.5-large` | text, inference | AI21 Labs | 0.0 | 8.1 | 33.6 | 0.0 | 0.0 | 25.2 | $2 in / $8 out |
| 193 | Jamba 1.5 Mini | `jamba-1.5-mini` | text, inference | AI21 Labs | 0.0 | 4.7 | 65.8 | 0.0 | 0.0 | 72.4 | $0.2 in / $0.4 out |
| 194 | K-EXAONE-236B-A23B | `k-exaone-236b-a23b` | multimodal, vision, multi-input reasoning | LG AI Research | 0.0 | 43.4 | 24.9 | 0.0 | 0.0 | 49.2 | $0.6 in / $1 out |
| 195 | Kimi-k1.5 | `kimi-k1.5` | multimodal, vision, multi-input reasoning | Moonshot AI | 0.0 | 35.3 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 196 | Kimi K2 0905 | `kimi-k2-0905` | text, inference | Moonshot AI | 0.0 | 44.0 | 66.0 | 0.0 | 0.0 | 40.1 | $0.6 in / $2.5 out |
| 197 | Kimi K2 Base | `kimi-k2-base` | text, inference | Moonshot AI | 0.0 | 26.9 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 198 | Llama 3.1 405B Instruct | `llama-3.1-405b-instruct` | text, inference | Meta | 0.0 | 20.0 | 21.4 | 0.0 | 0.0 | 44.5 | $0.89 in / $0.89 out |
| 199 | Llama 3.1 70B Instruct | `llama-3.1-70b-instruct` | text, inference | Meta | 0.0 | 11.2 | 21.4 | 0.0 | 0.0 | 72.2 | $0.2 in / $0.2 out |
| 200 | Llama 3.1 8B Instruct | `llama-3.1-8b-instruct` | text, inference | Meta | 0.0 | 3.2 | 26.7 | 0.0 | 0.0 | 83.9 | $0.03 in / $0.03 out |
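
Anyone who wants to sort or filter this table by cost first has to turn the Price cells into numbers. The following is a minimal sketch of one way to do that, assuming every cell is either `N/A`, empty, or follows the `$X in / $Y out` pattern seen in the rows above. The function name `parse_price` is hypothetical and not part of any leaderboard API. The table does not state the pricing unit, so the values are returned as bare numbers.

```python
import re
from typing import Optional, Tuple

# Matches cells such as "$0.2 in / $0.5 out" (pattern inferred from the Price column).
_PRICE_RE = re.compile(r"^\$(\d+(?:\.\d+)?) in / \$(\d+(?:\.\d+)?) out$")


def parse_price(cell: str) -> Optional[Tuple[float, float]]:
    """Return (input_price, output_price), or None for "N/A" or an empty cell.

    Hypothetical helper: the unit (for example, per how many tokens) is not
    stated in the table, so no unit conversion is attempted here.
    """
    text = cell.strip()
    if text in ("", "N/A"):
        return None
    match = _PRICE_RE.match(text)
    if match is None:
        # Fail loudly on formats this sketch does not know about.
        raise ValueError(f"unrecognized price cell: {cell!r}")
    return float(match.group(1)), float(match.group(2))


if __name__ == "__main__":
    for cell in ("$0.2 in / $0.5 out", "$0.89 in / $0.89 out", "N/A"):
        print(cell, "->", parse_price(cell))
```

Returning `None` for unpriced models, rather than zero, keeps them from being mistaken for free models when sorting by cost.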
Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.