Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.
Tracked models: 309 · Providers: 27 · Benchmarked: 264 · Avg. index: 13.1
| Rank | Model | Model ID | Tags | Provider | Score (Inference) | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 21 | GPT-5.1 | `gpt-5.1-2025-11-13` | multimodal, vision, multi-input reasoning | OpenAI | 66.9 | 65.4 | 66.9 | 0.0 | 55.7 | 33.2 | $1.25 in / $10 out |
| 22 | GPT-5.1 Instant | `gpt-5.1-instant-2025-11-12` | multimodal, vision, multi-input reasoning | OpenAI | 66.9 | 65.4 | 66.9 | 0.0 | 55.7 | 33.2 | $1.25 in / $10 out |
| 23 | GPT-5.2 | `gpt-5.2-2025-12-11` | multimodal, vision, multi-input reasoning | OpenAI | 66.9 | 75.3 | 66.9 | 44.4 | 70.7 | 27.1 | $1.75 in / $14 out |
| 24 | GPT-5.5 Instant | `gpt-5.5-instant` | multimodal, vision, multi-input reasoning | OpenAI | 66.9 | 53.2 | 66.9 | 0.0 | 0.0 | 16.0 | $5 in / $30 out |
| 25 | Grok-4.1 Fast Non-Reasoning | `grok-4-1-fast-non-reasoning` | multimodal, vision, multi-input reasoning | xAI | 62.1 | 0.0 | 62.1 | 0.0 | 0.0 | 73.7 | $0.2 in / $0.5 out |
| 26 | Grok-4.1 Fast Reasoning | `grok-4-1-fast-reasoning` | multimodal, vision, multi-input reasoning | xAI | 62.1 | 0.0 | 62.1 | 0.0 | 0.0 | 73.7 | $0.2 in / $0.5 out |
| 27 | Grok 4 Fast | `grok-4-fast` | multimodal, vision, multi-input reasoning | xAI | 62.1 | 57.1 | 62.1 | 13.7 | 0.0 | 73.7 | $0.2 in / $0.5 out |
| 28 | Grok-4 Fast Non-Reasoning | `grok-4-fast-non-reasoning` | multimodal, vision, multi-input reasoning | xAI | 62.1 | 0.0 | 62.1 | 0.0 | 0.0 | 73.7 | $0.2 in / $0.5 out |
| 29 | Grok-4 Fast Reasoning | `grok-4-fast-reasoning` | multimodal, vision, multi-input reasoning | xAI | 62.1 | 0.0 | 62.1 | 0.0 | 0.0 | 73.7 | $0.2 in / $0.5 out |
| 30 | Step-3.5-Flash | `step-3.5-flash` | code, programming, tool use | StepFun | 60.4 | 62.8 | 60.4 | 42.0 | 50.6 | 95.0 | $0.1 in / $0.4 out |
| 31 | Gemini 3.1 Pro | `gemini-3.1-pro-preview` | multimodal, vision, multi-input reasoning | Google | 59.4 | 73.8 | 59.4 | 68.9 | 66.0 | 18.5 | $2.5 in / $15 out |
| 32 | GPT-5.4 Mini | `gpt-5.4-mini` | text, text-to-text, language | OpenAI | 56.3 | 56.9 | 56.3 | 22.4 | 27.4 | 32.9 | $0.75 in / $4.5 out |
| 33 | GPT-5.4 nano | `gpt-5.4-nano` | multimodal, vision, multi-input reasoning | OpenAI | 56.3 | 46.0 | 56.3 | 10.4 | 10.7 | 70.9 | $0.2 in / $1.25 out |
| 34 | Claude Haiku 4.5 | `claude-haiku-4-5-20251001` | multimodal, vision, multi-input reasoning | Anthropic | 55.3 | 31.5 | 55.3 | 53.3 | 54.9 | 38.7 | $1 in / $5 out |
| 35 | DeepSeek-V3.2 (Non-thinking) | `deepseek-chat` | text, inference | DeepSeek | 53.0 | 0.0 | 53.0 | 0.0 | 0.0 | 79.3 | $0.28 in / $0.42 out |
| 36 | GPT OSS 120B High | `gpt-oss-120b-high` | multimodal, vision, multi-input reasoning | OpenAI | 53.0 | 44.2 | 53.0 | 0.0 | 0.0 | 83.3 | $0.1 in / $0.5 out |
| 37 | Gemini 2.5 Flash | `gemini-2.5-flash` | multimodal, vision, multi-input reasoning | Google | 51.2 | 38.9 | 51.2 | 0.0 | 21.0 | 46.4 | $0.3 in / $2.5 out |
| 38 | Gemini 2.5 Pro | `gemini-2.5-pro` | multimodal, vision, multi-input reasoning | Google | 51.2 | 43.4 | 51.2 | 0.0 | 22.9 | 25.2 | $1.25 in / $10 out |
| 39 | GPT-4o | `gpt-4o-2024-05-13` | multimodal, vision, multi-input reasoning | OpenAI | 50.5 | 21.6 | 50.5 | 0.0 | 0.0 | 30.1 | $2.5 in / $10 out |
| 40 | GPT-5.3 Chat | `gpt-5.3-chat-latest` | multimodal, vision, multi-input reasoning | OpenAI | 50.5 | 0.0 | 50.5 | 0.0 | 0.0 | 27.1 | $1.75 in / $14 out |

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
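Prices above are listed as an input figure and an output figure, for example "$1.25 in / $10 out". As a rough sketch of how such a pair translates into a per-request cost, the Python below parses a price string and applies it to token counts. The unit is an assumption: the page does not state it, and the code assumes USD per one million tokens. The function names `parse_price` and `request_cost` are hypothetical helpers, not part of any leaderboard API.

```python
import re


def parse_price(price: str) -> tuple[float, float]:
    """Split a price string like "$1.25 in / $10 out" into (input, output) prices."""
    match = re.fullmatch(r"\$([\d.]+) in / \$([\d.]+) out", price.strip())
    if match is None:
        raise ValueError(f"unrecognized price format: {price!r}")
    return float(match.group(1)), float(match.group(2))


def request_cost(price: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single request.

    ASSUMPTION: prices are USD per 1,000,000 tokens; the source page
    does not state the unit.
    """
    price_in, price_out = parse_price(price)
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Example: a request with 2,000 input tokens and 500 output tokens.
cost = request_cost("$1.25 in / $10 out", input_tokens=2_000, output_tokens=500)
print(f"Estimated cost: ${cost:.4f}")
```

Because output tokens are priced several times higher than input tokens for most listed models, output-heavy workloads can cost noticeably more than the input price alone suggests.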