Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.
**296** tracked models · **27** providers · **253** benchmarked · **27.4** avg. index
| Rank | Model | Provider | Tags | Score | Inference | Agentic | Programming | Value | Price (in / out) |
|---|---|---|---|---|---|---|---|---|---|
| 221 | Claude 3 Haiku (`claude-3-haiku-20240307`) | Anthropic | multimodal, vision, multi-input reasoning | 5.8 | 61.8 | 0.0 | 0.0 | 57.9 | $0.25 / $1.25 |
| 222 | Llama 3.2 3B Instruct (`llama-3.2-3b-instruct`) | Meta | text, inference | 5.2 | 68.9 | 0.0 | 0.0 | 98.8 | $0.01 / $0.02 |
| 223 | Jamba 1.5 Mini (`jamba-1.5-mini`) | AI21 Labs | text, inference | 4.7 | 65.8 | 0.0 | 0.0 | 72.4 | $0.20 / $0.40 |
| 224 | DeepSeek VL2 Small (`deepseek-vl2-small`) | DeepSeek | multimodal, vision, multi-input reasoning | 4.6 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 225 | Gemma 3 4B (`gemma-3-4b-it`) | Google | multimodal, vision, multi-input reasoning | 4.5 | 20.3 | 0.0 | 0.0 | 82.0 | $0.02 / $0.04 |
| 226 | GPT-5.1 Codex Mini (`gpt-5.1-codex-mini`) | OpenAI | multimodal, vision, multi-input reasoning | 4.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 227 | Llama 3.2 11B Instruct (`llama-3.2-11b-instruct`) | Meta | multimodal, vision, multi-input reasoning | 4.0 | 60.3 | 0.0 | 0.0 | 94.9 | $0.05 / $0.05 |
| 228 | Gemini 1.0 Pro (`gemini-1.0-pro`) | Google | multimodal, vision, multi-input reasoning | 3.2 | 57.2 | 0.0 | 0.0 | 55.4 | $0.50 / $1.50 |
| 229 | Llama 3.1 8B Instruct (`llama-3.1-8b-instruct`) | Meta | text, inference | 3.2 | 26.7 | 0.0 | 0.0 | 83.9 | $0.03 / $0.03 |
| 230 | Phi-3.5-mini-instruct (`phi-3.5-mini-instruct`) | Microsoft | multimodal, vision, multi-input reasoning | 2.7 | 10.8 | 0.0 | 0.0 | 77.2 | $0.10 / $0.10 |
| 231 | GPT-3.5 Turbo (`gpt-3.5-turbo-0125`) | OpenAI | multimodal, vision, multi-input reasoning | 2.5 | 36.7 | 0.0 | 0.0 | 49.4 | $0.50 / $1.50 |
| 232 | Qwen2 7B Instruct (`qwen2-7b-instruct`) | Alibaba Cloud / Qwen Team | text, inference | 2.4 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 233 | Phi-3.5-vision-instruct (`phi-3.5-vision-instruct`) | Microsoft | multimodal, vision, multi-input reasoning | 2.3 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 234 | Phi 4 Mini (`phi-4-mini`) | Microsoft | text, inference | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 235 | Gemma 3n E4B Instructed (`gemma-3n-e4b-it`) | Google | multimodal, vision, multi-input reasoning | 1.3 | 20.3 | 0.0 | 0.0 | 10.3 | $20 / $40 |
| 236 | Gemma 3n E4B Instructed LiteRT Preview (`gemma-3n-e4b-it-litert-preview`) | Google | multimodal, vision, multi-input reasoning | 1.3 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 237 | DeepSeek VL2 Tiny (`deepseek-vl2-tiny`) | DeepSeek | multimodal, vision, multi-input reasoning | 1.2 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 238 | Gemma 3n E2B Instructed (`gemma-3n-e2b-it`) | Google | multimodal, vision, multi-input reasoning | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 239 | Gemma 3n E2B Instructed LiteRT (Preview) (`gemma-3n-e2b-it-litert-preview`) | Google | multimodal, vision, multi-input reasoning | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 240 | Gemma 3 1B (`gemma-3-1b-it`) | Google | text, inference | 0.9 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
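The Price column lists separate input and output rates, which makes models hard to compare on cost alone. A common workaround is to blend the two rates at an assumed input:output traffic ratio. The `blended_price` helper and the 3:1 ratio below are illustrative assumptions, not something the leaderboard publishes:

```python
# Blend separate input/output token prices into one comparable number.
# The 3:1 input:output ratio is an assumption for illustration; the
# leaderboard itself does not define a blending formula.

def blended_price(price_in: float, price_out: float, in_ratio: float = 3.0) -> float:
    """Weighted average of input/output rates, given an input:output ratio."""
    return (price_in * in_ratio + price_out) / (in_ratio + 1.0)

# Rates taken from the table rows above (same units as listed there).
models = {
    "Claude 3 Haiku": (0.25, 1.25),
    "Llama 3.2 3B Instruct": (0.01, 0.02),
    "Jamba 1.5 Mini": (0.20, 0.40),
}

# Cheapest first under the assumed ratio.
for name, (p_in, p_out) in sorted(models.items(),
                                  key=lambda kv: blended_price(*kv[1])):
    print(f"{name}: {blended_price(p_in, p_out):.4f}")
```

Under a 3:1 blend, heavily output-priced models (e.g. Claude 3 Haiku's $1.25 output rate) still dominate the effective cost, which is why the Value column can diverge sharply from the raw input price.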
Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.