Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.
- Tracked models: 294
- Providers: 27
- Benchmarked: 251
- Avg. index: 30.7
| Rank | Model | Model ID | Tags | Provider | Score (Value / Price) | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 21 | Gemma 3 4B | `gemma-3-4b-it` | multimodal, vision, multi-input reasoning | Google | 81.8 | 4.6 | 19.9 | 0.0 | 0.0 | 81.8 | $0.02 in / $0.04 out |
| 22 | Qwen2.5-Coder 32B Instruct | `qwen-2.5-coder-32b-instruct` | text, inference | Alibaba Cloud / Qwen Team | 81.2 | 0.0 | 20.9 | 0.0 | 0.0 | 81.2 | $0.09 in / $0.09 out |
| 23 | Gemma 3 12B | `gemma-3-12b-it` | multimodal, vision, multi-input reasoning | Google | 80.5 | 9.3 | 19.9 | 0.0 | 0.0 | 80.5 | $0.05 in / $0.1 out |
| 24 | Mistral Small 3 24B Instruct | `mistral-small-24b-instruct-2501` | text, inference | Mistral AI | 80.5 | 14.4 | 20.7 | 0.0 | 0.0 | 80.5 | $0.07 in / $0.14 out |
| 25 | Gemini 2.0 Flash-Lite | `gemini-2.0-flash-lite` | multimodal, vision, multi-input reasoning | Google | 79.7 | 25.7 | 63.2 | 0.0 | 0.0 | 79.7 | $0.07 in / $0.3 out |
| 26 | Phi-4-multimodal-instruct | `phi-4-multimodal-instruct` | multimodal, vision, multi-input reasoning | Microsoft | 79.7 | 8.8 | 11.8 | 0.0 | 0.0 | 79.7 | $0.05 in / $0.1 out |
| 27 | DeepSeek-V2.5 | `deepseek-v2.5` | code, programming, tool use | DeepSeek | 79.5 | 0.0 | 45.6 | 0.0 | 0.9 | 79.5 | $0.14 in / $0.28 out |
| 28 | GPT OSS 20B | `gpt-oss-20b` | text, inference | OpenAI | 79.3 | 26.1 | 77.3 | 6.0 | 0.0 | 79.3 | $0.1 in / $0.5 out |
| 29 | Gemma 4 26B-A4B | `gemma-4-26b-a4b-it` | multimodal, vision, multi-input reasoning | Google | 77.8 | 43.7 | 66.8 | 0.0 | 0.0 | 77.8 | $0.13 in / $0.4 out |
| 30 | Qwen2.5 7B Instruct | `qwen-2.5-7b-instruct` | text, inference | Alibaba Cloud / Qwen Team | 77.3 | 7.5 | 70.8 | 0.0 | 0.0 | 77.3 | $0.3 in / $0.3 out |
| 31 | Mistral NeMo Instruct | `mistral-nemo-instruct-2407` | text, inference | Mistral AI | 77.0 | 0.0 | 20.9 | 0.0 | 0.0 | 77.0 | $0.15 in / $0.15 out |
| 32 | Phi-3.5-mini-instruct | `phi-3.5-mini-instruct` | multimodal, vision, multi-input reasoning | Microsoft | 77.0 | 2.7 | 10.3 | 0.0 | 0.0 | 77.0 | $0.1 in / $0.1 out |
| 33 | Phi 4 | `phi-4` | text, inference | Microsoft | 76.9 | 15.8 | 8.5 | 0.0 | 0.0 | 76.9 | $0.07 in / $0.14 out |
| 34 | Gemma 4 31B | `gemma-4-31b-it` | multimodal, vision, multi-input reasoning | Google | 76.7 | 56.5 | 66.8 | 0.0 | 0.0 | 76.7 | $0.14 in / $0.4 out |
| 35 | GPT OSS 120B | `gpt-oss-120b` | text, inference | OpenAI | 76.7 | 36.6 | 34.9 | 26.8 | 0.0 | 76.7 | $0.09 in / $0.45 out |
| 36 | Qwen3 30B A3B | `qwen3-30b-a3b` | text, inference | Alibaba Cloud / Qwen Team | 76.6 | 25.8 | 36.4 | 0.0 | 0.0 | 76.6 | $0.1 in / $0.3 out |
| 37 | Ministral 8B Instruct | `ministral-8b-instruct-2410` | text, inference | Mistral AI | 76.0 | 0.0 | 7.1 | 0.0 | 0.0 | 76.0 | $0.1 in / $0.1 out |
| 38 | DeepSeek R1 Distill Qwen 32B | `deepseek-r1-distill-qwen-32b` | text, inference | DeepSeek | 75.6 | 26.8 | 16.1 | 0.0 | 0.0 | 75.6 | $0.12 in / $0.18 out |
| 39 | Qwen3 VL 8B Instruct | `qwen3-vl-8b-instruct` | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 75.6 | 9.8 | 66.8 | 26.7 | 0.0 | 75.6 | $0.08 in / $0.5 out |
| 40 | Gemma 3 27B | `gemma-3-27b-it` | multimodal, vision, multi-input reasoning | Google | 73.6 | 11.0 | 19.9 | 0.0 | 0.0 | 73.6 | $0.1 in / $0.2 out |

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
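Prices in this listing are given as strings such as `$0.02 in / $0.04 out`. For readers who want to compare them numerically, here is a minimal Python sketch that parses that format and computes a weighted input/output price. The names `Price`, `parse_price`, and `blended_price`, and the 25% output share, are hypothetical choices made for illustration. The page does not state the pricing unit, and this blend is not the leaderboard's own Value score, whose formula is not published here.

```python
import re
from typing import NamedTuple


class Price(NamedTuple):
    """Input and output price in USD, in whatever unit the listing uses."""
    input_usd: float
    output_usd: float


# Matches strings shaped like "$0.02 in / $0.04 out".
_PRICE_RE = re.compile(r"\$(\d+(?:\.\d+)?)\s+in\s*/\s*\$(\d+(?:\.\d+)?)\s+out")


def parse_price(text: str) -> Price:
    """Extract the input and output prices from a listing string."""
    match = _PRICE_RE.search(text)
    if match is None:
        raise ValueError(f"unrecognized price string: {text!r}")
    return Price(float(match.group(1)), float(match.group(2)))


def blended_price(price: Price, output_share: float = 0.25) -> float:
    """Weighted average of input and output price.

    output_share is an assumed fraction of output tokens in the workload;
    it is an illustrative parameter, not something the leaderboard defines.
    """
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    return (1.0 - output_share) * price.input_usd + output_share * price.output_usd


# A few price strings copied from the listing.
listings = {
    "Qwen2.5 7B Instruct": "$0.3 in / $0.3 out",
    "GPT OSS 20B": "$0.1 in / $0.5 out",
    "Gemma 3 4B": "$0.02 in / $0.04 out",
}

# Model names ordered from cheapest to most expensive blended price.
cheapest_first = sorted(
    listings, key=lambda name: blended_price(parse_price(listings[name]))
)
```

Changing `output_share` can reorder models whose input and output prices differ sharply, so the mix should reflect the intended workload rather than the default used here.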