Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.
294 tracked models · 27 providers · 251 benchmarked · 30.7 avg. index
| Rank | Model | ID | Tags | Provider | Score (Value / Price) | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Llama 3.2 3B Instruct | `llama-3.2-3b-instruct` | text, inference | Meta | 98.8 | 5.3 | 69.0 | 0.0 | 0.0 | 98.8 | $0.01 in / $0.02 out |
| 2 | Ministral 3 (3B Reasoning 2512) | `ministral-3b-latest` | multimodal, vision, multi-input reasoning | Mistral AI | 95.8 | 22.1 | 79.7 | 0.0 | 0.0 | 95.8 | $0.1 in / $0.1 out |
| 3 | Llama 3.2 11B Instruct | `llama-3.2-11b-instruct` | multimodal, vision, multi-input reasoning | Meta | 94.9 | 4.1 | 60.5 | 0.0 | 0.0 | 94.9 | $0.05 in / $0.05 out |
| 4 | Ministral 3 (8B Reasoning 2512) | `ministral-8b-latest` | multimodal, vision, multi-input reasoning | Mistral AI | 92.1 | 31.8 | 84.8 | 0.0 | 0.0 | 92.1 | $0.15 in / $0.15 out |
| 5 | Nova Micro | `nova-micro` | text, inference | Amazon | 91.0 | 9.2 | 51.9 | 0.0 | 0.0 | 91.0 | $0.03 in / $0.14 out |
| 6 | Nemotron 3 Nano (30B A3B) | `nemotron-3-nano-30b-a3b` | code, programming, tool use | NVIDIA | 90.8 | 45.8 | 66.8 | 3.3 | 4.4 | 90.8 | $0.06 in / $0.24 out |
| 7 | Gemini 1.5 Flash 8B | `gemini-1.5-flash-8b` | multimodal, vision, multi-input reasoning | Google | 88.3 | 10.4 | 92.1 | 0.0 | 0.0 | 88.3 | $0.07 in / $0.3 out |
| 8 | Qwen3-Coder | `qwen3-coder` | text, inference | Alibaba Cloud / Qwen Team | 88.3 | 0.0 | 55.9 | 0.0 | 0.0 | 88.3 | $0.18 in / $0.18 out |
| 9 | Llama 4 Scout | `llama-4-scout` | multimodal, vision, multi-input reasoning | Meta | 87.2 | 29.2 | 93.0 | 0.0 | 0.0 | 87.2 | $0.08 in / $0.3 out |
| 10 | Nova Lite | `nova-lite` | multimodal, vision, multi-input reasoning | Amazon | 86.4 | 13.6 | 69.9 | 0.0 | 0.0 | 86.4 | $0.06 in / $0.24 out |
| 11 | MiMo-V2-Flash | `mimo-v2-flash` | code, programming, tool use | Xiaomi | 85.9 | 53.7 | 79.8 | 27.2 | 39.3 | 85.9 | $0.1 in / $0.3 out |
| 12 | Devstral Small 1.1 | `devstral-small-2507` | code, programming, tool use | Mistral AI | 85.0 | 0.0 | 64.5 | 0.0 | 15.0 | 85.0 | $0.1 in / $0.3 out |
| 13 | Mistral Small 3.1 24B Base | `mistral-small-3.1-24b-base-2503` | multimodal, vision, multi-input reasoning | Mistral AI | 85.0 | 13.5 | 64.5 | 0.0 | 0.0 | 85.0 | $0.1 in / $0.3 out |
| 14 | Ministral 3 (14B Reasoning 2512) | `ministral-14b-latest` | multimodal, vision, multi-input reasoning | Mistral AI | 84.5 | 37.9 | 76.8 | 0.0 | 0.0 | 84.5 | $0.2 in / $0.2 out |
| 15 | Qwen3 235B A22B | `qwen3-235b-a22b` | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 83.9 | 30.5 | 33.4 | 0.0 | 0.0 | 83.9 | $0.1 in / $0.1 out |
| 16 | Llama 3.1 8B Instruct | `llama-3.1-8b-instruct` | text, inference | Meta | 83.7 | 3.2 | 26.2 | 0.0 | 0.0 | 83.7 | $0.03 in / $0.03 out |
| 17 | LongCat-Flash-Lite | `longcat-flash-lite` | code, programming, tool use | Meituan | 83.3 | 24.7 | 83.8 | 29.5 | 25.3 | 83.3 | $0.1 in / $0.4 out |
| 18 | GPT-4.1 nano | `gpt-4.1-nano-2025-04-14` | multimodal, vision, multi-input reasoning | OpenAI | 83.0 | 12.6 | 93.7 | 0.0 | 0.0 | 83.0 | $0.1 in / $0.4 out |
| 19 | Gemini 2.0 Flash | `gemini-2.0-flash` | multimodal, vision, multi-input reasoning | Google | 82.7 | 33.4 | 94.1 | 0.0 | 0.0 | 82.7 | $0.1 in / $0.4 out |
| 20 | Step-3.5-Flash | `step-3.5-flash` | code, programming, tool use | StepFun | 82.1 | 62.3 | 63.2 | 45.3 | 53.0 | 82.1 | $0.1 in / $0.4 out |
Page 1 of 15 · 294 models
Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
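The price cells in the table use strings of the form `$0.03 in / $0.14 out`. A minimal sketch of turning such a cell into numbers and sorting a few rows by a single blended figure could look like the following. The helper names `parse_price` and `blended_price`, and the 75/25 input-to-output weighting, are assumptions made for illustration; the leaderboard does not publish the formula behind its Value score, and this is not a reconstruction of it.

```python
import re

# Matches price cells exactly as they appear in the table,
# e.g. "$0.03 in / $0.14 out".
PRICE_RE = re.compile(r"\$(\d+(?:\.\d+)?) in / \$(\d+(?:\.\d+)?) out")


def parse_price(cell: str) -> tuple[float, float]:
    """Split a price cell into (input_price, output_price)."""
    match = PRICE_RE.fullmatch(cell.strip())
    if match is None:
        raise ValueError(f"unrecognized price cell: {cell!r}")
    return float(match.group(1)), float(match.group(2))


def blended_price(cell: str, input_share: float = 0.75) -> float:
    """Weighted mean of input and output price.

    input_share is a hypothetical workload mix chosen for this example,
    not a value taken from the leaderboard.
    """
    price_in, price_out = parse_price(cell)
    return input_share * price_in + (1.0 - input_share) * price_out


# Three rows copied from the table above.
rows = {
    "Llama 3.2 3B Instruct": "$0.01 in / $0.02 out",
    "Nova Micro": "$0.03 in / $0.14 out",
    "Step-3.5-Flash": "$0.1 in / $0.4 out",
}

# Print the rows from cheapest to most expensive blended price.
for name, cell in sorted(rows.items(), key=lambda item: blended_price(item[1])):
    print(f"{name}: {blended_price(cell):.4f}")
```

Changing `input_share` shifts the ordering toward models with cheap input or cheap output, so it is worth setting it to match the actual prompt-to-completion ratio of the workload being priced.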