Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.


Tracked models: 334
Providers:
Benchmarked: 286
Avg. index: 12.6


Rank	Model	Model ID	Tags	Provider	Score (Value / Price)	Benchmarks	Inference	Agentic	Programming	Value	Price
1	Nemotron 3 Nano (30B A3B)	nemotron-3-nano-30b-a3b	code, programming, tool use	NVIDIA	100.0	43.6	32.9	3.0	3.8	100.0	$0.06 in / $0.24 out
2	DeepSeek-V4-Flash-Max	deepseek-v4-flash-max	code, programming, tool use	DeepSeek	98.8	56.2	84.8	35.3	41.4	98.8
3	LongCat-Flash-Lite	longcat-flash-lite	code, programming, tool use	Meituan	95.6	22.8	72.8	30.1	23.9	95.6
4	GPT-4.1 nano	gpt-4.1-nano-2025-04-14	multimodal, vision, multi-input reasoning	OpenAI	94.9	11.6	87.8	0.0	0.0	94.9
5	Step-3.5-Flash	step-3.5-flash	code, programming, tool use	StepFun	93.9	62.8	59.8	36.5	48.6	93.9	$0.1 in / $0.4 out
6	MiMo-V2.5	mimo-v2.5	multimodal, vision, multi-input reasoning	Xiaomi	92.7	47.7	84.8	0.0	27.2	92.7	$0.168 in / $0.336 out
7	Gemma 4 31B	gemma-4-31b-it	multimodal, vision, multi-input reasoning	Google	91.5	55.4	32.9	0.0	0.0	91.5
8	Gemma 4 26B-A4B	gemma-4-26b-a4b-it	multimodal, vision, multi-input reasoning	Google	90.2	43.8	32.9	0.0	0.0	90.2
9	Qwen3 VL 4B Instruct	qwen3-vl-4b-instruct	multimodal, vision, multi-input reasoning	Alibaba Cloud / Qwen Team	85.4	18.2	32.9	17.7	0.0	85.4
10	Mercury 2	mercury-2	code, programming, tool use	Inception	84.4	42.7	68.9	0.0	15.3	84.4	$0.25 in / $0.75 out
11	DeepSeek-V3.2 (Non-thinking)	deepseek-chat	text, inference	DeepSeek	82.7	0.0	52.0	0.0	0.0	82.7	$0.28 in / $0.42 out
12	Mistral Small 4	mistral-small-latest	multimodal, vision, multi-input reasoning	Mistral AI	81.7	31.3	23.2	0.0	0.0	81.7
13	Qwen3 VL 4B Thinking	qwen3-vl-4b-thinking	multimodal, vision, multi-input reasoning	Alibaba Cloud / Qwen Team	79.3	20.2	32.9	17.0	0.0	79.3
14	Qwen3 30B A3B	qwen3-30b-a3b	text, inference	Alibaba Cloud / Qwen Team	78.5	23.7	26.6	0.0	0.0	78.5	$0.1 in / $0.44 out
15	MiMo-V2.5-Pro	mimo-v2.5-pro	code, programming, tool use	Xiaomi	78.0	36.2	84.8	0.0	56.3	78.0	$0.435 in / $0.87 out
16	Qwen3 32B	qwen3-32b	text, inference	Alibaba Cloud / Qwen Team	78.0	20.1	2.2	0.0	0.0	78.0	$0.1 in / $0.3 out
17	Grok-4.1 Fast Non-Reasoning	grok-4-1-fast-non-reasoning	multimodal, vision, multi-input reasoning	xAI	77.3	0.0	63.0	0.0	0.0	77.3
18	Grok-4.1 Fast Reasoning	grok-4-1-fast-reasoning	multimodal, vision, multi-input reasoning	xAI	77.3	0.0	63.0	0.0	0.0	77.3
19	Grok-4 Fast Reasoning	grok-4-fast-reasoning	multimodal, vision, multi-input reasoning	xAI	77.3	0.0	63.0	0.0	0.0	77.3
20	GPT-5.4 nano	gpt-5.4-nano	multimodal, vision, multi-input reasoning	OpenAI	76.8	41.8	44.5	6.9	8.2	76.8
