Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.
**296** tracked models · **27** providers · **253** benchmarked · **34.7** avg. index
| Rank | Model | Provider | Score | Benchmarks | Inference | Agentic | Programming | Value | Price (in / out) |
|---|---|---|---|---|---|---|---|---|---|
| 161 | GPT-4.1 nano `gpt-4.1-nano-2025-04-14` (multimodal, vision, multi-input reasoning) | OpenAI | 34.2 | 12.5 | 93.4 | 0.0 | 0.0 | 82.7 | $0.1 / $0.4 |
| 162 | Nemotron 3 Nano (30B A3B) `nemotron-3-nano-30b-a3b` (code, programming, tool use) | NVIDIA | 34.1 | 45.4 | 66.0 | 3.3 | 4.4 | 90.9 | $0.06 / $0.24 |
| 163 | Qwen3.6-35B-A3B `qwen3.6-35b-a3b` (multimodal, vision, multi-input reasoning) | Alibaba Cloud / Qwen Team | 33.7 | 55.3 | 0.0 | 15.5 | 26.0 | 0.0 | N/A |
| 164 | MiniMax M1 80K `minimax-m1-80k` (code, programming, tool use) | MiniMax | 33.6 | 24.2 | 84.0 | 20.9 | 19.0 | 41.8 | $0.55 / $2.2 |
| 165 | Ministral 8B Instruct `ministral-8b-instruct-2410` (text, inference) | Mistral AI | 33.6 | 0.0 | 7.0 | 0.0 | 0.0 | 76.1 | $0.1 / $0.1 |
| 166 | o3 `o3-2025-04-16` (multimodal, vision, multi-input reasoning) | OpenAI | 33.2 | 46.0 | 38.9 | 19.6 | 30.2 | 27.7 | $2 / $8 |
| 167 | DeepSeek-V3 `deepseek-v3` (code, programming, tool use) | DeepSeek | 33.2 | 27.3 | 58.0 | 0.0 | 10.4 | 60.5 | $0.27 / $1.1 |
| 168 | DeepSeek-V3.1 `deepseek-v3.1` (code, programming, tool use) | DeepSeek | 32.9 | 38.4 | 39.8 | 15.2 | 28.3 | 58.8 | $0.27 / $1 |
| 169 | DeepSeek R1 Distill Qwen 32B `deepseek-r1-distill-qwen-32b` (text, inference) | DeepSeek | 32.7 | 26.6 | 16.6 | 0.0 | 0.0 | 75.9 | $0.12 / $0.18 |
| 170 | DeepSeek-V2.5 `deepseek-v2.5` (code, programming, tool use) | DeepSeek | 32.5 | 0.0 | 46.5 | 0.0 | 0.9 | 79.7 | $0.14 / $0.28 |
| 171 | DeepSeek R1 Distill Llama 70B `deepseek-r1-distill-llama-70b` (text, inference) | DeepSeek | 32.2 | 28.8 | 16.6 | 0.0 | 0.0 | 66.6 | $0.1 / $0.4 |
| 172 | Qwen3.5-4B `qwen3.5-4b` (multimodal, vision, multi-input reasoning) | Alibaba Cloud / Qwen Team | 32.1 | 32.1 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 173 | Claude 3 Haiku `claude-3-haiku-20240307` (multimodal, vision, multi-input reasoning) | Anthropic | 32.0 | 5.8 | 61.8 | 0.0 | 0.0 | 57.9 | $0.25 / $1.25 |
| 174 | Mistral Large 3 (675B Instruct 2512) `mistral-large-latest` (multimodal, vision, multi-input reasoning) | Mistral AI | 31.6 | 22.2 | 40.1 | 0.0 | 0.0 | 44.5 | $0.5 / $1.5 |
| 175 | Phi 4 Reasoning Plus `phi-4-reasoning-plus` (text, inference) | Microsoft | 31.5 | 31.5 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 176 | Hermes 3 70B `hermes-3-70b` (text, inference) | Nous Research | 30.1 | 30.1 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 177 | Grok-2 `grok-2` (multimodal, vision, multi-input reasoning) | xAI | 30.1 | 27.1 | 38.3 | 0.0 | 0.0 | 25.4 | $2 / $10 |
| 178 | GLM-4.7-Flash `glm-4.7-flash` (code, programming, tool use) | Zhipu AI | 29.9 | 38.2 | 29.7 | 11.4 | 20.7 | 72.1 | $0.07 / $0.4 |
| 179 | GPT-4o `gpt-4o-2024-05-13` (multimodal, vision, multi-input reasoning) | OpenAI | 29.9 | 22.3 | 45.4 | 0.0 | 0.0 | 26.5 | $2.5 / $10 |
| 180 | Llama 3.3 70B Instruct `llama-3.3-70b-instruct` (text, inference) | Meta | 29.9 | 19.6 | 21.4 | 0.0 | 0.0 | 72.2 | $0.2 / $0.2 |
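The price columns make per-request cost comparisons straightforward. A minimal sketch, assuming (as is conventional for these listings, though not stated here) that prices are quoted per million tokens:

```python
def request_cost(price_in, price_out, tokens_in, tokens_out):
    """Dollar cost of one request, assuming prices are per million tokens."""
    return (tokens_in * price_in + tokens_out * price_out) / 1_000_000

# Compare GPT-4.1 nano ($0.1 in / $0.4 out) against o3 ($2 in / $8 out)
# on a 2,000-token prompt with a 500-token completion.
nano = request_cost(0.1, 0.4, 2000, 500)
o3 = request_cost(2.0, 8.0, 2000, 500)
print(f"GPT-4.1 nano: ${nano:.4f}  o3: ${o3:.4f}")
```

Under that assumption, o3 costs 20x more per request here, which is why the "Value" column can diverge sharply from the "Benchmarks" column.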
Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
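One way to read "multi-dimensional evaluation" is as a weighted mean of the five sub-scores. The actual weighting is not published, so the equal weights below are purely a hypothetical illustration:

```python
def overall(benchmarks, inference, agentic, programming, value,
            weights=(0.2, 0.2, 0.2, 0.2, 0.2)):
    """Hypothetical composite index: weighted mean of the five sub-scores.

    The real leaderboard weighting is not published; equal weights
    are an assumption for illustration only.
    """
    dims = (benchmarks, inference, agentic, programming, value)
    return round(sum(w * d for w, d in zip(weights, dims)), 1)

# o3's sub-scores from the table: 46.0, 38.9, 19.6, 30.2, 27.7.
# Equal weights give 32.5 -- close to, but not exactly, the listed
# overall of 33.2, which is why these weights are labeled hypothetical.
print(overall(46.0, 38.9, 19.6, 30.2, 27.7))
```

The gap between this naive mean and the published index suggests the real formula weights dimensions unequally or normalizes them differently.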