AI Model Leaderboard — Skytells

Rank	Model	Provider	Score	Benchmarks	Inference	Agentic	Programming	Value	Price
81	MiMo-V2-Flash mimo-v2-flash code, programming, tool use	Xiaomi	23.5 Agentic	50.6	0.0	23.5	35.8	0.0	N/A
82	Qwen3 VL 30B A3B Instruct qwen3-vl-30b-a3b-instruct multimodal, vision, multi-input reasoning	Alibaba Cloud / Qwen Team	21.4 Agentic	25.6	0.0	21.4	0.0	0.0
83	Qwen3 VL 8B Thinking qwen3-vl-8b-thinking multimodal, vision, multi-input reasoning	Alibaba Cloud / Qwen Team	21.1 Agentic	32.8	0.0	21.1	0.0	0.0	N/A
84	MiniMax M1 80K minimax-m1-80k code, programming, tool use	MiniMax	20.9 Agentic	22.8	0.0	20.9	16.2	0.0	N/A
85	Qwen3 VL 30B A3B Thinking qwen3-vl-30b-a3b-thinking multimodal, vision, multi-input reasoning	Alibaba Cloud / Qwen Team	19.2 Agentic	32.7	0.0	19.2	0.0	0.0
86	o3 o3-2025-04-16 multimodal, vision, multi-input reasoning	OpenAI	17.9 Agentic	42.9	0.0	17.9	27.7	0.0	N/A
87	Qwen3-Next-80B-A3B-Instruct qwen3-next-80b-a3b-instruct text, inference	Alibaba Cloud / Qwen Team	17.9 Agentic	27.4	0.0	17.9	0.0	0.0	N/A
88	Qwen3 VL 4B Instruct qwen3-vl-4b-instruct multimodal, vision, multi-input reasoning	Alibaba Cloud / Qwen Team	17.7 Agentic	18.2	32.9	17.7	0.0	85.4
89	Qwen3 VL 4B Thinking qwen3-vl-4b-thinking multimodal, vision, multi-input reasoning	Alibaba Cloud / Qwen Team	17.0 Agentic	20.2	32.9	17.0	0.0	79.3
90	Sarvam-105B sarvam-105b code, programming, tool use	Sarvam AI	16.7 Agentic	41.1	0.0	16.7	10.3	0.0	N/A
91	Mistral Medium 3.5 mistral-medium-3-5 multimodal, vision, multi-input reasoning	Mistral AI	15.4 Agentic	34.6	23.2	15.4	59.0	34.8
92	GPT-5.4 Mini gpt-5.4-mini text, text-to-text, language	OpenAI	15.0 Agentic	51.1	44.5	15.0	20.0	42.7
93	GPT-4o gpt-4o-2024-08-06 multimodal, vision, multi-input reasoning	OpenAI	14.9 Agentic	29.1	39.6	14.9	3.7	31.2
94	DeepSeek-V3.1 deepseek-v3.1 code, programming, tool use	DeepSeek	13.6 Agentic	36.7	0.0	13.6	25.4	0.0	N/A
95	Kimi K2 Instruct kimi-k2-instruct code, programming, tool use	Moonshot AI	13.5 Agentic	23.2	0.0	13.5	14.0	0.0	N/A
96	Nova 2 Lite nova-2-lite multimodal, vision, multi-input reasoning	Amazon	13.0 Agentic	41.1	62.8	13.0	26.0	59.1	$0.3 in / $2.5 out
97	Grok 4 Fast grok-4-fast multimodal, vision, multi-input reasoning	xAI	12.8 Agentic	55.7	0.0	12.8	0.0	0.0	N/A
98	DeepSeek-V3.2 (Thinking) deepseek-reasoner code, programming, tool use	DeepSeek	12.4 Agentic	49.8	0.0	12.4	42.0	0.0	N/A
99	DeepSeek-V3.2 deepseek-v3.2 code, programming, tool use	DeepSeek	12.4 Agentic	55.0	0.0	12.4	42.0	0.0	N/A
100	o3-mini o3-mini code, programming, tool use	OpenAI	11.9 Agentic	25.3	0.0	11.9	11.6	0.0	N/A

Page 5 of 17 · 334 models
