Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.
**296** tracked models · **27** providers · **253** benchmarked · **34.7** avg. index
| Rank | Model | Provider | Tags | Score | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|
| 101 | Gemini 2.5 Pro Preview 06-05 (`gemini-2.5-pro-preview-06-05`) | Google | multimodal, vision, multi-input reasoning | 44.2 | 51.2 | 62.8 | 0.0 | 29.3 | 27.6 | $1.25 in / $10 out |
| 102 | Qwen3 VL 235B A22B Thinking (`qwen3-vl-235b-a22b-thinking`) | Alibaba Cloud / Qwen Team | multimodal, vision, multi-input reasoning | 44.2 | 37.7 | 66.0 | 40.2 | 0.0 | 37.4 | $0.45 in / $3.49 out |
| 103 | Nova Lite (`nova-lite`) | Amazon | multimodal, vision, multi-input reasoning | 44.0 | 13.5 | 70.5 | 0.0 | 0.0 | 86.7 | $0.06 in / $0.24 out |
| 104 | Grok Code Fast 1 (`grok-code-fast-1`) | xAI | code, programming, tool use | 44.0 | 0.0 | 47.7 | 0.0 | 38.8 | 49.7 | $0.2 in / $1.5 out |
| 105 | Devstral Medium (`devstral-medium-2507`) | Mistral AI | code, programming, tool use | 43.8 | 0.0 | 64.8 | 0.0 | 24.2 | 53.4 | $0.4 in / $2 out |
| 106 | Qwen3-Coder 480B A35B Instruct (`qwen3-coder-480b-a35b-instruct`) | Alibaba Cloud / Qwen Team | code, programming, tool use | 43.6 | 0.0 | 0.0 | 50.7 | 35.8 | 0.0 | N/A |
| 107 | Qwen3-235B-A22B-Thinking-2507 (`qwen3-235b-a22b-thinking-2507`) | Alibaba Cloud / Qwen Team | text, inference | 43.5 | 46.4 | 66.0 | 26.8 | 0.0 | 39.6 | $0.3 in / $3 out |
| 108 | GPT-5.4 Mini (`gpt-5.4-mini`) | OpenAI | text, text-to-text, language | 43.3 | 56.8 | 76.5 | 23.8 | 28.1 | 32.4 | $0.75 in / $4.5 out |
| 109 | Mistral NeMo Instruct (`mistral-nemo-instruct-2407`) | Mistral AI | text, inference | 42.9 | 0.0 | 21.4 | 0.0 | 0.0 | 77.3 | $0.15 in / $0.15 out |
| 110 | GPT-5.3 Chat (`gpt-5.3-chat-latest`) | OpenAI | multimodal, vision, multi-input reasoning | 42.6 | 0.0 | 52.7 | 0.0 | 0.0 | 26.5 | $1.75 in / $14 out |
| 111 | LongCat-Flash-Chat (`longcat-flash-chat`) | Meituan | code, programming, tool use | 42.4 | 27.9 | 52.7 | 49.2 | 39.1 | 57.9 | $0.3 in / $1.2 out |
| 112 | Mistral Small 3.1 24B Base (`mistral-small-3.1-24b-base-2503`) | Mistral AI | multimodal, vision, multi-input reasoning | 42.0 | 13.4 | 64.8 | 0.0 | 0.0 | 85.3 | $0.1 in / $0.3 out |
| 113 | GLM-4.6 (`glm-4.6`) | Zhipu AI | multimodal, vision, multi-input reasoning | 41.8 | 46.5 | 34.5 | 37.3 | 45.7 | 42.9 | $0.55 in / $2.19 out |
| 114 | Llama 3.2 3B Instruct (`llama-3.2-3b-instruct`) | Meta | text, inference | 41.4 | 5.2 | 68.9 | 0.0 | 0.0 | 98.8 | $0.01 in / $0.02 out |
| 115 | Qwen3 235B A22B (`qwen3-235b-a22b`) | Alibaba Cloud / Qwen Team | multimodal, vision, multi-input reasoning | 41.3 | 30.5 | 33.5 | 0.0 | 0.0 | 84.0 | $0.1 in / $0.1 out |
| 116 | Command R+ (`command-r-plus-04-2024`) | Cohere | text, inference | 41.3 | 0.0 | 32.5 | 0.0 | 0.0 | 55.4 | $0.25 in / $1 out |
| 117 | LongCat-Flash-Lite (`longcat-flash-lite`) | Meituan | code, programming, tool use | 41.1 | 24.5 | 83.6 | 29.5 | 25.1 | 83.1 | $0.1 in / $0.4 out |
| 118 | DeepSeek-V3.2-Exp (`deepseek-v3.2-exp`) | DeepSeek | code, programming, tool use | 41.0 | 52.3 | 0.0 | 28.6 | 40.1 | 0.0 | N/A |
| 119 | GPT-5.2 Codex (`gpt-5.2-codex`) | OpenAI | multimodal, vision, multi-input reasoning | 40.6 | 0.0 | 49.0 | 0.0 | 44.1 | 19.6 | $1.75 in / $14 out |
| 120 | Gemini 2.5 Pro (`gemini-2.5-pro`) | Google | multimodal, vision, multi-input reasoning | 40.4 | 44.2 | 62.8 | 0.0 | 25.0 | 27.6 | $1.25 in / $10 out |
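The Price column lists separate input and output rates, which makes models hard to compare directly. One common way to collapse them into a single figure is a blended rate under an assumed traffic mix. A minimal sketch, assuming the listed prices are USD per million tokens (the page does not state the unit) and a hypothetical 75/25 input/output split; the function name and ratio are illustrative, not part of the leaderboard's methodology:

```python
def blended_price(price_in: float, price_out: float, out_ratio: float = 0.25) -> float:
    """Blend input/output prices into one figure per million tokens.

    Assumes a 75/25 input/output token mix (hypothetical default);
    adjust out_ratio for chattier or more generation-heavy workloads.
    """
    return price_in * (1 - out_ratio) + price_out * out_ratio

# Example: Gemini 2.5 Pro Preview 06-05 at $1.25 in / $10 out
print(blended_price(1.25, 10.0))  # 3.4375
```

Under this mix, a model with cheap input but expensive output (e.g. $0.3 in / $3 out) can cost more per blended token than one with flatter pricing.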
Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
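The note above describes the overall score as a multi-dimensional composite, but the actual weighting is not published. A minimal sketch of one plausible scheme: an equal-weight mean over the axes a model was actually rated on, treating 0.0 sub-scores as "not evaluated". The `composite` function and the skip-zeros rule are assumptions for illustration only:

```python
from typing import Mapping

# The five published axes; equal weights are a guess, not the site's formula.
AXES = ("benchmarks", "inference", "agentic", "programming", "value")

def composite(scores: Mapping[str, float]) -> float:
    """Equal-weight mean over the non-zero (i.e. evaluated) axes."""
    rated = [scores[a] for a in AXES if scores.get(a, 0.0) > 0.0]
    return round(sum(rated) / len(rated), 1) if rated else 0.0

# GLM-4.6, which has scores on all five axes
print(composite({"benchmarks": 46.5, "inference": 34.5,
                 "agentic": 37.3, "programming": 45.7, "value": 42.9}))  # 41.4
```

This simple mean gives 41.4 for GLM-4.6 against a published overall of 41.8, so the real weighting evidently differs; the sketch only illustrates the shape of the calculation.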