Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.

Tracked models: 296
Providers: 27
Benchmarked: 253
Avg. index: 11.5

| Rank | Model | ID | Tags | Provider | Score | Benchmarks | Inference | Agentic | Programming | Value | Price (per input / output) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 201 | K-EXAONE-236B-A23B | k-exaone-236b-a23b | multimodal, vision, multi-input reasoning | LG AI Research | 0.0 (Agentic) | 43.4 | 24.9 | 0.0 | 0.0 | 49.2 | $0.6 in / $1 out |
| 202 | Kimi-k1.5 | kimi-k1.5 | multimodal, vision, multi-input reasoning | Moonshot AI | 0.0 (Agentic) | 35.3 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 203 | Kimi K2 0905 | kimi-k2-0905 | text, inference | Moonshot AI | 0.0 (Agentic) | 44.0 | 66.0 | 0.0 | 0.0 | 40.1 | $0.6 in / $2.5 out |
| 204 | Kimi K2 Base | kimi-k2-base | text, inference | Moonshot AI | 0.0 (Agentic) | 26.9 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 205 | Llama 3.1 405B Instruct | llama-3.1-405b-instruct | text, inference | Meta | 0.0 (Agentic) | 20.0 | 21.4 | 0.0 | 0.0 | 44.5 | $0.89 in / $0.89 out |
| 206 | Llama 3.1 70B Instruct | llama-3.1-70b-instruct | text, inference | Meta | 0.0 (Agentic) | 11.2 | 21.4 | 0.0 | 0.0 | 72.2 | $0.2 in / $0.2 out |
| 207 | Llama 3.1 8B Instruct | llama-3.1-8b-instruct | text, inference | Meta | 0.0 (Agentic) | 3.2 | 26.7 | 0.0 | 0.0 | 83.9 | $0.03 in / $0.03 out |
| 208 | Llama 3.1 Nemotron 70B Instruct | llama-3.1-nemotron-70b-instruct | text, inference | NVIDIA | 0.0 (Agentic) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 209 | Llama 3.1 Nemotron Nano 8B V1 | llama-3.1-nemotron-nano-8b-v1 | text, inference | NVIDIA | 0.0 (Agentic) | 16.3 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 210 | Llama 3.1 Nemotron Ultra 253B v1 | llama-3.1-nemotron-ultra-253b-v1 | text, inference | NVIDIA | 0.0 (Agentic) | 35.4 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 211 | Llama 3.2 11B Instruct | llama-3.2-11b-instruct | multimodal, vision, multi-input reasoning | Meta | 0.0 (Agentic) | 4.0 | 60.3 | 0.0 | 0.0 | 94.9 | $0.05 in / $0.05 out |
| 212 | Llama 3.2 3B Instruct | llama-3.2-3b-instruct | text, inference | Meta | 0.0 (Agentic) | 5.2 | 68.9 | 0.0 | 0.0 | 98.8 | $0.01 in / $0.02 out |
| 213 | Llama 3.2 90B Instruct | llama-3.2-90b-instruct | multimodal, vision, multi-input reasoning | Meta | 0.0 (Agentic) | 16.3 | 11.3 | 0.0 | 0.0 | 54.9 | $0.35 in / $0.4 out |
| 214 | Llama 3.3 70B Instruct | llama-3.3-70b-instruct | text, inference | Meta | 0.0 (Agentic) | 19.6 | 21.4 | 0.0 | 0.0 | 72.2 | $0.2 in / $0.2 out |
| 215 | Llama-3.3 Nemotron Super 49B v1 | llama-3.3-nemotron-super-49b-v1 | text, inference | NVIDIA | 0.0 (Agentic) | 23.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 216 | Llama 4 Maverick | llama-4-maverick | multimodal, vision, multi-input reasoning | Meta | 0.0 (Agentic) | 35.4 | 55.8 | 0.0 | 0.0 | 57.1 | $0.17 in / $0.85 out |
| 217 | Llama 4 Scout | llama-4-scout | multimodal, vision, multi-input reasoning | Meta | 0.0 (Agentic) | 29.0 | 62.1 | 0.0 | 0.0 | 78.1 | $0.08 in / $0.3 out |
| 218 | LongCat-Flash-Thinking | longcat-flash-thinking | code, programming, tool use | Meituan | 0.0 (Agentic) | 50.2 | 0.0 | 0.0 | 21.6 | 0.0 | N/A |
| 219 | Magistral Medium | magistral-medium | multimodal, vision, multi-input reasoning | Mistral AI | 0.0 (Agentic) | 22.2 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 220 | Magistral Small 2506 | magistral-small-2506 | text, inference | Mistral AI | 0.0 (Agentic) | 24.5 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |

Page 11 of 15 · 296 models

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
