Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model, ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency. Rankings are updated continuously from published evaluation data.


  • Tracked models: 296
  • Providers: 27
  • Benchmarked: 253
  • Avg. index: 27.4


296 models

| Rank | Model | Model ID | Tags | Provider | Score (Benchmarks) | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 261 | Granite 3.3 8B Instruct | granite-3.3-8b-instruct | multimodal, vision, multi-input reasoning | IBM | 0.0 | 0.0 | 29.2 | 0.0 | 0.0 | 56.3 | $0.5 in / $0.5 out |
| 262 | IBM Granite 4.0 Tiny Preview | granite-4.0-tiny-preview | text, inference | IBM | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 263 | Grok-2 Image 1212 | grok-2-image-1212 | text, inference | xAI | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 264 | Grok-4.1 | grok-4.1-2025-11-17 | multimodal, vision, multi-input reasoning | xAI | 0.0 | 0.0 | 64.8 | 0.0 | 0.0 | 22.7 | $3 in / $15 out |
| 265 | Grok-4.1 Fast Non-Reasoning | grok-4-1-fast-non-reasoning | multimodal, vision, multi-input reasoning | xAI | 0.0 | 0.0 | 68.5 | 0.0 | 0.0 | 67.2 | $0.2 in / $0.5 out |
| 266 | Grok-4.1 Fast Reasoning | grok-4-1-fast-reasoning | multimodal, vision, multi-input reasoning | xAI | 0.0 | 0.0 | 68.5 | 0.0 | 0.0 | 67.2 | $0.2 in / $0.5 out |
| 267 | Grok-4.1 Thinking | grok-4.1-thinking-2025-11-17 | multimodal, vision, multi-input reasoning | xAI | 0.0 | 0.0 | 47.9 | 0.0 | 0.0 | 17.6 | $3 in / $15 out |
| 268 | Grok-4.20 Beta Non-Reasoning | grok-4.20-beta-0309-non-reasoning | multimodal, vision, multi-input reasoning | xAI | 0.0 | 0.0 | 97.3 | 0.0 | 0.0 | 27.6 | $2 in / $6 out |
| 269 | Grok-4.20 Beta Reasoning | grok-4.20-beta-0309-reasoning | multimodal, vision, multi-input reasoning | xAI | 0.0 | 0.0 | 97.3 | 0.0 | 0.0 | 27.6 | $2 in / $6 out |
| 270 | Grok-4.20 Multi-Agent Beta | grok-4.20-multi-agent-beta-0309 | multimodal, vision, multi-input reasoning | xAI | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 271 | Grok-4 Fast Non-Reasoning | grok-4-fast-non-reasoning | multimodal, vision, multi-input reasoning | xAI | 0.0 | 0.0 | 68.5 | 0.0 | 0.0 | 67.2 | $0.2 in / $0.5 out |
| 272 | Grok-4 Fast Reasoning | grok-4-fast-reasoning | multimodal, vision, multi-input reasoning | xAI | 0.0 | 0.0 | 68.5 | 0.0 | 0.0 | 67.2 | $0.2 in / $0.5 out |
| 273 | Grok Code Fast 1 | grok-code-fast-1 | code, programming, tool use | xAI | 0.0 | 0.0 | 47.1 | 0.0 | 38.8 | 49.7 | $0.2 in / $1.5 out |
| 274 | Llama 3.1 Nemotron 70B Instruct | llama-3.1-nemotron-70b-instruct | text, inference | NVIDIA | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 275 | MedGemma 4B IT | medgemma-4b-it | multimodal, vision, multi-input reasoning | Google | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 276 | MiMo-V2-Omni | mimo-v2-omni | multimodal, vision, multi-input reasoning | Xiaomi | 0.0 | 0.0 | 58.2 | 0.0 | 54.4 | 45.1 | $0.4 in / $2 out |
| 277 | MiMo-V2-Pro | mimo-v2-pro | code, programming, tool use | Xiaomi | 0.0 | 0.0 | 84.1 | 0.0 | 65.1 | 36.9 | $1 in / $3 out |
| 278 | MiniMax M2.5 | minimax-m2.5 | code, programming, tool use | MiniMax | 0.0 | 0.0 | 74.5 | 52.2 | 57.4 | 58.1 | $0.3 in / $1.2 out |
| 279 | MiniMax M2.7 | minimax-m2.7 | code, programming, tool use | MiniMax | 0.0 | 0.0 | 51.9 | 44.9 | 40.1 | 55.2 | $0.3 in / $1.2 out |
| 280 | Ministral 3 (14B Base 2512) | ministral-3-14b-base-2512 | multimodal, vision, multi-input reasoning | Mistral AI | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |

Page 14 of 15 · 296 models

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.