Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.


Tracked models: 294
Providers: 27
Benchmarked: 251
Avg. index: 30.7

Overall · Benchmarks · Inference · Agentic · Programming · Value / Price

294 models

| Rank | Model | Model ID | Tags | Provider | Score (Value / Price) | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 41 | GPT OSS 120B High | gpt-oss-120b-high | multimodal, vision, multi-input reasoning | OpenAI | 73.2 | 44.9 | 57.3 | 0.0 | 0.0 | 73.2 | $0.1 in / $0.5 out |
| 42 | Pixtral-12B | pixtral-12b-2409 | multimodal, vision, multi-input reasoning | Mistral AI | 72.9 | 8.1 | 7.1 | 0.0 | 0.0 | 72.9 | $0.15 in / $0.15 out |
| 43 | Jamba 1.5 Mini | jamba-1.5-mini | text, inference | AI21 Labs | 72.4 | 4.8 | 65.2 | 0.0 | 0.0 | 72.4 | $0.2 in / $0.4 out |
| 44 | GLM-4.7-Flash | glm-4.7-flash | code, programming, tool use | Zhipu AI | 72.2 | 38.5 | 29.1 | 12.0 | 21.2 | 72.2 | $0.07 in / $0.4 out |
| 45 | Gemini 1.5 Flash | gemini-1.5-flash | multimodal, vision, multi-input reasoning | Google | 71.9 | 23.2 | 92.1 | 0.0 | 0.0 | 71.9 | $0.15 in / $0.6 out |
| 46 | Llama 3.1 70B Instruct | llama-3.1-70b-instruct | text, inference | Meta | 71.9 | 11.3 | 20.9 | 0.0 | 0.0 | 71.9 | $0.2 in / $0.2 out |
| 47 | Llama 3.3 70B Instruct | llama-3.3-70b-instruct | text, inference | Meta | 71.9 | 19.8 | 20.9 | 0.0 | 0.0 | 71.9 | $0.2 in / $0.2 out |
| 48 | Qwen3 VL 4B Instruct | qwen3-vl-4b-instruct | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 70.6 | 19.7 | 66.8 | 19.5 | 0.0 | 70.6 | $0.1 in / $0.6 out |
| 49 | DeepSeek-V3.2 (Non-thinking) | deepseek-chat | text, inference | DeepSeek | 70.1 | 0.0 | 57.3 | 0.0 | 0.0 | 70.1 | $0.28 in / $0.42 out |
| 50 | DeepSeek-V3.2 | deepseek-v3.2 | code, programming, tool use | DeepSeek | 70.0 | 58.1 | 52.5 | 16.6 | 45.9 | 70.0 | $0.26 in / $0.38 out |
| 51 | Mercury 2 | mercury-2 | code, programming, tool use | Inception | 69.2 | 44.6 | 72.5 | 0.0 | 22.3 | 69.2 | $0.25 in / $0.75 out |
| 52 | Grok-4.1 Fast Non-Reasoning | grok-4-1-fast-non-reasoning | multimodal, vision, multi-input reasoning | xAI | 67.2 | 0.0 | 68.2 | 0.0 | 0.0 | 67.2 | $0.2 in / $0.5 out |
| 53 | Grok-4.1 Fast Reasoning | grok-4-1-fast-reasoning | multimodal, vision, multi-input reasoning | xAI | 67.2 | 0.0 | 68.2 | 0.0 | 0.0 | 67.2 | $0.2 in / $0.5 out |
| 54 | Grok 4 Fast | grok-4-fast | multimodal, vision, multi-input reasoning | xAI | 67.2 | 58.0 | 68.2 | 15.4 | 0.0 | 67.2 | $0.2 in / $0.5 out |
| 55 | Grok-4 Fast Non-Reasoning | grok-4-fast-non-reasoning | multimodal, vision, multi-input reasoning | xAI | 67.2 | 0.0 | 68.2 | 0.0 | 0.0 | 67.2 | $0.2 in / $0.5 out |
| 56 | Grok-4 Fast Reasoning | grok-4-fast-reasoning | multimodal, vision, multi-input reasoning | xAI | 67.2 | 0.0 | 68.2 | 0.0 | 0.0 | 67.2 | $0.2 in / $0.5 out |
| 57 | Mistral Small 4 | mistral-small-latest | multimodal, vision, multi-input reasoning | Mistral AI | 66.9 | 34.8 | 55.9 | 0.0 | 0.0 | 66.9 | $0.15 in / $0.6 out |
| 58 | DeepSeek R1 Distill Llama 70B | deepseek-r1-distill-llama-70b | text, inference | DeepSeek | 66.7 | 29.0 | 16.1 | 0.0 | 0.0 | 66.7 | $0.1 in / $0.4 out |
| 59 | GPT-4o mini | gpt-4o-mini-2024-07-18 | multimodal, vision, multi-input reasoning | OpenAI | 65.1 | 14.9 | 44.7 | 0.0 | 0.0 | 65.1 | $0.15 in / $0.6 out |
| 60 | Grok-3 Mini | grok-3-mini | multimodal, vision, multi-input reasoning | xAI | 65.0 | 53.4 | 51.9 | 0.0 | 0.0 | 65.0 | $0.3 in / $0.5 out |

Page 3 of 15 · 294 models

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
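
For readers who want to work with the listed prices programmatically, the following is a minimal sketch that parses a price string such as `$0.1 in / $0.5 out` into two numbers. The names `Price` and `parse_price` are hypothetical and not part of any Skytells API. The page does not state the unit behind each price, so the sketch makes no assumption about it.

```python
import re
from typing import NamedTuple


class Price(NamedTuple):
    """Input and output price as shown on the leaderboard (unit not stated there)."""
    input_usd: float
    output_usd: float


# Matches strings of the form "$0.1 in / $0.5 out".
_PRICE_RE = re.compile(r"\$(\d+(?:\.\d+)?) in / \$(\d+(?:\.\d+)?) out")


def parse_price(text: str) -> Price:
    """Parse a leaderboard price string into a Price tuple.

    Raises ValueError if the string does not follow the expected format.
    """
    match = _PRICE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized price string: {text!r}")
    return Price(float(match.group(1)), float(match.group(2)))


if __name__ == "__main__":
    # Example rows taken from the listing above.
    for raw in ("$0.1 in / $0.5 out", "$0.07 in / $0.4 out"):
        print(raw, "->", parse_price(raw))
```

Keeping the two prices as separate fields, rather than collapsing them into one blended figure, avoids guessing at how the leaderboard weighs input against output cost.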