Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.

Explore full leaderboard · Browse model catalog

294 tracked models · 27 providers · 251 benchmarked · 31.8 avg. index


294 models

| Rank | Model | Provider | Tags | Score | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|
| 21 | GPT-5.4 nano (gpt-5.4-nano) | OpenAI | multimodal, vision, multi-input reasoning | 77.4 | 46.1 | 77.4 | 11.0 | 11.2 | 57.2 | $0.2 in / $1.25 out |
| 22 | GPT OSS 20B (gpt-oss-20b) | OpenAI | text, inference | 77.3 | 26.1 | 77.3 | 6.0 | 0.0 | 79.3 | $0.1 in / $0.5 out |
| 23 | Ministral 3 (14B Reasoning 2512) (ministral-14b-latest) | Mistral AI | multimodal, vision, multi-input reasoning | 76.8 | 37.9 | 76.8 | 0.0 | 0.0 | 84.5 | $0.2 in / $0.2 out |
| 24 | GPT-4.1 (gpt-4.1-2025-04-14) | OpenAI | multimodal, vision, multi-input reasoning | 75.4 | 28.8 | 75.4 | 32.8 | 17.7 | 34.6 | $2 in / $8 out |
| 25 | MiniMax M2.1 (minimax-m2.1) | MiniMax | code, programming, tool use | 73.9 | 42.7 | 73.9 | 56.6 | 50.6 | 57.7 | $0.3 in / $1.2 out |
| 26 | MiniMax M2.5 (minimax-m2.5) | MiniMax | code, programming, tool use | 73.9 | 0.0 | 73.9 | 53.0 | 56.3 | 57.7 | $0.3 in / $1.2 out |
| 27 | Mercury 2 (mercury-2) | Inception | code, programming, tool use | 72.5 | 44.6 | 72.5 | 0.0 | 22.3 | 69.2 | $0.25 in / $0.75 out |
| 28 | GPT-5.1 (gpt-5.1-2025-11-13) | OpenAI | multimodal, vision, multi-input reasoning | 71.4 | 65.0 | 71.4 | 0.0 | 57.2 | 31.9 | $1.25 in / $10 out |
| 29 | GPT-5.1 Instant (gpt-5.1-instant-2025-11-12) | OpenAI | multimodal, vision, multi-input reasoning | 71.4 | 65.0 | 71.4 | 0.0 | 57.2 | 31.9 | $1.25 in / $10 out |
| 30 | GPT-5.2 (gpt-5.2-2025-12-11) | OpenAI | multimodal, vision, multi-input reasoning | 71.4 | 76.9 | 71.4 | 50.3 | 72.4 | 26.4 | $1.75 in / $14 out |
| 31 | Qwen2.5 7B Instruct (qwen-2.5-7b-instruct) | Alibaba Cloud / Qwen Team | text, inference | 70.8 | 7.5 | 70.8 | 0.0 | 0.0 | 77.3 | $0.3 in / $0.3 out |
| 32 | o3-mini (o3-mini) | OpenAI | code, programming, tool use | 70.7 | 26.0 | 70.7 | 11.9 | 12.5 | 41.9 | $1.1 in / $4.4 out |
| 33 | o4-mini (o4-mini) | OpenAI | multimodal, vision, multi-input reasoning | 70.7 | 48.8 | 70.7 | 38.2 | 32.7 | 41.9 | $1.1 in / $4.4 out |
| 34 | Nova Lite (nova-lite) | Amazon | multimodal, vision, multi-input reasoning | 69.9 | 13.6 | 69.9 | 0.0 | 0.0 | 86.4 | $0.06 in / $0.24 out |
| 35 | Nova Pro (nova-pro) | Amazon | multimodal, vision, multi-input reasoning | 69.9 | 20.0 | 69.9 | 0.0 | 0.0 | 42.8 | $0.8 in / $3.2 out |
| 36 | Llama 3.2 3B Instruct (llama-3.2-3b-instruct) | Meta | text, inference | 69.0 | 5.3 | 69.0 | 0.0 | 0.0 | 98.8 | $0.01 in / $0.02 out |
| 37 | Claude 3.5 Haiku (claude-3-5-haiku-20241022) | Anthropic | code, programming, tool use | 68.7 | 10.9 | 68.7 | 3.0 | 7.9 | 43.1 | $0.8 in / $4 out |
| 38 | Claude 3 Haiku (claude-3-haiku-20240307) | Anthropic | multimodal, vision, multi-input reasoning | 68.7 | 5.8 | 68.7 | 0.0 | 0.0 | 59.9 | $0.25 in / $1.25 out |
| 39 | Grok-4.1 Fast Non-Reasoning (grok-4-1-fast-non-reasoning) | xAI | multimodal, vision, multi-input reasoning | 68.2 | 0.0 | 68.2 | 0.0 | 0.0 | 67.2 | $0.2 in / $0.5 out |
| 40 | Grok-4.1 Fast Reasoning (grok-4-1-fast-reasoning) | xAI | multimodal, vision, multi-input reasoning | 68.2 | 0.0 | 68.2 | 0.0 | 0.0 | 67.2 | $0.2 in / $0.5 out |
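The in/out prices listed for each model can be turned into a per-request cost estimate. A minimal sketch, assuming the figures are dollars per million tokens (the page does not state the unit, though that convention is standard for model pricing):

```python
def request_cost(in_tokens: int, out_tokens: int,
                 price_in: float, price_out: float) -> float:
    """Dollars for one request, given per-million-token input/output prices."""
    return (in_tokens * price_in + out_tokens * price_out) / 1_000_000

# GPT-5.4 nano is listed at $0.2 in / $1.25 out.
# A request with 10k input tokens and 2k output tokens:
cost = request_cost(10_000, 2_000, 0.2, 1.25)
print(f"${cost:.4f}")  # $0.0045
```

At these rates, output tokens dominate the bill once responses grow past a few thousand tokens, which is why the leaderboard quotes input and output prices separately.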
Page 2 of 15 · 294 models


Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
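One way such a multi-dimensional score can be combined is a weighted mean of the per-dimension scores. The leaderboard does not publish its weighting, so the function and the equal-weight default below are purely illustrative:

```python
def composite(scores: dict, weights: dict = None) -> float:
    """Weighted mean of per-dimension scores (each on a 0-100 scale).

    Equal weights are an assumption; the leaderboard's actual
    aggregation method is not disclosed.
    """
    if weights is None:
        weights = {k: 1.0 for k in scores}  # equal weighting by default
    total = sum(weights[k] for k in scores)
    return sum(scores[k] * weights[k] for k in scores) / total

# Dimension scores for GPT-5.4 nano as listed in the table above.
dims = {"benchmarks": 46.1, "inference": 77.4, "agentic": 11.0,
        "programming": 11.2, "value": 57.2}
print(round(composite(dims), 1))  # 40.6
```

Note this equal-weight mean (40.6) differs from the model's listed score of 77.4, which tracks its strongest dimension; a real aggregation would use a tuned weighting.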
