Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.

294 tracked models · 27 providers · 251 benchmarked · 30.7 avg. index

| Rank | Model | Model ID | Tags | Provider | Score (Value / Price) | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 21 | Gemma 3 4B | gemma-3-4b-it | multimodal, vision, multi-input reasoning | Google | 81.8 | 4.6 | 19.9 | 0.0 | 0.0 | 81.8 | $0.02 in / $0.04 out |
| 22 | Qwen2.5-Coder 32B Instruct | qwen-2.5-coder-32b-instruct | text, inference | Alibaba Cloud / Qwen Team | 81.2 | 0.0 | 20.9 | 0.0 | 0.0 | 81.2 | $0.09 in / $0.09 out |
| 23 | Gemma 3 12B | gemma-3-12b-it | multimodal, vision, multi-input reasoning | Google | 80.5 | 9.3 | 19.9 | 0.0 | 0.0 | 80.5 | $0.05 in / $0.1 out |
| 24 | Mistral Small 3 24B Instruct | mistral-small-24b-instruct-2501 | text, inference | Mistral AI | 80.5 | 14.4 | 20.7 | 0.0 | 0.0 | 80.5 | $0.07 in / $0.14 out |
| 25 | Gemini 2.0 Flash-Lite | gemini-2.0-flash-lite | multimodal, vision, multi-input reasoning | Google | 79.7 | 25.7 | 63.2 | 0.0 | 0.0 | 79.7 | $0.07 in / $0.3 out |
| 26 | Phi-4-multimodal-instruct | phi-4-multimodal-instruct | multimodal, vision, multi-input reasoning | Microsoft | 79.7 | 8.8 | 11.8 | 0.0 | 0.0 | 79.7 | $0.05 in / $0.1 out |
| 27 | DeepSeek-V2.5 | deepseek-v2.5 | code, programming, tool use | DeepSeek | 79.5 | 0.0 | 45.6 | 0.0 | 0.9 | 79.5 | $0.14 in / $0.28 out |
| 28 | GPT OSS 20B | gpt-oss-20b | text, inference | OpenAI | 79.3 | 26.1 | 77.3 | 6.0 | 0.0 | 79.3 | $0.1 in / $0.5 out |
| 29 | Gemma 4 26B-A4B | gemma-4-26b-a4b-it | multimodal, vision, multi-input reasoning | Google | 77.8 | 43.7 | 66.8 | 0.0 | 0.0 | 77.8 | $0.13 in / $0.4 out |
| 30 | Qwen2.5 7B Instruct | qwen-2.5-7b-instruct | text, inference | Alibaba Cloud / Qwen Team | 77.3 | 7.5 | 70.8 | 0.0 | 0.0 | 77.3 | $0.3 in / $0.3 out |
| 31 | Mistral NeMo Instruct | mistral-nemo-instruct-2407 | text, inference | Mistral AI | 77.0 | 0.0 | 20.9 | 0.0 | 0.0 | 77.0 | $0.15 in / $0.15 out |
| 32 | Phi-3.5-mini-instruct | phi-3.5-mini-instruct | multimodal, vision, multi-input reasoning | Microsoft | 77.0 | 2.7 | 10.3 | 0.0 | 0.0 | 77.0 | $0.1 in / $0.1 out |
| 33 | Phi 4 | phi-4 | text, inference | Microsoft | 76.9 | 15.8 | 8.5 | 0.0 | 0.0 | 76.9 | $0.07 in / $0.14 out |
| 34 | Gemma 4 31B | gemma-4-31b-it | multimodal, vision, multi-input reasoning | Google | 76.7 | 56.5 | 66.8 | 0.0 | 0.0 | 76.7 | $0.14 in / $0.4 out |
| 35 | GPT OSS 120B | gpt-oss-120b | text, inference | OpenAI | 76.7 | 36.6 | 34.9 | 26.8 | 0.0 | 76.7 | $0.09 in / $0.45 out |
| 36 | Qwen3 30B A3B | qwen3-30b-a3b | text, inference | Alibaba Cloud / Qwen Team | 76.6 | 25.8 | 36.4 | 0.0 | 0.0 | 76.6 | $0.1 in / $0.3 out |
| 37 | Ministral 8B Instruct | ministral-8b-instruct-2410 | text, inference | Mistral AI | 76.0 | 0.0 | 7.1 | 0.0 | 0.0 | 76.0 | $0.1 in / $0.1 out |
| 38 | DeepSeek R1 Distill Qwen 32B | deepseek-r1-distill-qwen-32b | text, inference | DeepSeek | 75.6 | 26.8 | 16.1 | 0.0 | 0.0 | 75.6 | $0.12 in / $0.18 out |
| 39 | Qwen3 VL 8B Instruct | qwen3-vl-8b-instruct | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 75.6 | 9.8 | 66.8 | 26.7 | 0.0 | 75.6 | $0.08 in / $0.5 out |
| 40 | Gemma 3 27B | gemma-3-27b-it | multimodal, vision, multi-input reasoning | Google | 73.6 | 11.0 | 19.9 | 0.0 | 0.0 | 73.6 | $0.1 in / $0.2 out |

Page 2 of 15 · 294 models

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
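
Each listed price is a pair of rates, one for input and one for output, so comparing models on cost requires some assumption about the mix of the two. The sketch below parses price strings in the `$X in / $Y out` form used on this page and blends them into a single figure. The helper names (`parse_price`, `blended_price`) and the default 75% input share are hypothetical choices for illustration, and the page does not state the unit behind the prices, so the result is only meaningful relative to other entries. This is not the leaderboard's own Value formula, which the page does not publish.

```python
import re

# Matches price strings exactly as they appear on the page,
# e.g. "$0.02 in / $0.04 out".
PRICE_RE = re.compile(r"\$(\d+(?:\.\d+)?) in / \$(\d+(?:\.\d+)?) out")


def parse_price(text):
    """Return (input_rate, output_rate) as floats from a price string."""
    match = PRICE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized price string: {text!r}")
    return float(match.group(1)), float(match.group(2))


def blended_price(text, input_share=0.75):
    """Weighted average of the input and output rates.

    input_share is an assumed fraction of traffic billed at the input
    rate (hypothetical default; adjust it to your own workload).
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    rate_in, rate_out = parse_price(text)
    return input_share * rate_in + (1.0 - input_share) * rate_out


# A few rows copied from the table on this page.
prices = {
    "Gemma 3 4B": "$0.02 in / $0.04 out",
    "GPT OSS 20B": "$0.1 in / $0.5 out",
    "Qwen2.5 7B Instruct": "$0.3 in / $0.3 out",
}

# List the sample models from cheapest to most expensive blended rate.
for name in sorted(prices, key=lambda n: blended_price(prices[n])):
    print(f"{name}: {blended_price(prices[name]):.3f}")
```

Changing `input_share` can reorder models whose input and output rates differ widely, which is one reason a single cost ranking may not match a given workload.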