Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.


  • Tracked models: 309
  • Providers: 27
  • Benchmarked: 264
  • Avg. index: 13.1

Ranking categories: Overall, Benchmarks, Inference, Agentic, Programming, Value / Price

| Rank | Model | Model ID | Tags | Provider | Score (Value / Price) | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Nemotron 3 Nano (30B A3B) | nemotron-3-nano-30b-a3b | code, programming, tool use | NVIDIA | 100.0 | 44.5 | 41.1 | 3.0 | 4.0 | 100.0 | $0.06 in / $0.24 out |
| 2 | DeepSeek-V4-Flash-Max | deepseek-v4-flash-max | code, programming, tool use | DeepSeek | 98.7 | 58.3 | 89.2 | 47.6 | 44.2 | 98.7 | $0.14 in / $0.28 out |
| 3 | LongCat-Flash-Lite | longcat-flash-lite | code, programming, tool use | Meituan | 96.5 | 23.6 | 74.7 | 30.1 | 24.5 | 96.5 | $0.1 in / $0.4 out |
| 4 | GPT-4.1 nano | gpt-4.1-nano-2025-04-14 | multimodal, vision, multi-input reasoning | OpenAI | 95.9 | 12.2 | 90.8 | 0.0 | 0.0 | 95.9 | $0.1 in / $0.4 out |
| 5 | Step-3.5-Flash | step-3.5-flash | code, programming, tool use | StepFun | 95.0 | 62.8 | 60.4 | 42.0 | 50.6 | 95.0 | $0.1 in / $0.4 out |
| 6 | Gemma 4 26B-A4B | gemma-4-26b-a4b-it | multimodal, vision, multi-input reasoning | Google | 93.7 | 42.3 | 41.1 | 0.0 | 0.0 | 93.7 | $0.13 in / $0.4 out |
| 7 | Gemma 4 31B | gemma-4-31b-it | multimodal, vision, multi-input reasoning | Google | 90.5 | 54.9 | 41.1 | 0.0 | 0.0 | 90.5 | $0.14 in / $0.4 out |
| 8 | GPT OSS 120B | gpt-oss-120b | text, inference | OpenAI | 90.5 | 34.9 | 14.6 | 26.8 | 0.0 | 90.5 | $0.09 in / $0.45 out |
| 9 | Qwen3 VL 8B Instruct | qwen3-vl-8b-instruct | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 87.3 | 8.8 | 41.1 | 26.4 | 0.0 | 87.3 | $0.08 in / $0.5 out |
| 10 | GPT OSS 120B High | gpt-oss-120b-high | multimodal, vision, multi-input reasoning | OpenAI | 83.3 | 44.2 | 53.0 | 0.0 | 0.0 | 83.3 | $0.1 in / $0.5 out |
| 11 | Qwen3 VL 4B Instruct | qwen3-vl-4b-instruct | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 81.0 | 18.9 | 41.1 | 18.8 | 0.0 | 81.0 | $0.1 in / $0.6 out |
| 12 | Mercury 2 | mercury-2 | code, programming, tool use | Inception | 79.7 | 43.4 | 69.0 | 0.0 | 20.3 | 79.7 | $0.25 in / $0.75 out |
| 13 | Qwen3 30B A3B | qwen3-30b-a3b | text, inference | Alibaba Cloud / Qwen Team | 79.5 | 24.6 | 26.0 | 0.0 | 0.0 | 79.5 | $0.1 in / $0.44 out |
| 14 | DeepSeek-V3.2 (Non-thinking) | deepseek-chat | text, inference | DeepSeek | 79.3 | 0.0 | 53.0 | 0.0 | 0.0 | 79.3 | $0.28 in / $0.42 out |
| 15 | Mistral Small 4 | mistral-small-latest | multimodal, vision, multi-input reasoning | Mistral AI | 75.9 | 32.9 | 28.5 | 0.0 | 0.0 | 75.9 | $0.15 in / $0.6 out |
| 16 | Grok-4.1 Fast Non-Reasoning | grok-4-1-fast-non-reasoning | multimodal, vision, multi-input reasoning | xAI | 73.7 | 0.0 | 62.1 | 0.0 | 0.0 | 73.7 | $0.2 in / $0.5 out |
| 17 | Grok-4.1 Fast Reasoning | grok-4-1-fast-reasoning | multimodal, vision, multi-input reasoning | xAI | 73.7 | 0.0 | 62.1 | 0.0 | 0.0 | 73.7 | $0.2 in / $0.5 out |
| 18 | Grok 4 Fast | grok-4-fast | multimodal, vision, multi-input reasoning | xAI | 73.7 | 57.1 | 62.1 | 13.7 | 0.0 | 73.7 | $0.2 in / $0.5 out |
| 19 | Grok-4 Fast Non-Reasoning | grok-4-fast-non-reasoning | multimodal, vision, multi-input reasoning | xAI | 73.7 | 0.0 | 62.1 | 0.0 | 0.0 | 73.7 | $0.2 in / $0.5 out |
| 20 | Grok-4 Fast Reasoning | grok-4-fast-reasoning | multimodal, vision, multi-input reasoning | xAI | 73.7 | 0.0 | 62.1 | 0.0 | 0.0 | 73.7 | $0.2 in / $0.5 out |

Page 1 of 16 · 309 models

Want benchmark charts, model comparison, and pricing analytics? Sign in to access the full interactive leaderboard with deep benchmark breakdowns and model comparison tools.

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
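
Prices on this page take the form `$0.06 in / $0.24 out`, with separate input and output rates. As a rough illustration of how such a listing could be turned into a workload estimate, here is a minimal Python sketch. The page does not state the pricing unit, so the code assumes USD per 1,000,000 tokens. That unit, and the names `Price`, `parse_price`, and `workload_cost`, are hypothetical and not part of any Skytells API.

```python
import re
from dataclasses import dataclass

# ASSUMPTION: the leaderboard does not state the pricing unit.
# This sketch treats each price as USD per 1,000,000 tokens.
TOKENS_PER_PRICE_UNIT = 1_000_000

# Matches listings such as "$0.06 in / $0.24 out".
PRICE_RE = re.compile(r"\$(\d+(?:\.\d+)?) in / \$(\d+(?:\.\d+)?) out")


@dataclass(frozen=True)
class Price:
    """Hypothetical container for one model's listed input/output rates."""
    input_usd: float
    output_usd: float


def parse_price(text: str) -> Price:
    """Parse a price string in the leaderboard's '$X in / $Y out' form."""
    match = PRICE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized price format: {text!r}")
    return Price(float(match.group(1)), float(match.group(2)))


def workload_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a workload under the assumed pricing unit."""
    total = input_tokens * price.input_usd + output_tokens * price.output_usd
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # Prices copied from the leaderboard rows for ranks 1, 2, and 12.
    listed = {
        "Nemotron 3 Nano (30B A3B)": "$0.06 in / $0.24 out",
        "DeepSeek-V4-Flash-Max": "$0.14 in / $0.28 out",
        "Mercury 2": "$0.25 in / $0.75 out",
    }
    for name, text in listed.items():
        cost = workload_cost(parse_price(text), 2_000_000, 500_000)
        print(f"{name}: ${cost:.4f}")
```

Because output tokens cost several times more than input tokens for most listed models, the input-to-output ratio of a workload can change which model is cheapest, so an estimate like this may be more useful than comparing either rate alone.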