Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Major AI models ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency, updated continuously from published evaluation data.

  • Tracked models: 294
  • Providers: 27
  • Benchmarked: 251
  • Avg. index: 11.4

| Rank | Model | Model ID | Tags | Provider | Score (Agentic) | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 61 | Qwen3-235B-A22B-Thinking-2507 | qwen3-235b-a22b-thinking-2507 | text, inference | Alibaba Cloud / Qwen Team | 26.8 | 46.9 | 66.8 | 26.8 | 0.0 | 39.4 | $0.3 in / $3 out |
| 62 | Qwen3 VL 8B Instruct | qwen3-vl-8b-instruct | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 26.7 | 9.8 | 66.8 | 26.7 | 0.0 | 75.6 | $0.08 in / $0.5 out |
| 63 | GLM-4.5-Air | glm-4.5-air | code, programming, tool use | Zhipu AI | 24.9 | 28.1 | 0.0 | 24.9 | 20.2 | 0.0 | N/A |
| 64 | Qwen3 VL 30B A3B Instruct | qwen3-vl-30b-a3b-instruct | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 23.6 | 28.7 | 66.8 | 23.6 | 0.0 | 63.3 | $0.2 in / $0.7 out |
| 65 | Qwen3 VL 8B Thinking | qwen3-vl-8b-thinking | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 23.5 | 35.9 | 66.8 | 23.5 | 0.0 | 45.6 | $0.18 in / $2.09 out |
| 66 | Qwen3 VL 30B A3B Thinking | qwen3-vl-30b-a3b-thinking | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 21.3 | 35.5 | 66.8 | 21.3 | 0.0 | 60.0 | $0.2 in / $1 out |
| 67 | MiniMax M1 80K | minimax-m1-80k | code, programming, tool use | MiniMax | 20.9 | 24.6 | 84.9 | 20.9 | 19.4 | 41.7 | $0.55 in / $2.2 out |
| 68 | o3 | o3-2025-04-16 | multimodal, vision, multi-input reasoning | OpenAI | 20.5 | 46.2 | 38.4 | 20.5 | 30.7 | 27.7 | $2 in / $8 out |
| 69 | Qwen3 VL 4B Instruct | qwen3-vl-4b-instruct | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 19.5 | 19.7 | 66.8 | 19.5 | 0.0 | 70.6 | $0.1 in / $0.6 out |
| 70 | Qwen3 VL 4B Thinking | qwen3-vl-4b-thinking | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 18.9 | 23.1 | 66.8 | 18.9 | 0.0 | 60.6 | $0.1 in / $1 out |
| 71 | Sarvam-105B | sarvam-105b | code, programming, tool use | Sarvam AI | 18.8 | 43.2 | 0.0 | 18.8 | 12.4 | 0.0 | N/A |
| 72 | Qwen3-Next-80B-A3B-Instruct | qwen3-next-80b-a3b-instruct | text, inference | Alibaba Cloud / Qwen Team | 17.9 | 29.7 | 6.1 | 17.9 | 0.0 | 51.9 | $0.15 in / $1.5 out |
| 73 | Qwen3.6-35B-A3B | qwen3.6-35b-a3b | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 17.7 | 55.7 | 0.0 | 17.7 | 26.6 | 0.0 | N/A |
| 74 | DeepSeek-V3.2 (Thinking) | deepseek-reasoner | code, programming, tool use | DeepSeek | 16.6 | 53.1 | 0.0 | 16.6 | 45.9 | 0.0 | N/A |
| 75 | DeepSeek-V3.2 | deepseek-v3.2 | code, programming, tool use | DeepSeek | 16.6 | 58.1 | 52.5 | 16.6 | 45.9 | 70.0 | $0.26 in / $0.38 out |
| 76 | Grok 4 Fast | grok-4-fast | multimodal, vision, multi-input reasoning | xAI | 15.4 | 58.0 | 68.2 | 15.4 | 0.0 | 67.2 | $0.2 in / $0.5 out |
| 77 | DeepSeek-V3.1 | deepseek-v3.1 | code, programming, tool use | DeepSeek | 15.3 | 38.7 | 40.2 | 15.3 | 28.7 | 58.9 | $0.27 in / $1 out |
| 78 | GPT-4o | gpt-4o-2024-08-06 | multimodal, vision, multi-input reasoning | OpenAI | 14.9 | 31.6 | 45.9 | 14.9 | 4.4 | 26.8 | $2.5 in / $10 out |
| 79 | Kimi K2 Instruct | kimi-k2-instruct | code, programming, tool use | Moonshot AI | 14.8 | 24.9 | 46.6 | 14.8 | 15.3 | 61.7 | $0.5 in / $0.5 out |
| 80 | GLM-4.7-Flash | glm-4.7-flash | code, programming, tool use | Zhipu AI | 12.0 | 38.5 | 29.1 | 12.0 | 21.2 | 72.2 | $0.07 in / $0.4 out |
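
Because every row carries a separate score per dimension, the same data can be re-ranked by any single column. The following is a minimal sketch of that idea, not the leaderboard's own code: the `Row` class and the `rank_by` helper are hypothetical names introduced for illustration, and the four rows are copied from ranks 67, 68, 75, and 77 above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One leaderboard row, reduced to the per-dimension scores."""
    model: str
    benchmarks: float
    inference: float
    agentic: float
    programming: float
    value: float


# Values copied from the listing above (ranks 67, 68, 75, 77).
ROWS = [
    Row("MiniMax M1 80K", 24.6, 84.9, 20.9, 19.4, 41.7),
    Row("o3", 46.2, 38.4, 20.5, 30.7, 27.7),
    Row("DeepSeek-V3.2", 58.1, 52.5, 16.6, 45.9, 70.0),
    Row("DeepSeek-V3.1", 38.7, 40.2, 15.3, 28.7, 58.9),
]


def rank_by(rows, dimension):
    """Return model names sorted by one dimension, highest score first."""
    ordered = sorted(rows, key=lambda r: getattr(r, dimension), reverse=True)
    return [r.model for r in ordered]


if __name__ == "__main__":
    for dim in ("agentic", "programming", "value"):
        print(dim, rank_by(ROWS, dim))
```

Sorting by `agentic` reproduces the order shown on this page for these four models, while sorting by `programming` or `value` moves DeepSeek-V3.2 to the top.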

Page 4 of 15 · 294 models

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
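
To compare listed prices on a concrete workload, the price strings can be parsed and applied to token counts. This is a rough sketch under a stated assumption: the page does not say what unit the prices refer to, so the code assumes USD per one million tokens and exposes that as the `unit` parameter. The names `parse_price` and `estimate_cost` are hypothetical helpers, not part of any Skytells API.

```python
import re

# Matches strings of the form "$0.3 in / $3 out" as they appear in the listing.
PRICE_RE = re.compile(r"^\$(\d+(?:\.\d+)?) in / \$(\d+(?:\.\d+)?) out$")


def parse_price(text):
    """Return (input_price, output_price) as floats, or None for 'N/A'."""
    match = PRICE_RE.match(text.strip())
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def estimate_cost(text, input_tokens, output_tokens, unit=1_000_000):
    """Estimate the cost of one workload.

    ASSUMPTION: prices are USD per `unit` tokens. The source page does not
    state the unit, so adjust `unit` if the actual pricing basis differs.
    """
    price = parse_price(text)
    if price is None:
        return None  # no listed price, so no estimate
    in_price, out_price = price
    return (input_tokens * in_price + output_tokens * out_price) / unit


if __name__ == "__main__":
    for label, price in [("o3", "$2 in / $8 out"),
                         ("DeepSeek-V3.2", "$0.26 in / $0.38 out"),
                         ("GLM-4.5-Air", "N/A")]:
        print(label, estimate_cost(price, 1_000_000, 500_000))
```

Returning `None` for models listed as N/A keeps unpriced entries from being treated as free in a cost comparison.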
