Skytells

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.


  • Tracked models: 294
  • Providers: 27
  • Benchmarked: 251
  • Avg. index: 31.8


| Rank | Model | Model ID | Tags | Provider | Score (Inference) | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 61 | Qwen3 VL 30B A3B Instruct | qwen3-vl-30b-a3b-instruct | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 66.8 | 28.7 | 66.8 | 23.6 | 0.0 | 63.3 | $0.2 in / $0.7 out |
| 62 | Qwen3 VL 30B A3B Thinking | qwen3-vl-30b-a3b-thinking | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 66.8 | 35.5 | 66.8 | 21.3 | 0.0 | 60.0 | $0.2 in / $1 out |
| 63 | Qwen3 VL 4B Instruct | qwen3-vl-4b-instruct | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 66.8 | 19.7 | 66.8 | 19.5 | 0.0 | 70.6 | $0.1 in / $0.6 out |
| 64 | Qwen3 VL 4B Thinking | qwen3-vl-4b-thinking | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 66.8 | 23.1 | 66.8 | 18.9 | 0.0 | 60.6 | $0.1 in / $1 out |
| 65 | Qwen3 VL 8B Instruct | qwen3-vl-8b-instruct | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 66.8 | 9.8 | 66.8 | 26.7 | 0.0 | 75.6 | $0.08 in / $0.5 out |
| 66 | Qwen3 VL 8B Thinking | qwen3-vl-8b-thinking | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 66.8 | 35.9 | 66.8 | 23.5 | 0.0 | 45.6 | $0.18 in / $2.09 out |
| 67 | Gemini 1.5 Pro | gemini-1.5-pro | multimodal, vision, multi-input reasoning | Google | 65.5 | 27.8 | 65.5 | 0.0 | 0.0 | 24.6 | $2.5 in / $10 out |
| 68 | Jamba 1.5 Mini | jamba-1.5-mini | text, inference | AI21 Labs | 65.2 | 4.8 | 65.2 | 0.0 | 0.0 | 72.4 | $0.2 in / $0.4 out |
| 69 | Devstral Medium | devstral-medium-2507 | code, programming, tool use | Mistral AI | 64.5 | 0.0 | 64.5 | 0.0 | 24.7 | 53.2 | $0.4 in / $2 out |
| 70 | Devstral Small 1.1 | devstral-small-2507 | code, programming, tool use | Mistral AI | 64.5 | 0.0 | 64.5 | 0.0 | 15.0 | 85.0 | $0.1 in / $0.3 out |
| 71 | Mistral Small 3.1 24B Base | mistral-small-3.1-24b-base-2503 | multimodal, vision, multi-input reasoning | Mistral AI | 64.5 | 13.5 | 64.5 | 0.0 | 0.0 | 85.0 | $0.1 in / $0.3 out |
| 72 | Grok-4.1 | grok-4.1-2025-11-17 | multimodal, vision, multi-input reasoning | xAI | 64.2 | 0.0 | 64.2 | 0.0 | 0.0 | 22.6 | $3 in / $15 out |
| 73 | ChatGPT-4o Latest | chatgpt-4o-latest | multimodal, vision, multi-input reasoning | OpenAI | 63.5 | 56.6 | 63.5 | 0.0 | 0.0 | 32.0 | $2.5 in / $10 out |
| 74 | Gemini 2.0 Flash-Lite | gemini-2.0-flash-lite | multimodal, vision, multi-input reasoning | Google | 63.2 | 25.7 | 63.2 | 0.0 | 0.0 | 79.7 | $0.07 in / $0.3 out |
| 75 | Gemini 2.5 Flash | gemini-2.5-flash | multimodal, vision, multi-input reasoning | Google | 63.2 | 40.1 | 63.2 | 0.0 | 23.4 | 42.6 | $0.3 in / $2.5 out |
| 76 | Gemini 2.5 Pro | gemini-2.5-pro | multimodal, vision, multi-input reasoning | Google | 63.2 | 44.6 | 63.2 | 0.0 | 25.6 | 27.9 | $1.25 in / $10 out |
| 77 | Gemini 2.5 Pro Preview 06-05 | gemini-2.5-pro-preview-06-05 | multimodal, vision, multi-input reasoning | Google | 63.2 | 51.7 | 63.2 | 0.0 | 30.0 | 27.9 | $1.25 in / $10 out |
| 78 | Step-3.5-Flash | step-3.5-flash | code, programming, tool use | StepFun | 63.2 | 62.3 | 63.2 | 45.3 | 53.0 | 82.1 | $0.1 in / $0.4 out |
| 79 | GPT-5.1 Medium | gpt-5.1-medium-2025-11-12 | multimodal, vision, multi-input reasoning | OpenAI | 61.6 | 63.6 | 61.6 | 0.0 | 0.0 | 29.0 | $1.25 in / $10 out |
| 80 | GPT-5 Medium | gpt-5-medium-2025-08-07 | multimodal, vision, multi-input reasoning | OpenAI | 61.6 | 56.9 | 61.6 | 0.0 | 0.0 | 29.0 | $1.25 in / $10 out |

Page 4 of 15 · 294 models

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
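
The price strings listed on this page follow the pattern `$X in / $Y out`. For readers who want to compare them numerically, here is a minimal Python sketch that parses such a string into two numbers. The names `Price`, `parse_price`, and `output_to_input_ratio` are hypothetical helpers introduced for illustration and are not part of any Skytells API. The page does not state the pricing unit, so the sketch treats the values as plain dollar amounts.

```python
import re
from typing import NamedTuple


class Price(NamedTuple):
    """Hypothetical container for one price string from the listing."""
    input_usd: float   # the "$X in" amount
    output_usd: float  # the "$Y out" amount


# Matches strings such as "$0.2 in / $0.7 out" or "$3 in / $15 out".
_PRICE_RE = re.compile(r"^\$(\d+(?:\.\d+)?) in / \$(\d+(?:\.\d+)?) out$")


def parse_price(text: str) -> Price:
    """Parse a '$X in / $Y out' string; raise ValueError on any other shape."""
    match = _PRICE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized price string: {text!r}")
    return Price(float(match.group(1)), float(match.group(2)))


def output_to_input_ratio(price: Price) -> float:
    """Return how many times more the output amount is than the input amount."""
    return price.output_usd / price.input_usd


if __name__ == "__main__":
    for raw in ["$0.2 in / $0.7 out", "$1.25 in / $10 out"]:
        parsed = parse_price(raw)
        print(raw, "->", parsed, "ratio:", round(output_to_input_ratio(parsed), 2))
```

Rejecting unrecognized strings with `ValueError` rather than returning a default keeps a layout change on the page from silently producing wrong numbers.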