Skytells

Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.


Tracked models: 309
Providers: 27
Benchmarked: 264
Avg. index: 13.1


| Rank | Model | Model ID | Tags | Provider | Score (Inference) | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 21 | GPT-5.1 | gpt-5.1-2025-11-13 | multimodal, vision, multi-input reasoning | OpenAI | 66.9 | 65.4 | 66.9 | 0.0 | 55.7 | 33.2 | $1.25 in / $10 out |
| 22 | GPT-5.1 Instant | gpt-5.1-instant-2025-11-12 | multimodal, vision, multi-input reasoning | OpenAI | 66.9 | 65.4 | 66.9 | 0.0 | 55.7 | 33.2 | $1.25 in / $10 out |
| 23 | GPT-5.2 | gpt-5.2-2025-12-11 | multimodal, vision, multi-input reasoning | OpenAI | 66.9 | 75.3 | 66.9 | 44.4 | 70.7 | 27.1 | $1.75 in / $14 out |
| 24 | GPT-5.5 Instant | gpt-5.5-instant | multimodal, vision, multi-input reasoning | OpenAI | 66.9 | 53.2 | 66.9 | 0.0 | 0.0 | 16.0 | $5 in / $30 out |
| 25 | Grok-4.1 Fast Non-Reasoning | grok-4-1-fast-non-reasoning | multimodal, vision, multi-input reasoning | xAI | 62.1 | 0.0 | 62.1 | 0.0 | 0.0 | 73.7 | $0.2 in / $0.5 out |
| 26 | Grok-4.1 Fast Reasoning | grok-4-1-fast-reasoning | multimodal, vision, multi-input reasoning | xAI | 62.1 | 0.0 | 62.1 | 0.0 | 0.0 | 73.7 | $0.2 in / $0.5 out |
| 27 | Grok 4 Fast | grok-4-fast | multimodal, vision, multi-input reasoning | xAI | 62.1 | 57.1 | 62.1 | 13.7 | 0.0 | 73.7 | $0.2 in / $0.5 out |
| 28 | Grok-4 Fast Non-Reasoning | grok-4-fast-non-reasoning | multimodal, vision, multi-input reasoning | xAI | 62.1 | 0.0 | 62.1 | 0.0 | 0.0 | 73.7 | $0.2 in / $0.5 out |
| 29 | Grok-4 Fast Reasoning | grok-4-fast-reasoning | multimodal, vision, multi-input reasoning | xAI | 62.1 | 0.0 | 62.1 | 0.0 | 0.0 | 73.7 | $0.2 in / $0.5 out |
| 30 | Step-3.5-Flash | step-3.5-flash | code, programming, tool use | StepFun | 60.4 | 62.8 | 60.4 | 42.0 | 50.6 | 95.0 | $0.1 in / $0.4 out |
| 31 | Gemini 3.1 Pro | gemini-3.1-pro-preview | multimodal, vision, multi-input reasoning | Google | 59.4 | 73.8 | 59.4 | 68.9 | 66.0 | 18.5 | $2.5 in / $15 out |
| 32 | GPT-5.4 Mini | gpt-5.4-mini | text, text-to-text, language | OpenAI | 56.3 | 56.9 | 56.3 | 22.4 | 27.4 | 32.9 | $0.75 in / $4.5 out |
| 33 | GPT-5.4 nano | gpt-5.4-nano | multimodal, vision, multi-input reasoning | OpenAI | 56.3 | 46.0 | 56.3 | 10.4 | 10.7 | 70.9 | $0.2 in / $1.25 out |
| 34 | Claude Haiku 4.5 | claude-haiku-4-5-20251001 | multimodal, vision, multi-input reasoning | Anthropic | 55.3 | 31.5 | 55.3 | 53.3 | 54.9 | 38.7 | $1 in / $5 out |
| 35 | DeepSeek-V3.2 (Non-thinking) | deepseek-chat | text, inference | DeepSeek | 53.0 | 0.0 | 53.0 | 0.0 | 0.0 | 79.3 | $0.28 in / $0.42 out |
| 36 | GPT OSS 120B High | gpt-oss-120b-high | multimodal, vision, multi-input reasoning | OpenAI | 53.0 | 44.2 | 53.0 | 0.0 | 0.0 | 83.3 | $0.1 in / $0.5 out |
| 37 | Gemini 2.5 Flash | gemini-2.5-flash | multimodal, vision, multi-input reasoning | Google | 51.2 | 38.9 | 51.2 | 0.0 | 21.0 | 46.4 | $0.3 in / $2.5 out |
| 38 | Gemini 2.5 Pro | gemini-2.5-pro | multimodal, vision, multi-input reasoning | Google | 51.2 | 43.4 | 51.2 | 0.0 | 22.9 | 25.2 | $1.25 in / $10 out |
| 39 | GPT-4o | gpt-4o-2024-05-13 | multimodal, vision, multi-input reasoning | OpenAI | 50.5 | 21.6 | 50.5 | 0.0 | 0.0 | 30.1 | $2.5 in / $10 out |
| 40 | GPT-5.3 Chat | gpt-5.3-chat-latest | multimodal, vision, multi-input reasoning | OpenAI | 50.5 | 0.0 | 50.5 | 0.0 | 0.0 | 27.1 | $1.75 in / $14 out |

Page 2 of 16 · 309 models

Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
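
The price strings in the listing, such as `$1.25 in / $10 out`, can be turned into numbers for rough cost comparisons. The sketch below is illustrative only: the page does not state the pricing unit, so the per-million-token default is an assumption, and the names `Price`, `parse_price`, and `estimate_cost` are hypothetical helpers rather than part of any Skytells API.

```python
import re
from dataclasses import dataclass

# Matches strings shaped like "$1.25 in / $10 out" as they appear in the listing.
PRICE_RE = re.compile(
    r"^\$(?P<inp>\d+(?:\.\d+)?) in / \$(?P<out>\d+(?:\.\d+)?) out$"
)


@dataclass(frozen=True)
class Price:
    input_usd: float   # listed input price, in US dollars per pricing unit
    output_usd: float  # listed output price, in US dollars per pricing unit


def parse_price(text: str) -> Price:
    """Parse a listing price string into its input and output components."""
    match = PRICE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized price string: {text!r}")
    return Price(float(match["inp"]), float(match["out"]))


def estimate_cost(
    price: Price,
    input_tokens: int,
    output_tokens: int,
    tokens_per_unit: int = 1_000_000,  # ASSUMPTION: the page does not state the unit
) -> float:
    """Estimate the cost in US dollars of one workload under the assumed unit."""
    return (
        price.input_usd * input_tokens / tokens_per_unit
        + price.output_usd * output_tokens / tokens_per_unit
    )


if __name__ == "__main__":
    gpt_5_1 = parse_price("$1.25 in / $10 out")
    # One million input tokens plus 100,000 output tokens under the assumed unit.
    print(estimate_cost(gpt_5_1, 1_000_000, 100_000))
```

If the listed prices turn out to use a different unit, pass that unit as `tokens_per_unit` rather than changing the parser.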