Skytells
Addressing the world's greatest challenges with AI. Enterprise research, foundation models, and infrastructure trusted by organizations worldwide since 2012.

© 2012–2026 Skytells, Inc. All rights reserved.

Live rankings

AI Model Leaderboard

Every major AI model ranked across benchmark quality, inference speed, agentic capability, programming aptitude, and cost efficiency — updated continuously from published evaluation data.


  • Tracked models: 294
  • Providers: 27
  • Benchmarked: 251
  • Avg. index: 34.7


294 models

| Rank | Model | Tags | Provider | Score | Benchmarks | Inference | Agentic | Programming | Value | Price |
|---|---|---|---|---|---|---|---|---|---|---|
| 61 | Gemini 1.5 Flash (`gemini-1.5-flash`) | multimodal, vision, multi-input reasoning | Google | 52.6 | 23.2 | 92.1 | 0.0 | 0.0 | 71.9 | $0.15 in / $0.6 out |
| 62 | Grok-4 (`grok-4`) | multimodal, vision, multi-input reasoning | xAI | 52.2 | 52.2 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
| 63 | Claude Sonnet 4.6 (`claude-sonnet-4-6`) | multimodal, vision, multi-input reasoning | Anthropic | 51.7 | 66.1 | 30.1 | 49.6 | 68.9 | 13.2 | $3 in / $15 out |
| 64 | MiMo-V2-Flash (`mimo-v2-flash`) | code, programming, tool use | Xiaomi | 51.5 | 53.7 | 79.8 | 27.2 | 39.3 | 85.9 | $0.1 in / $0.3 out |
| 65 | Qwen3 VL 235B A22B Instruct (`qwen3-vl-235b-a22b-instruct`) | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 50.8 | 37.1 | 66.8 | 56.7 | 0.0 | 49.4 | $0.3 in / $1.5 out |
| 66 | GPT-5.1 Codex High (`gpt-5.1-codex-high`) | multimodal, vision, multi-input reasoning | OpenAI | 50.7 | 61.0 | 48.6 | 0.0 | 0.0 | 25.1 | $1.25 in / $10 out |
| 67 | Grok-3 (`grok-3`) | multimodal, vision, multi-input reasoning | xAI | 50.4 | 59.5 | 51.9 | 0.0 | 0.0 | 22.6 | $3 in / $15 out |
| 68 | Kimi K2 0905 (`kimi-k2-0905`) | text, inference | Moonshot AI | 50.2 | 44.4 | 66.8 | 0.0 | 0.0 | 40.0 | $0.6 in / $2.5 out |
| 69 | Qwen3.5-35B-A3B (`qwen3.5-35b-a3b`) | multimodal, vision, multi-input reasoning | Alibaba Cloud / Qwen Team | 49.5 | 57.2 | 66.8 | 44.3 | 34.4 | 46.4 | $0.25 in / $2 out |
| 70 | GPT-5 (`gpt-5-2025-08-07`) | multimodal, vision, multi-input reasoning | OpenAI | 49.1 | 64.4 | 0.0 | 29.0 | 51.7 | 0.0 | N/A |
| 71 | Gemini 1.5 Flash 8B (`gemini-1.5-flash-8b`) | multimodal, vision, multi-input reasoning | Google | 49.1 | 10.4 | 92.1 | 0.0 | 0.0 | 88.3 | $0.07 in / $0.3 out |
| 72 | Claude Opus 4.1 (`claude-opus-4-1-20250805`) | multimodal, vision, multi-input reasoning | Anthropic | 48.8 | 48.1 | 30.1 | 66.8 | 62.9 | 7.0 | $15 in / $75 out |
| 73 | Claude Opus 4.5 (`claude-opus-4-5-20251101`) | multimodal, vision, multi-input reasoning | Anthropic | 48.6 | 56.3 | 30.1 | 44.2 | 74.2 | 10.6 | $5 in / $25 out |
| 74 | GPT-5 mini (`gpt-5-mini-2025-08-07`) | multimodal, vision, multi-input reasoning | OpenAI | 48.6 | 41.9 | 89.7 | 0.0 | 23.7 | 56.3 | $0.25 in / $2 out |
| 75 | Claude Haiku 4.5 (`claude-haiku-4-5-20251001`) | multimodal, vision, multi-input reasoning | Anthropic | 48.4 | 32.9 | 61.2 | 54.2 | 57.2 | 37.7 | $1 in / $5 out |
| 76 | GPT-5.2 Pro (`gpt-5.2-pro-2025-12-11`) | multimodal, vision, multi-input reasoning | OpenAI | 48.2 | 67.3 | 31.3 | 56.4 | 0.0 | 2.5 | $21 in / $168 out |
| 77 | Grok 4 Fast (`grok-4-fast`) | multimodal, vision, multi-input reasoning | xAI | 48.2 | 58.0 | 68.2 | 15.4 | 0.0 | 67.2 | $0.2 in / $0.5 out |
| 78 | Grok-4.1 (`grok-4.1-2025-11-17`) | multimodal, vision, multi-input reasoning | xAI | 48.2 | 0.0 | 64.2 | 0.0 | 0.0 | 22.6 | $3 in / $15 out |
| 79 | Claude Opus 4 (`claude-opus-4-20250514`) | multimodal, vision, multi-input reasoning | Anthropic | 47.8 | 37.8 | 0.0 | 57.9 | 49.5 | 0.0 | N/A |
| 80 | o1-pro (`o1-pro`) | multimodal, vision, multi-input reasoning | OpenAI | 47.5 | 47.5 | 0.0 | 0.0 | 0.0 | 0.0 | N/A |
Page 4 of 15 · 294 models



Rankings are based on multi-dimensional evaluation across benchmark quality, inference efficiency, and cost-per-output. Scores are updated continuously and may differ from individual third-party benchmarks.
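The note above says the overall index aggregates several evaluation dimensions, but the actual weighting is not published. As a purely hypothetical sketch, a weighted mean over the five per-dimension scores could look like the following; the equal weights here are an assumption and, as the example shows, do not reproduce the published overall values, so the real aggregation must differ.

```python
def overall_index(benchmarks, inference, agentic, programming, value,
                  weights=(0.2, 0.2, 0.2, 0.2, 0.2)):
    """Weighted mean of the five per-dimension scores (each on a 0-100 scale).

    The equal default weights are illustrative only; the leaderboard's
    actual weighting and normalization are not published.
    """
    dims = (benchmarks, inference, agentic, programming, value)
    return round(sum(w * d for w, d in zip(weights, dims)), 1)

# MiMo-V2-Flash's per-dimension scores from the table: 53.7, 79.8, 27.2, 39.3, 85.9.
# Equal weights give 57.2, not the published 51.5; the true weighting is unknown.
print(overall_index(53.7, 79.8, 27.2, 39.3, 85.9))
```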
