State of LLM API Pricing — May 2026

TL;DR

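  • Among the cheapest: Gemini 2.0 Flash at $0.10/$0.40 per 1M tokens (in our table, only GPT-oss 20B at $0.08/$0.35 lists a lower input price)
  • Best value premium: Gemini 2.5 Pro at $1.25/$10, 12x cheaper than Claude 4 Opus on input and 7.5x cheaper on output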
  • Biggest surprise: DeepSeek V4 Pro at $0.44/$0.87 — near-budget pricing with premium capabilities
  • Biggest price hike: Grok 3 jumped 10x to $30/$150 per 1M tokens
  • Price range: 675x difference between cheapest and most expensive models

The LLM Pricing Landscape in May 2026

The AI API market has never been more competitive — or more confusing. With 33 models across 10 providers, developers face a dizzying array of pricing structures, context windows, and capability tiers. This report cuts through the noise with hard numbers.

We track every major LLM API provider and verify pricing against official documentation. Here's what the data tells us about the state of AI API pricing in May 2026.

Key Finding

The gap between the cheapest and most expensive model is 675x. Choosing the wrong model for your use case could cost you hundreds of dollars per month unnecessarily. The average developer can save 40%+ by switching to a better-fit model.

Complete Pricing Table — All 33 Models

Sorted by input price per 1M tokens, cheapest first.

#   Provider   Model              Input / 1M  Output / 1M  Context  Tier
1   OpenAI     GPT-oss 20B        $0.08       $0.35        128K     Budget
2   Google     Gemini 2.0 Flash   $0.10       $0.40        1M       Budget
3   Meta       Llama 3.1 8B       $0.10       $0.10        128K     Budget
4   Meta       Llama 4 Scout      $0.11       $0.34        10M      Budget
5   DeepSeek   V4 Flash           $0.14       $0.28        1M       Budget
6   OpenAI     GPT-4o mini        $0.15       $0.60        128K     Budget
7   OpenAI     GPT-oss 120B       $0.15       $0.60        128K     Budget
8   Mistral    Small 4            $0.15       $0.60        128K     Budget
9   Meta       Llama 4 Maverick   $0.20       $0.60        10M      Budget
10  DeepSeek   V3                 $0.27       $1.10        128K     Budget
11  OpenAI     GPT-5 mini         $0.40       $1.60        256K     Budget
12  DeepSeek   V4 Pro             $0.44       $0.87        1M       Budget
13  Mistral    Large 3            $0.50       $1.50        128K     Budget
14  Cohere     Command R          $0.50       $1.50        128K     Budget
15  Meta       Llama 3.1 70B      $0.88       $0.88        128K     Mid
16  Moonshot   Kimi K2.6          $0.90       $3.75        256K     Budget
17  Anthropic  Claude Haiku 4.5   $1.00       $5.00        200K     Budget
18  Google     Gemini 2.5 Pro     $1.25       $10.00       1M       Mid
19  OpenAI     GPT-5              $1.25       $10.00       272K     Premium
20  OpenAI     GPT-5.3 Codex      $1.75       $14.00       400K     Mid
21  AI21       Jamba 1.5 Large    $2.00       $8.00        256K     Mid
22  Google     Gemini 3.1 Pro     $2.00       $12.00       1M       Mid
23  OpenAI     GPT-4o             $2.50       $10.00       128K     Mid
24  Cohere     Command R+         $2.50       $10.00       128K     Mid
25  Anthropic  Claude Sonnet 4    $3.00       $15.00       200K     Mid
26  Anthropic  Claude Sonnet 4.6  $3.00       $15.00       1M       Mid
27  xAI        Grok 3 Mini        $3.00       $5.00        128K     Mid
28  OpenAI     GPT-5.5            $5.00       $30.00       1M       Premium
29  Anthropic  Claude Opus 4.7    $5.00       $25.00       1M       Premium
30  Anthropic  Claude 4 Opus      $15.00      $75.00       200K     Premium
31  xAI        Grok 3             $30.00      $150.00      128K     Premium
32  OpenAI     GPT-5.5 Pro        $30.00      $180.00      1M       Premium
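
Which model counts as "cheapest" depends on the metric. The sketch below ranks a handful of rows copied from the table, first by input price and then by input plus output price. The `Price` class and `rank` helper are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    model: str
    input_per_m: float   # dollars per 1M input tokens
    output_per_m: float  # dollars per 1M output tokens


# A few rows copied from the pricing table (not the full list).
PRICES = [
    Price("Gemini 2.0 Flash", 0.10, 0.40),
    Price("Llama 3.1 8B", 0.10, 0.10),
    Price("Llama 4 Scout", 0.11, 0.34),
    Price("DeepSeek V4 Flash", 0.14, 0.28),
    Price("GPT-oss 20B", 0.08, 0.35),
]


def rank(prices, key):
    """Return model names ordered from cheapest to most expensive under `key`."""
    return [p.model for p in sorted(prices, key=key)]


by_input = rank(PRICES, key=lambda p: p.input_per_m)
by_total = rank(PRICES, key=lambda p: p.input_per_m + p.output_per_m)

print("Lowest input price:", by_input[0])
print("Lowest input + output price:", by_total[0])
```

The two rankings put different models first, so it is worth ranking on the price that matches your own input-to-output token mix.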

Use the Calculator

These per-token prices don't tell the full story. Your actual monthly cost depends on your token counts and request volume. Use our free calculator to see what you'd actually pay across all 33 models.
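
As a rough sketch of that arithmetic, monthly cost is the per-request token cost multiplied by request volume. The function name and the example workload (1,000 input tokens, 500 output tokens, 100,000 requests per month) are illustrative assumptions, not measurements; the prices come from the table above.

```python
def monthly_cost(input_tokens, output_tokens, requests_per_month,
                 input_per_m, output_per_m):
    """Estimated monthly spend in dollars.

    input_tokens / output_tokens: average tokens per request.
    input_per_m / output_per_m: price in dollars per 1M tokens.
    """
    per_request = (input_tokens * input_per_m
                   + output_tokens * output_per_m) / 1_000_000
    return per_request * requests_per_month


# Hypothetical workload: 1,000 input + 500 output tokens, 100,000 requests/month.
workload = dict(input_tokens=1_000, output_tokens=500, requests_per_month=100_000)

for name, inp, out in [("GPT-4o mini", 0.15, 0.60), ("GPT-4o", 2.50, 10.00)]:
    cost = monthly_cost(input_per_m=inp, output_per_m=out, **workload)
    print(f"{name}: ${cost:,.2f}")
# GPT-4o mini: $45.00
# GPT-4o: $750.00
```

The same workload differs by more than 16x between these two models, which is why the per-token price alone is a poor guide.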

Provider Breakdown

Google — Best Budget Options

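Google is a strong presence in the budget tier with Gemini 2.0 Flash at $0.10/$0.40, one of the cheapest models from any major provider; in our table only GPT-oss 20B ($0.08/$0.35) lists a lower input price. Its 1M token context window is among the largest in the budget tier, exceeded only by the 10M windows of Llama 4 Scout and Llama 4 Maverick. Gemini 2.5 Pro ($1.25/$10) offers the best value in the mid-tier, with capabilities comparable to models costing 10x more.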

DeepSeek — The Disruptor

DeepSeek V4 Pro at $0.44/$0.87 is the pricing story of 2026. This model delivers near-premium capabilities at budget prices, with a 1M context window. Against GPT-4o ($2.50/$10), it is 82% cheaper on input and 91% cheaper on output for similar tasks. DeepSeek V4 Flash ($0.14/$0.28) is even cheaper for simpler workloads.

OpenAI — Premium Positioning

OpenAI has firmly moved upmarket. GPT-5 ($1.25/$10) and GPT-5.5 ($5/$30) are premium-priced. However, GPT-4o mini ($0.15/$0.60) remains competitive in the budget tier, and the new GPT-oss models ($0.08-$0.15 per 1M input tokens) offer open-weight alternatives at rock-bottom prices.

Anthropic — Quality at a Price

Claude 4 Opus ($15/$75) is the third most expensive model, behind GPT-5.5 Pro and Grok 3, a price justified by its reasoning capabilities. The sweet spot is Claude Haiku 4.5 ($1.00/$5.00), recently updated from 3.5 and offering better quality at budget-tier pricing. Claude Sonnet 4 ($3/$15) is the popular mid-range choice.

xAI — Caution: Price Hike

Grok 3 underwent a 10x price increase in May 2026, jumping to $30/$150 per 1M tokens. This makes it the second most expensive model after GPT-5.5 Pro. Grok 3 Mini ($3/$5) remains reasonable for those who need xAI's ecosystem.

Mistral — Quietly Competitive

Mistral Large 3 dropped 75% to $0.50/$1.50, moving it into our budget tier and making it one of the best values there. Mistral Small 4 ($0.15/$0.60) matches GPT-4o mini pricing with a European data sovereignty angle.

Tier Analysis: Which Model for Your Use Case?

Budget Tier ($0.08 - $1.00/1M input)

For high-volume applications like chatbots, content generation, and data extraction, the budget tier offers the best value:

  • Best Overall: Gemini 2.0 Flash, $0.10 per 1M input tokens
  • Best Value: DeepSeek V4 Pro, $0.44 per 1M input tokens (premium quality, budget price)
  • Most Popular: GPT-4o mini, $0.15 per 1M input tokens (reliable, well-documented)

Mid Tier ($1.00 - $5.00/1M input)

For applications requiring strong reasoning, code generation, or complex analysis:

  • Best Value: Gemini 2.5 Pro, $1.25 per 1M input tokens (1M context, strong reasoning)
  • Most Capable: Claude Sonnet 4, $3.00 per 1M input tokens (excellent for code & analysis)
  • Newest: Gemini 3.1 Pro, $2.00 per 1M input tokens (1M context, latest model)

Premium Tier ($5.00+/1M input)

For tasks requiring the highest quality reasoning, creative writing, or complex problem-solving:

  • Best Reasoning: Claude Opus 4.7, $5.00 per 1M input tokens (1M context, top reasoning)
  • Most Expensive: GPT-5.5 Pro, $30.00 per 1M input tokens (for the most complex tasks)
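
The price bands in the three tier headings can be sketched as a simple lookup. This is a price-only approximation: the function name is hypothetical, and the handling of the boundaries ($1.00 counted as Budget, $5.00 as Premium) is our assumption, based on where Claude Haiku 4.5 and Claude Opus 4.7 sit in the table. The table's tiers also weigh capabilities, so a few rows (for example GPT-5 at $1.25, tagged Premium) will not match this rule.

```python
def tier_by_input_price(input_per_m: float) -> str:
    """Classify a model by input price (dollars per 1M tokens) only.

    Assumed boundaries: up to $1.00 is Budget, below $5.00 is Mid,
    $5.00 and above is Premium.
    """
    if input_per_m <= 1.00:
        return "Budget"
    if input_per_m < 5.00:
        return "Mid"
    return "Premium"


for model, price in [
    ("DeepSeek V4 Pro", 0.44),
    ("Claude Haiku 4.5", 1.00),
    ("Gemini 2.5 Pro", 1.25),
    ("Claude Opus 4.7", 5.00),
    ("GPT-5.5 Pro", 30.00),
]:
    print(f"{model}: {tier_by_input_price(price)}")
```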

5 Biggest Pricing Changes in May 2026

  1. Grok 3: 10x price increase — From $3/$15 to $30/$150. The biggest single-model price hike we've tracked.
  2. DeepSeek V4 Pro: 75% discount — Now $0.44/$0.87, making it the best value in the market for capable models.
  3. Mistral Large 3: 75% drop — From $2/$6 to $0.50/$1.50. A major repositioning toward budget pricing.
  4. Claude Haiku 3.5 → 4.5 — Updated model with better capabilities at $1.00/$5.00 (was $0.80/$4.00).
  5. Gemini 3 Pro → 3.1 Pro — Renamed and repriced at $2.00/$12.00 per 1M tokens.
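
For the three changes above that list both an old and a new price, the percentage moves can be checked with a short calculation. The helper name is ours; the sketch uses input prices, and the output prices quoted above move by the same percentages.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new (positive means a price increase)."""
    return (new - old) / old * 100


# (model, old input price, new input price) in dollars per 1M tokens,
# taken from the list above.
changes = [
    ("Grok 3", 3.00, 30.00),
    ("Mistral Large 3", 2.00, 0.50),
    ("Claude Haiku 3.5 -> 4.5", 0.80, 1.00),
]

for model, old, new in changes:
    print(f"{model}: {pct_change(old, new):+.0f}%")
# Grok 3: +900%
# Mistral Large 3: -75%
# Claude Haiku 3.5 -> 4.5: +25%
```

A 10x increase is a +900% change, which is why the two figures for Grok 3 describe the same move.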

Cost Optimization Strategies

Based on our analysis of 33 models, here are the top ways to reduce your AI API costs:

Strategy 1: Route by Complexity

Use budget models (GPT-4o mini, Gemini Flash) for the roughly 80% of requests that are simple. Reserve premium models (Claude Opus, GPT-5) for the 20% that need advanced reasoning. Compared with sending every request to a premium model, this alone can cut costs by 60-80%.
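
A minimal sketch of the arithmetic behind that range, assuming every request has the same size (1,000 input and 500 output tokens, a hypothetical workload) and that routing itself is free and always picks correctly. The function names are ours; the prices are the GPT-4o mini and GPT-5 rows from the table.

```python
def request_cost(input_tokens, output_tokens, input_per_m, output_per_m):
    """Cost of one request in dollars, given prices per 1M tokens."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


def routed_savings(budget_share, budget_cost, premium_cost):
    """Fraction saved versus sending every request to the premium model."""
    blended = budget_share * budget_cost + (1 - budget_share) * premium_cost
    return 1 - blended / premium_cost


budget = request_cost(1_000, 500, 0.15, 0.60)    # GPT-4o mini
premium = request_cost(1_000, 500, 1.25, 10.00)  # GPT-5

print(f"Savings with an 80/20 split: {routed_savings(0.8, budget, premium):.0%}")
# Savings with an 80/20 split: 74%
```

The saving can never exceed the share of traffic moved to the budget model, so an 80/20 split tops out at 80%; the closer the budget model's price is to the premium model's, the further below that ceiling the result lands.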

Strategy 2: Consider DeepSeek

DeepSeek V4 Pro at $0.44/$0.87 delivers capabilities comparable to GPT-4o ($2.50/$10) at 82% lower input cost and 91% lower output cost. If your workload doesn't require OpenAI's ecosystem, this is the biggest savings opportunity.
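
How much a switch saves depends on your mix of input and output tokens. The sketch below compares the two price pairs quoted above for an input-only, an output-only, and a hypothetical 1,000-input/500-output workload; the helper name and the workload are illustrative assumptions.

```python
GPT_4O = (2.50, 10.00)          # (input, output) dollars per 1M tokens
DEEPSEEK_V4_PRO = (0.44, 0.87)  # (input, output) dollars per 1M tokens


def switch_savings(input_tokens, output_tokens, old, new):
    """Fraction saved by moving a workload from price pair `old` to `new`."""
    old_cost = input_tokens * old[0] + output_tokens * old[1]
    new_cost = input_tokens * new[0] + output_tokens * new[1]
    return 1 - new_cost / old_cost


print(f"Input only:       {switch_savings(1, 0, GPT_4O, DEEPSEEK_V4_PRO):.0%}")
print(f"Output only:      {switch_savings(0, 1, GPT_4O, DEEPSEEK_V4_PRO):.0%}")
print(f"1,000 in/500 out: {switch_savings(1_000, 500, GPT_4O, DEEPSEEK_V4_PRO):.0%}")
# Input only:       82%
# Output only:      91%
# 1,000 in/500 out: 88%
```

Any real workload falls between the input-only and output-only figures, with output-heavy workloads saving more.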

Strategy 3: Optimize Token Usage

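Reduce input tokens by trimming system prompts and context. Reduce output tokens by setting max_tokens limits. Because billing is linear in tokens, a 30% reduction in both input and output tokens yields a 30% cost saving on any model; trimming only one side saves proportionally less.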
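
To see how the saving depends on which side is trimmed, here is a small sketch using GPT-4o's prices ($2.50/$10) and a hypothetical request of 1,000 input and 500 output tokens. The function name and workload are assumptions made for illustration.

```python
def trim_savings(input_tokens, output_tokens, input_cut, output_cut,
                 input_per_m, output_per_m):
    """Fraction of cost saved when input/output tokens shrink by the given cuts.

    input_cut and output_cut are fractions, e.g. 0.3 for a 30% reduction.
    """
    before = input_tokens * input_per_m + output_tokens * output_per_m
    after = (input_tokens * (1 - input_cut) * input_per_m
             + output_tokens * (1 - output_cut) * output_per_m)
    return 1 - after / before


gpt_4o = dict(input_per_m=2.50, output_per_m=10.00)

print(f"Both sides -30%:  {trim_savings(1_000, 500, 0.3, 0.3, **gpt_4o):.0%}")
print(f"Input only -30%:  {trim_savings(1_000, 500, 0.3, 0.0, **gpt_4o):.0%}")
print(f"Output only -30%: {trim_savings(1_000, 500, 0.0, 0.3, **gpt_4o):.0%}")
# Both sides -30%:  30%
# Input only -30%:  10%
# Output only -30%: 20%
```

Because output tokens are priced higher than input tokens on most models in the table, trimming output usually pays off more per token removed.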

Watch Out: Grok 3

If you're using Grok 3, review your costs immediately. The 10x price increase means a workload that cost $50/month now costs $500/month. Consider switching to Grok 3 Mini ($3/$5) or an alternative provider.

Methodology

All pricing data in this report is sourced from official provider documentation and verified against published pricing pages. Data was last verified on May 3, 2026. Prices shown are per 1M tokens for standard API access (not batch, not fine-tuning, not enterprise agreements).

Context windows reflect the maximum advertised context length. Actual performance may vary at maximum context. "Budget," "Mid," and "Premium" tiers are our classifications based on pricing and capabilities.

Subscribe for Monthly Updates

This report is updated monthly as pricing changes. Subscribe to get notified when we publish the next edition, or when any provider changes their pricing.
