📅 May 28, 2026 ⏱️ 8 min read 🏷️ Pricing Report

AI API Pricing Report: May 2026

34 models. 10 providers. Prices from $0.075/M to $180/M. Here's the complete state of AI API pricing — and where the deals are.

AI API pricing in 2026 looks nothing like it did a year ago. Budget models now cost less than $0.10 per million tokens. Premium models have halved in price. And the number of available models has exploded to 34 across 10 providers.

This report covers every major AI API model's current pricing, the trends driving prices down, the best deals in each tier, and what to watch for in the months ahead.

34 Models Available
10 Providers
$0.075 Cheapest / 1M tokens
90% Avg price drop since 2023

The Complete Pricing Landscape

Here's every major AI API model ranked by input price. All prices are per 1 million tokens.

Budget Tier (Under $0.60/1M input)

These models handle most everyday tasks — chatbots, classification, summarization, content generation — at rock-bottom prices.

| Model | Provider | Input / 1M | Output / 1M | Context |
| --- | --- | --- | --- | --- |
| Gemini 2.0 Flash Lite | Google | $0.075 | $0.30 | 1M |
| GPT-oss 20B | OpenAI | $0.08 | $0.35 | 128K |
| Llama 3.1 8B | Meta (Together.ai) | $0.10 | $0.10 | 128K |
| Gemini 2.0 Flash | Google | $0.10 | $0.40 | 1M |
| Llama 4 Scout | Meta (Together.ai) | $0.11 | $0.34 | 10M |
| DeepSeek V4 Flash | DeepSeek | $0.14 | $0.28 | 1M |
| GPT-4o mini | OpenAI | $0.15 | $0.60 | 128K |
| GPT-oss 120B | OpenAI | $0.15 | $0.60 | 128K |
| Mistral Small 4 | Mistral | $0.15 | $0.60 | 128K |
| Llama 4 Maverick | Meta (Together.ai) | $0.20 | $0.60 | 10M |
| GPT-5 mini | OpenAI | $0.25 | $2.00 | 272K |
| DeepSeek V3 | DeepSeek | $0.27 | $1.10 | 128K |
| DeepSeek V4 Pro | DeepSeek | $0.44 | $0.87 | 1M |
| Mistral Large 3 | Mistral | $0.50 | $1.50 | 128K |
| Command R | Cohere | $0.50 | $1.50 | 128K |

Mid Tier ($0.60–$3.00/1M input)

The sweet spot for production workloads. These models offer strong reasoning quality at reasonable prices.

| Model | Provider | Input / 1M | Output / 1M | Context |
| --- | --- | --- | --- | --- |
| Llama 3.1 70B | Meta (Together.ai) | $0.88 | $0.88 | 128K |
| Kimi K2.6 | Moonshot | $0.90 | $3.75 | 256K |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | 200K |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | 1M |
| GPT-5 | OpenAI | $1.25 | $10.00 | 272K |
| GPT-5.3 Codex | OpenAI | $1.75 | $14.00 | 400K |
| Gemini 3.1 Pro | Google | $2.00 | $12.00 | 1M |
| Jamba 1.5 Large | AI21 | $2.00 | $8.00 | 256K |
| GPT-4o | OpenAI | $2.50 | $10.00 | 128K |
| Command R+ | Cohere | $2.50 | $10.00 | 128K |
| Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | 1M |
| Claude Sonnet 4 | Anthropic | $3.00 | $15.00 | 200K |
| Grok 3 Mini | xAI | $3.00 | $5.00 | 128K |

Premium Tier ($5.00+/1M input)

For complex reasoning, code generation, and high-stakes tasks where quality matters most.

| Model | Provider | Input / 1M | Output / 1M | Context |
| --- | --- | --- | --- | --- |
| Claude Opus 4.7 | Anthropic | $5.00 | $25.00 | 1M |
| GPT-5.5 | OpenAI | $5.00 | $30.00 | 1M |
| Claude 4 Opus | Anthropic | $15.00 | $75.00 | 200K |
| Grok 3 | xAI | $30.00 | $150.00 | 128K |
| GPT-5.5 Pro | OpenAI | $30.00 | $180.00 | 1M |

Key Trends This Month

1. Budget Models Keep Getting Cheaper

The floor keeps dropping. Gemini 2.0 Flash Lite at $0.075/M is now the cheapest production-ready AI API. That's 7.5 cents per million input tokens — less than a penny for 133,000 tokens of text. A year ago, the cheapest comparable model was $0.15/M.

What this means: if you're building a chatbot, classifier, or content tool that processes high volumes, your input costs can be half what they were 12 months ago, simply by switching to the current cheapest model.
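The "less than a penny" figure is easy to check. Here is a minimal Python sketch using the two input prices quoted above; the function name `input_cost_usd` is illustrative, not part of any provider SDK.

```python
def input_cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost in USD of sending `tokens` input tokens at a per-1M-token price."""
    return tokens / 1_000_000 * price_per_million


# 133,000 input tokens at $0.075/M (Gemini 2.0 Flash Lite)
flash_lite = input_cost_usd(133_000, 0.075)

# The same volume at the $0.15/M floor from a year ago
last_year = input_cost_usd(133_000, 0.15)

print(f"today: ${flash_lite:.6f}, a year ago: ${last_year:.6f}")
```

The first figure comes out just under one cent, and the second is exactly double it.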

2. Context Windows Are the New Battleground

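While price wars grab headlines, the real shift is in context windows. Twelve of the models listed above now offer a context window of 1M tokens or more, and two of them, Llama 4 Scout and Llama 4 Maverick, reach 10M.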

A 1M context window means you can feed an entire codebase, a full legal document, or hours of conversation history into a single API call. This changes what's possible — and it's available at budget prices.
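To put a number on "budget prices", the sketch below estimates what a single call costs when the prompt fills the whole 1M-token window. It uses input prices from the tables above for a handful of 1M-context models. It assumes the flat listed rate applies to the entire prompt; any long-context surcharge a provider may apply is not modeled, and output tokens are ignored. The name `full_context_cost` is hypothetical.

```python
# Input prices in USD per 1M tokens, copied from the pricing tables above.
INPUT_PRICE = {
    "Gemini 2.0 Flash Lite": 0.075,
    "DeepSeek V4 Flash": 0.14,
    "Gemini 2.5 Pro": 1.25,
    "Claude Sonnet 4.6": 3.00,
    "Claude Opus 4.7": 5.00,
}


def full_context_cost(model: str, context_tokens: int = 1_000_000) -> float:
    """Input cost in USD of one call whose prompt is `context_tokens` long.

    Assumption: the listed flat rate applies to every token in the prompt.
    """
    return INPUT_PRICE[model] * context_tokens / 1_000_000


for model in INPUT_PRICE:
    print(f"{model}: ${full_context_cost(model):.3f} per full-context call")
```

Under those assumptions a full-window call ranges from under a dime on the cheapest model to $5 on Claude Opus 4.7, so how often you resend the full context matters as much as which model you pick.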

3. The Premium Tier Is Shrinking

Only 5 models cost $5+/M input. And the quality gap between mid-tier and premium is narrowing. Claude Sonnet 4.6 ($3/$15) and Gemini 3.1 Pro ($2/$12) now handle most tasks that required Opus or GPT-5.5 a few months ago.

The exception: complex multi-step reasoning, code generation in large codebases, and high-stakes analysis still benefit from premium models. But for 80% of production workloads, mid-tier is enough.
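One way to act on this is to route most traffic to a mid-tier model and reserve a premium model for the hard cases. The sketch below estimates the blended per-request cost of such a split using the Claude Sonnet 4.6 and Claude Opus 4.7 prices from the tables above. The 1,000-token request with a 50/50 input/output split and the 80/20 routing share are assumptions for illustration, and `request_cost` and `blended_cost` are hypothetical helper names.

```python
def request_cost(input_price: float, output_price: float,
                 input_tokens: int = 500, output_tokens: int = 500) -> float:
    """USD cost of one request, given prices in USD per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


SONNET_46 = request_cost(3.00, 15.00)  # mid tier
OPUS_47 = request_cost(5.00, 25.00)    # premium tier


def blended_cost(mid_share: float) -> float:
    """Average cost per request when `mid_share` of traffic goes to mid tier."""
    return mid_share * SONNET_46 + (1 - mid_share) * OPUS_47


all_premium = blended_cost(0.0)
routed = blended_cost(0.8)
savings = 1 - routed / all_premium

print(f"all premium:  ${all_premium:.4f}/request")
print(f"80/20 routed: ${routed:.4f}/request ({savings:.0%} cheaper)")
```

With these inputs the routed setup costs about a third less per request than sending everything to the premium model. The saving grows if the mid-tier model is cheaper relative to the premium one.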

4. Open Source Is a Legitimate Option

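Meta's Llama 4 models on Together.ai offer serious competition: Llama 4 Scout costs $0.11/M input and $0.34/M output, Llama 4 Maverick costs $0.20/M input and $0.60/M output, and both come with a 10M-token context window.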

For cost-sensitive applications where you control the prompt engineering, open-source models via Together.ai are hard to beat.

Best Deals by Use Case

| Use Case | Best Model | Why |
| --- | --- | --- |
| Chatbot (high volume) | Gemini 2.0 Flash Lite | Cheapest at $0.075/M, handles most chat tasks |
| Chatbot (quality) | Claude Haiku 4.5 | $1/M with Anthropic's quality |
| Code Generation | Claude Sonnet 4.6 | Best code quality at $3/M, 1M context |
| Document Analysis | Gemini 2.5 Pro | 1M context window at $1.25/M |
| Classification | GPT-4o mini | $0.15/M, fast, reliable for structured output |
| RAG / Retrieval | DeepSeek V4 Flash | $0.14/M with 1M context for long retrieval |
| Content Writing | GPT-5 mini | $0.25/M input, strong writing at budget price |
| Complex Reasoning | Claude Opus 4.7 | Best reasoning quality, worth the $5/M premium |
| Agent / Multi-step | GPT-5 | $1.25/M, strong tool use, 272K context |
| Budget Everything | DeepSeek V4 Pro | $0.44/M with 1M context, best all-around budget pick |
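The price side of these picks can be checked mechanically. The sketch below takes the ten models named in the table, with prices and context sizes from the pricing tables earlier in this report, and returns the cheapest one that meets a minimum context requirement. It ranks on price alone, so it says nothing about quality, which is what the recommendations above also weigh. Context sizes are rounded (128K is treated as 128,000 tokens), and `Model` and `cheapest` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    input_price: float    # USD per 1M input tokens
    output_price: float   # USD per 1M output tokens
    context_tokens: int   # rounded context window size


MODELS = [
    Model("Gemini 2.0 Flash Lite", 0.075, 0.30, 1_000_000),
    Model("DeepSeek V4 Flash", 0.14, 0.28, 1_000_000),
    Model("GPT-4o mini", 0.15, 0.60, 128_000),
    Model("GPT-5 mini", 0.25, 2.00, 272_000),
    Model("DeepSeek V4 Pro", 0.44, 0.87, 1_000_000),
    Model("Claude Haiku 4.5", 1.00, 5.00, 200_000),
    Model("GPT-5", 1.25, 10.00, 272_000),
    Model("Gemini 2.5 Pro", 1.25, 10.00, 1_000_000),
    Model("Claude Sonnet 4.6", 3.00, 15.00, 1_000_000),
    Model("Claude Opus 4.7", 5.00, 25.00, 1_000_000),
]


def cheapest(min_context: int, input_share: float = 0.5) -> Optional[Model]:
    """Cheapest model by blended price with at least `min_context` tokens.

    `input_share` is the fraction of a request's tokens that are input.
    Returns None when no model in MODELS has a large enough window.
    """
    candidates = [m for m in MODELS if m.context_tokens >= min_context]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda m: input_share * m.input_price
        + (1 - input_share) * m.output_price,
    )


balanced = cheapest(128_000)                       # 50/50 input/output
output_heavy = cheapest(128_000, input_share=0.1)  # 90% of tokens are output
too_big = cheapest(2_000_000)                      # no model in this subset

print(balanced.name, "|", output_heavy.name, "|", too_big)
```

For balanced traffic the answer is Gemini 2.0 Flash Lite, but for output-heavy traffic DeepSeek V4 Flash comes out ahead because its output price is lower. The input price alone does not settle the ranking.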

Cost Comparison: What $100/Month Gets You

Here's how far $100 goes at different model tiers, assuming 1,000 tokens per request with a 50/50 input/output split (500 input and 500 output tokens) and a 30-day month for the daily average:

| Tier | Model | Requests for $100 | Daily Average |
| --- | --- | --- | --- |
| Budget | Gemini 2.0 Flash Lite | ~533,000 | ~17,800/day |
| Budget | DeepSeek V4 Flash | ~476,000 | ~15,900/day |
| Budget | GPT-4o mini | ~267,000 | ~8,900/day |
| Mid | Claude Haiku 4.5 | ~33,300 | ~1,100/day |
| Mid | Claude Sonnet 4.6 | ~11,100 | ~370/day |
| Mid | GPT-5 | ~17,800 | ~590/day |
| Premium | Claude Opus 4.7 | ~6,700 | ~220/day |
| Premium | GPT-5.5 | ~5,700 | ~190/day |

The range is staggering: from roughly 17,800 requests/day to about 190 requests/day for the same $100 budget. Choosing the right model tier is the single biggest cost lever you have.
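To run the same comparison for your own workload, here is a sketch of the calculation under the stated assumptions (1,000 tokens per request, 50/50 split), with a 30-day month assumed for the daily figure. The function name `requests_for_budget` and its parameters are illustrative.

```python
def requests_for_budget(budget_usd: float, input_price: float,
                        output_price: float, tokens_per_request: int = 1_000,
                        input_share: float = 0.5) -> float:
    """Number of requests a budget covers, given prices in USD per 1M tokens."""
    input_tokens = tokens_per_request * input_share
    output_tokens = tokens_per_request - input_tokens
    cost_per_request = (
        input_tokens * input_price + output_tokens * output_price
    ) / 1_000_000
    return budget_usd / cost_per_request


DAYS_PER_MONTH = 30  # assumption used for the daily average

deepseek_flash = requests_for_budget(100, 0.14, 0.28)
gpt_4o_mini = requests_for_budget(100, 0.15, 0.60)

print(f"DeepSeek V4 Flash: ~{deepseek_flash:,.0f} requests, "
      f"~{deepseek_flash / DAYS_PER_MONTH:,.0f}/day")
print(f"GPT-4o mini: ~{gpt_4o_mini:,.0f} requests, "
      f"~{gpt_4o_mini / DAYS_PER_MONTH:,.0f}/day")
```

Changing `input_share` is worth trying: output tokens cost several times more than input tokens on most models listed here, so a workload that generates long responses buys far fewer requests than the 50/50 case.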

What to Watch in June 2026

Update: See our June 2026 AI API Pricing Guide for the latest prices, deprecation alerts, and migration recommendations.

Methodology

All pricing data in this report comes from official provider pricing pages, verified as of May 29, 2026. We track 34 models across 10 providers: OpenAI, Anthropic, Google, DeepSeek, Mistral, Cohere, Meta (via Together.ai), Moonshot, xAI, and AI21.

Prices are per 1 million tokens unless otherwise noted. Context window sizes reflect the maximum supported by each model. Some providers offer batch pricing or committed-use discounts not reflected here.

