AI API Pricing Report: May 2026
34 models. 10 providers. Prices from $0.075/M to $180/M. Here's the complete state of AI API pricing — and where the deals are.
AI API pricing in 2026 looks nothing like it did a year ago. Budget models now cost less than $0.10 per million tokens. Premium models have halved in price. And the number of available models has exploded to 34 across 10 providers.
This report covers every major AI API model's current pricing, the trends driving prices down, the best deals in each tier, and what to watch for in the months ahead.
The Complete Pricing Landscape
Here's every major AI API model ranked by input price. All prices are per 1 million tokens.
Budget Tier (Under $0.60/1M input)
These models handle most everyday tasks — chatbots, classification, summarization, content generation — at rock-bottom prices.
| Model | Provider | Input / 1M | Output / 1M | Context |
|---|---|---|---|---|
| Gemini 2.0 Flash Lite | Google | $0.075 | $0.30 | 1M |
| GPT-oss 20B | OpenAI | $0.08 | $0.35 | 128K |
| Llama 3.1 8B | Meta (Together.ai) | $0.10 | $0.10 | 128K |
| Gemini 2.0 Flash | Google | $0.10 | $0.40 | 1M |
| Llama 4 Scout | Meta (Together.ai) | $0.11 | $0.34 | 10M |
| DeepSeek V4 Flash | DeepSeek | $0.14 | $0.28 | 1M |
| GPT-4o mini | OpenAI | $0.15 | $0.60 | 128K |
| GPT-oss 120B | OpenAI | $0.15 | $0.60 | 128K |
| Mistral Small 4 | Mistral | $0.15 | $0.60 | 128K |
| Llama 4 Maverick | Meta (Together.ai) | $0.20 | $0.60 | 10M |
| GPT-5 mini | OpenAI | $0.25 | $2.00 | 272K |
| DeepSeek V3 | DeepSeek | $0.27 | $1.10 | 128K |
| DeepSeek V4 Pro | DeepSeek | $0.44 | $0.87 | 1M |
| Mistral Large 3 | Mistral | $0.50 | $1.50 | 128K |
| Command R | Cohere | $0.50 | $1.50 | 128K |
Mid Tier ($0.60–$3.00/1M input)
The sweet spot for production workloads. These models offer strong reasoning quality at reasonable prices.
| Model | Provider | Input / 1M | Output / 1M | Context |
|---|---|---|---|---|
| Llama 3.1 70B | Meta (Together.ai) | $0.88 | $0.88 | 128K |
| Kimi K2.6 | Moonshot | $0.90 | $3.75 | 256K |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | 200K |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | 1M |
| GPT-5 | OpenAI | $1.25 | $10.00 | 272K |
| GPT-5.3 Codex | OpenAI | $1.75 | $14.00 | 400K |
| Gemini 3.1 Pro | Google | $2.00 | $12.00 | 1M |
| Jamba 1.5 Large | AI21 | $2.00 | $8.00 | 256K |
| GPT-4o | OpenAI | $2.50 | $10.00 | 128K |
| Command R+ | Cohere | $2.50 | $10.00 | 128K |
| Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | 1M |
| Claude Sonnet 4 | Anthropic | $3.00 | $15.00 | 200K |
| Grok 3 Mini | xAI | $3.00 | $5.00 | 128K |
Premium Tier ($5.00+/1M input)
For complex reasoning, code generation, and high-stakes tasks where quality matters most.
| Model | Provider | Input / 1M | Output / 1M | Context |
|---|---|---|---|---|
| Claude Opus 4.7 | Anthropic | $5.00 | $25.00 | 1M |
| GPT-5.5 | OpenAI | $5.00 | $30.00 | 1M |
| Claude 4 Opus | Anthropic | $15.00 | $75.00 | 200K |
| Grok 3 | xAI | $30.00 | $150.00 | 128K |
| GPT-5.5 Pro | OpenAI | $30.00 | $180.00 | 1M |
Key Trends This Month
1. Budget Models Keep Getting Cheaper
The floor keeps dropping. Gemini 2.0 Flash Lite at $0.075/M is now the cheapest production-ready AI API. That's 7.5 cents per million input tokens — less than a penny for 133,000 tokens of text. A year ago, the cheapest comparable model was $0.15/M.
What this means: if you're building a chatbot, classifier, or content tool that processes high volumes, the cheapest model available today costs half of what the cheapest comparable model did 12 months ago.
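The arithmetic behind these figures is easy to check. The sketch below is illustrative only: the function name `input_cost_usd` is hypothetical, and the two prices are the ones quoted in this section.

```python
def input_cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost in USD to send `tokens` input tokens at a per-1M-token price."""
    return tokens * price_per_million / 1_000_000

# Gemini 2.0 Flash Lite input price quoted above ($ per 1M tokens).
FLASH_LITE_INPUT = 0.075
# Cheapest comparable input price a year ago, per this report.
YEAR_AGO_INPUT = 0.15

today = input_cost_usd(133_000, FLASH_LITE_INPUT)
last_year = input_cost_usd(133_000, YEAR_AGO_INPUT)

print(f"133,000 tokens today:      ${today:.6f}")
print(f"133,000 tokens a year ago: ${last_year:.6f}")
print(f"Today's cost as a share of last year's: {today / last_year:.0%}")
```

Swapping in your own monthly token volume for `133_000` gives a quick estimate of input spend at either price.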
2. Context Windows Are the New Battleground
While price wars grab headlines, the real shift is in context windows. Twelve of the models listed above now offer 1M+ token context:
- Llama 4 Scout and Llama 4 Maverick: 10M tokens, the largest context windows available
- Gemini 2.0 Flash and Flash Lite: 1M, at budget prices
- Gemini 2.5 Pro, Gemini 3.1 Pro: 1M
- Claude Opus 4.7, Sonnet 4.6: 1M
- GPT-5.5, GPT-5.5 Pro: 1M
- DeepSeek V4 Pro and V4 Flash: 1M
A 1M context window means you can feed an entire codebase, a full legal document, or hours of conversation history into a single API call. This changes what's possible — and it's available at budget prices.
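To see which long-context models are also cheap, the pricing tables can be filtered in a few lines. This is a minimal sketch: the `MODELS` list is transcribed by hand from this report's tables, and `long_context_models` is an illustrative helper, not part of any provider SDK.

```python
# (model, input $/1M, output $/1M, context window in tokens), from the tables above.
MODELS = [
    ("Gemini 2.0 Flash Lite", 0.075, 0.30, 1_000_000),
    ("GPT-oss 20B", 0.08, 0.35, 128_000),
    ("Llama 3.1 8B", 0.10, 0.10, 128_000),
    ("Gemini 2.0 Flash", 0.10, 0.40, 1_000_000),
    ("Llama 4 Scout", 0.11, 0.34, 10_000_000),
    ("DeepSeek V4 Flash", 0.14, 0.28, 1_000_000),
    ("GPT-4o mini", 0.15, 0.60, 128_000),
    ("GPT-oss 120B", 0.15, 0.60, 128_000),
    ("Mistral Small 4", 0.15, 0.60, 128_000),
    ("Llama 4 Maverick", 0.20, 0.60, 10_000_000),
    ("GPT-5 mini", 0.25, 2.00, 272_000),
    ("DeepSeek V3", 0.27, 1.10, 128_000),
    ("DeepSeek V4 Pro", 0.44, 0.87, 1_000_000),
    ("Mistral Large 3", 0.50, 1.50, 128_000),
    ("Command R", 0.50, 1.50, 128_000),
    ("Llama 3.1 70B", 0.88, 0.88, 128_000),
    ("Kimi K2.6", 0.90, 3.75, 256_000),
    ("Claude Haiku 4.5", 1.00, 5.00, 200_000),
    ("Gemini 2.5 Pro", 1.25, 10.00, 1_000_000),
    ("GPT-5", 1.25, 10.00, 272_000),
    ("GPT-5.3 Codex", 1.75, 14.00, 400_000),
    ("Gemini 3.1 Pro", 2.00, 12.00, 1_000_000),
    ("Jamba 1.5 Large", 2.00, 8.00, 256_000),
    ("GPT-4o", 2.50, 10.00, 128_000),
    ("Command R+", 2.50, 10.00, 128_000),
    ("Claude Sonnet 4.6", 3.00, 15.00, 1_000_000),
    ("Claude Sonnet 4", 3.00, 15.00, 200_000),
    ("Grok 3 Mini", 3.00, 5.00, 128_000),
    ("Claude Opus 4.7", 5.00, 25.00, 1_000_000),
    ("GPT-5.5", 5.00, 30.00, 1_000_000),
    ("Claude 4 Opus", 15.00, 75.00, 200_000),
    ("Grok 3", 30.00, 150.00, 128_000),
    ("GPT-5.5 Pro", 30.00, 180.00, 1_000_000),
]

def long_context_models(min_context: int = 1_000_000):
    """Return models with at least `min_context` tokens of context, cheapest input first."""
    hits = [m for m in MODELS if m[3] >= min_context]
    return sorted(hits, key=lambda m: m[1])

for name, inp, out, ctx in long_context_models():
    print(f"{name}: ${inp}/M input, ${out}/M output, {ctx:,} tokens")
```

Passing a larger threshold, for example `long_context_models(10_000_000)`, narrows the result to the models with the very largest windows.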
3. The Premium Tier Is Shrinking
Only 5 models cost $5+/M input. And the quality gap between mid-tier and premium is narrowing. Claude Sonnet 4.6 ($3/$15) and Gemini 3.1 Pro ($2/$12) now handle most tasks that required Opus or GPT-5.5 a few months ago.
The exception: complex multi-step reasoning, code generation in large codebases, and high-stakes analysis still benefit from premium models. But for 80% of production workloads, mid-tier is enough.
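The mid-tier versus premium trade-off can be put in rough numbers. The sketch below assumes a hypothetical router that sends a fixed share of requests to Claude Sonnet 4.6 and the rest to Claude Opus 4.7, and an illustrative workload of 500 input and 500 output tokens per request. The 80/20 split simply reuses the estimate above; it is not a measured figure.

```python
def request_cost(input_price: float, output_price: float,
                 input_tokens: int = 500, output_tokens: int = 500) -> float:
    """USD cost of one request, given per-1M-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Prices from the tables above: (input $/1M, output $/1M).
SONNET_46 = (3.00, 15.00)
OPUS_47 = (5.00, 25.00)

def blended_cost(mid_share: float) -> float:
    """Average cost per request when `mid_share` of requests go to the mid-tier model."""
    mid = request_cost(*SONNET_46)
    premium = request_cost(*OPUS_47)
    return mid_share * mid + (1 - mid_share) * premium

all_premium = blended_cost(0.0)
routed = blended_cost(0.8)  # assumed 80/20 split between mid-tier and premium

print(f"All premium:  ${all_premium:.4f} per request")
print(f"80% mid-tier: ${routed:.4f} per request")
print(f"Savings:      {1 - routed / all_premium:.0%}")
```

Under these assumptions the blended setup comes out about a third cheaper than sending everything to the premium model. The real saving depends on how much of your traffic genuinely tolerates the mid-tier model.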
4. Open Source Is a Legitimate Option
Meta's Llama 4 models on Together.ai offer serious competition:
- Llama 4 Scout ($0.11/$0.34) — cheapest model with a 10M context window
- Llama 4 Maverick ($0.20/$0.60) — strong general-purpose model
- Llama 3.1 70B ($0.88/$0.88) — balanced price/quality with symmetric pricing
For cost-sensitive applications where you control the prompt engineering, open-source models via Together.ai are hard to beat.
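Symmetric pricing matters most when a workload is output-heavy. As a rough illustration, the sketch below computes the average price per 1M tokens for the three Llama models at different output shares. The helper `blended_price` and the chosen shares are assumptions made for this example.

```python
def blended_price(input_price: float, output_price: float, output_share: float) -> float:
    """Average $/1M tokens when `output_share` of all tokens are output tokens."""
    return input_price * (1 - output_share) + output_price * output_share

# (input $/1M, output $/1M) from the list above.
LLAMA_MODELS = {
    "Llama 4 Scout": (0.11, 0.34),
    "Llama 4 Maverick": (0.20, 0.60),
    "Llama 3.1 70B": (0.88, 0.88),
}

for share in (0.1, 0.5, 0.9):
    row = ", ".join(
        f"{name} ${blended_price(inp, out, share):.3f}"
        for name, (inp, out) in LLAMA_MODELS.items()
    )
    print(f"{share:.0%} output tokens: {row}")
```

The symmetric model's blended price stays flat as the output share grows, while the asymmetric models drift toward their output price.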
Best Deals by Use Case
| Use Case | Best Model | Why |
|---|---|---|
| Chatbot (high volume) | Gemini 2.0 Flash Lite | Cheapest at $0.075/M, handles most chat tasks |
| Chatbot (quality) | Claude Haiku 4.5 | $1/M with Anthropic's quality |
| Code Generation | Claude Sonnet 4.6 | Best code quality at $3/M, 1M context |
| Document Analysis | Gemini 2.5 Pro | 1M context window at $1.25/M |
| Classification | GPT-4o mini | $0.15/M, fast, reliable for structured output |
| RAG / Retrieval | DeepSeek V4 Flash | $0.14/M with 1M context for long retrieval |
| Content Writing | GPT-5 mini | $0.25/M input, strong writing at budget price |
| Complex Reasoning | Claude Opus 4.7 | Best reasoning quality, worth the $5/M premium |
| Agent / Multi-step | GPT-5 | $1.25/M, strong tool use, 272K context |
| Budget Everything | DeepSeek V4 Pro | $0.44/M with 1M context — best all-around budget pick |
Cost Comparison: What $100/Month Gets You
Here's how far $100 goes at different model tiers, assuming 1,000 tokens per request split 50/50 between input and output (500 tokens each) and a 30-day month:
| Tier | Model | Requests for $100 | Daily Average |
|---|---|---|---|
| Budget | Gemini 2.0 Flash Lite | ~533,000 | ~17,800/day |
| Budget | DeepSeek V4 Flash | ~476,000 | ~15,900/day |
| Budget | GPT-4o mini | ~267,000 | ~8,900/day |
| Mid | Claude Haiku 4.5 | ~33,300 | ~1,110/day |
| Mid | GPT-5 | ~17,800 | ~590/day |
| Mid | Claude Sonnet 4.6 | ~11,100 | ~370/day |
| Premium | Claude Opus 4.7 | ~6,700 | ~220/day |
| Premium | GPT-5.5 | ~5,700 | ~190/day |
The range is staggering: from roughly 17,800 requests/day to roughly 190 requests/day for the same $100 budget. Choosing the right model tier is the single biggest cost lever you have.
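A budget comparison of this kind can be reproduced with a short calculation. The following is a minimal sketch, assuming 1,000 tokens per request split evenly between input and output and a 30-day month. The function name `requests_for_budget` and the `PRICES` dictionary are illustrative, with prices copied from the tables in this report.

```python
def requests_for_budget(budget_usd: float, input_price: float, output_price: float,
                        tokens_per_request: int = 1_000, output_share: float = 0.5) -> float:
    """Number of requests a budget covers, given per-1M-token prices."""
    input_tokens = tokens_per_request * (1 - output_share)
    output_tokens = tokens_per_request * output_share
    cost_per_request = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return budget_usd / cost_per_request

# (input $/1M, output $/1M) from the pricing tables above.
PRICES = {
    "Gemini 2.0 Flash Lite": (0.075, 0.30),
    "DeepSeek V4 Flash": (0.14, 0.28),
    "GPT-4o mini": (0.15, 0.60),
    "Claude Haiku 4.5": (1.00, 5.00),
    "GPT-5": (1.25, 10.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Opus 4.7": (5.00, 25.00),
    "GPT-5.5": (5.00, 30.00),
}

DAYS_PER_MONTH = 30  # assumption used for the daily average

for name, (inp, out) in PRICES.items():
    total = requests_for_budget(100, inp, out)
    print(f"{name:<22} {total:>10,.0f} requests  {total / DAYS_PER_MONTH:>8,.0f}/day")
```

Adjusting `tokens_per_request` and `output_share` to match your own traffic matters more than the defaults here, since output tokens cost several times more than input tokens on most models.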
What to Watch in June 2026
- Google I/O aftermath — new Gemini models or pricing changes could shift the budget tier
- OpenAI's open-source push — GPT-oss models may see price cuts to compete with Llama 4
- Anthropic's response — Claude Haiku 4.5 pricing may drop to match budget competitors
- DeepSeek's next move — V4 Pro at $0.44/M is already aggressive; watch for V5 announcements
- xAI's Grok 3 pricing: $30/$150 is the second-highest price in this report, behind only GPT-5.5 Pro at $30/$180; cuts are likely
Update: See our June 2026 AI API Pricing Guide for the latest prices, deprecation alerts, and migration recommendations.
Methodology
All pricing data in this report comes from official provider pricing pages, verified as of May 29, 2026. We track 34 models across 10 providers: OpenAI, Anthropic, Google, DeepSeek, Mistral, Cohere, Meta (via Together.ai), Moonshot, xAI, and AI21.
Prices are per 1 million tokens unless otherwise noted. Context window sizes reflect the maximum supported by each model. Some providers offer batch pricing or committed-use discounts not reflected here.