AI API Pricing June 2026: Complete Guide to All 34 Models
34 models. 10 providers. Prices from $0.075/M to $180/M. Two models retiring June 15. Here's everything you need to know about AI API pricing right now.
June 2026 marks a turning point in AI API pricing. Budget models now start under $0.10 per million tokens. Two legacy Anthropic models are being retired mid-month. And the gap between mid-tier and premium quality has narrowed to the point where most teams can save 60-80% by choosing wisely.
This guide covers every AI API model's current pricing, what's changing in June, the models you should migrate away from, and where the best deals are right now.
⚠️ Deprecation Alert: 2 Models Retiring June 15, 2026
Claude 4 Opus ($15/$75 per 1M tokens) and Claude Sonnet 4 ($3/$15 per 1M tokens) are being retired on June 15, 2026. If you're using either model, migrate now:
- Claude 4 Opus → Claude Opus 4.7 or 4.8 ($5/$25) — 67% cheaper, better quality, 1M context
- Claude Sonnet 4 → Claude Sonnet 4.6 ($3/$15) — same price, better quality, 1M context
Complete Pricing: All 34 Models
Every major AI API model ranked by input price. All prices per 1 million tokens, verified May 29, 2026.
Budget Tier — Under $0.60/1M Input
These models handle most everyday tasks at rock-bottom prices. If you're building a chatbot, classifier, or content tool, start here.
| Model | Provider | Input / 1M | Output / 1M | Context |
|---|---|---|---|---|
| Gemini 2.0 Flash Lite | $0.075 | $0.30 | 1M | |
| GPT-oss 20B | OpenAI | $0.08 | $0.35 | 128K |
| Llama 3.1 8B | Meta (Together.ai) | $0.10 | $0.10 | 128K |
| Gemini 2.0 Flash | $0.10 | $0.40 | 1M | |
| Llama 4 Scout | Meta (Together.ai) | $0.11 | $0.34 | 10M |
| DeepSeek V4 Flash | DeepSeek | $0.14 | $0.28 | 1M |
| GPT-4o mini | OpenAI | $0.15 | $0.60 | 128K |
| GPT-oss 120B | OpenAI | $0.15 | $0.60 | 128K |
| Mistral Small 4 | Mistral | $0.15 | $0.60 | 128K |
| Llama 4 Maverick | Meta (Together.ai) | $0.20 | $0.60 | 10M |
| GPT-5 mini | OpenAI | $0.25 | $2.00 | 272K |
| DeepSeek V3 | DeepSeek | $0.27 | $1.10 | 128K |
| DeepSeek V4 Pro | DeepSeek | $0.44 | $0.87 | 1M |
| Mistral Large 3 | Mistral | $0.50 | $1.50 | 128K |
| Command R | Cohere | $0.50 | $1.50 | 128K |
| Grok Build 0.1 | xAI | $0.30 | $0.50 | 256K |
Mid Tier — $0.50–$3.00/1M Input
The sweet spot for production workloads. Strong reasoning at reasonable prices.
| Model | Provider | Input / 1M | Output / 1M | Context |
|---|---|---|---|---|
| Llama 3.1 70B | Meta (Together.ai) | $0.88 | $0.88 | 128K |
| Kimi K2.6 | Moonshot | $0.95 | $4.00 | 256K |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | 200K |
| Gemini 2.5 Pro | $1.25 | $10.00 | 1M | |
| GPT-5 | OpenAI | $1.25 | $10.00 | 272K |
| GPT-5.3 Codex | OpenAI | $1.75 | $14.00 | 400K |
| Gemini 3.1 Pro | $2.00 | $12.00 | 1M | |
| Jamba 1.5 Large | AI21 | $2.00 | $8.00 | 256K |
| GPT-4o | OpenAI | $2.50 | $10.00 | 128K |
| Command R+ | Cohere | $2.50 | $10.00 | 128K |
| Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | 1M |
| Claude Sonnet 4 ⚠️ | Anthropic | $3.00 | $15.00 | 200K |
| Grok 4.3 | xAI | $1.25 | $2.50 | 1M |
Premium Tier — $5.00+/1M Input
For complex reasoning, code generation, and high-stakes tasks where quality is non-negotiable.
| Model | Provider | Input / 1M | Output / 1M | Context |
|---|---|---|---|---|
| Claude Opus 4.8 | Anthropic | $5.00 | $25.00 | 1M |
| Claude Opus 4.7 | Anthropic | $5.00 | $25.00 | 1M |
| GPT-5.5 | OpenAI | $5.00 | $30.00 | 1M |
| Claude 4 Opus ⚠️ | Anthropic | $15.00 | $75.00 | 200K |
| GPT-5.5 Pro | OpenAI | $30.00 | $180.00 | 1M |
What Changed Since May 2026
1. Anthropic's Deprecation Wave
The biggest change in June: two Anthropic models are being retired on June 15. Claude 4 Opus and Claude Sonnet 4 are being sunset in favor of the newer Opus 4.7/4.8 and Sonnet 4.6. The replacements are the same price or cheaper, with larger context windows. There's no reason to stay on the deprecated models.
2. Budget Floor Keeps Dropping
Gemini 2.0 Flash Lite at $0.075/M remains the cheapest production-ready AI API. That's 7.5 cents per million input tokens — you'd need to process 133,000 tokens to spend a single penny. A year ago, the cheapest comparable model was $0.15/M.
3. Context Windows Expanded
Seven models now support 1M+ token context windows:
- Llama 4 Scout: 10M tokens — largest available
- Gemini 2.0 Flash/Flash Lite: 1M — at budget prices
- Gemini 2.5 Pro, Gemini 3.1 Pro: 1M
- Claude Opus 4.7/4.8, Sonnet 4.6: 1M
- DeepSeek V4 Pro/Flash: 1M
4. Open Source Is Production-Ready
Meta's Llama 4 models via Together.ai are now legitimate production options. Llama 4 Scout at $0.11/M with a 10M context window is unmatched for long-document tasks. Llama 4 Maverick at $0.20/M handles general-purpose workloads well.
Best Deals by Use Case
| Use Case | Best Model | Why |
|---|---|---|
| High-volume chatbot | Gemini 2.0 Flash Lite | Cheapest at $0.075/M, handles most chat tasks |
| Quality chatbot | Claude Haiku 4.5 | $1/M with Anthropic's quality |
| Code generation | Claude Sonnet 4.6 | Best code quality at $3/M, 1M context |
| Document analysis | Gemini 2.5 Pro | 1M context at $1.25/M |
| Classification / extraction | GPT-4o mini | $0.15/M, fast, reliable structured output |
| RAG / retrieval | DeepSeek V4 Flash | $0.14/M with 1M context |
| Content writing | GPT-5 mini | $0.25/M, strong writing at budget price |
| Complex reasoning | Claude Opus 4.7 | Best reasoning quality at $5/M |
| Agent / multi-step | GPT-5 | $1.25/M, strong tool use, 272K context |
| Long documents (10M+) | Llama 4 Scout | Only model with 10M context at $0.11/M |
| Budget all-around | DeepSeek V4 Pro | $0.44/M with 1M context — best value pick |
What $100/Month Gets You in June 2026
Assuming 1,000 tokens per request with a 50/50 input/output split:
| Tier | Model | Requests for $100 | Daily Average |
|---|---|---|---|
| Budget | Gemini 2.0 Flash Lite | ~571,000 | ~19,000/day |
| Budget | DeepSeek V4 Flash | ~476,000 | ~15,900/day |
| Budget | Llama 4 Scout | ~434,000 | ~14,500/day |
| Mid | Claude Haiku 4.5 | ~62,500 | ~2,100/day |
| Mid | GPT-5 | ~30,800 | ~1,030/day |
| Mid | Claude Sonnet 4.6 | ~22,200 | ~740/day |
| Premium | Claude Opus 4.7 | ~8,000 | ~267/day |
| Premium | GPT-5.5 | ~7,700 | ~257/day |
The range: 19,000 requests/day to 257 requests/day for the same $100 budget. Model selection is the single biggest cost lever you have.
Migration Guide: Models Retiring June 15
Migrating from Claude 4 Opus
Current: Claude 4 Opus — $15/$75 per 1M tokens, 200K context
Replace with:
- Claude Opus 4.7 ($5/$25, 1M context) — 67% cheaper, better quality
- Claude Opus 4.8 ($5/$25, 1M context) — latest version, best reasoning
- Claude Sonnet 4.6 ($3/$15, 1M context) — 80% cheaper, sufficient for most tasks
Migrating from Claude Sonnet 4
Current: Claude Sonnet 4 — $3/$15 per 1M tokens, 200K context
Replace with:
- Claude Sonnet 4.6 ($3/$15, 1M context) — same price, better quality, 5x context
Provider Comparison at a Glance
| Provider | Models | Cheapest | Most Expensive | Best For |
|---|---|---|---|---|
| OpenAI | 9 | $0.08/M | $180/M | Widest range, agents |
| Anthropic | 6 | $1.00/M | $75/M | Code, reasoning |
| 4 | $0.075/M | $12/M | Budget, long context | |
| DeepSeek | 3 | $0.14/M | $1.10/M | Budget all-around |
| Meta (Together.ai) | 4 | $0.10/M | $0.88/M | Open source, 10M context |
| Mistral | 2 | $0.15/M | $1.50/M | European compliance |
| Cohere | 2 | $0.50/M | $10/M | RAG, enterprise search |
| Moonshot | 1 | $0.95/M | $4.00/M | Long context (256K) |
| xAI | 2 | $3.00/M | $150/M | Real-time data |
| AI21 | 1 | $2.00/M | $8/M | Long context (256K) |
What to Watch in July 2026
- Anthropic post-deprecation pricing — will Claude Opus 4.8 pricing drop after the old Opus is retired?
- OpenAI's open-source response — GPT-oss models may see cuts to compete with Llama 4
- Google I/O 2026 impact — new Gemini models or pricing could reshape the budget tier
- DeepSeek V5 — V4 Pro at $0.44/M is aggressive; V5 could go lower
- xAI pricing rebrand — Grok 4.3 at $1.25/$2.50 is now competitive; Grok Build 0.1 at $0.30/$0.50 joins the budget tier
Methodology
All pricing data comes from official provider pricing pages, verified as of May 29, 2026. We track 34 models across 10 providers: OpenAI, Anthropic, Google, DeepSeek, Mistral, Cohere, Meta (via Together.ai), Moonshot, xAI, and AI21.
Prices are per 1 million tokens unless otherwise noted. Context window sizes reflect the maximum supported. Some providers offer batch pricing or committed-use discounts not reflected here.
Calculate your exact costs
Use our free tools to see what these prices mean for your workload. No signup required.
Open Cost Calculator →Related Tools
- AI API Cost Calculator — estimate costs for any model
- Cost Explorer — see all 34 models ranked by cost
- Model Compare — side-by-side model comparison
- Pricing Index — complete sortable pricing database
- Cheapest AI API Finder — find the lowest-cost option
- Cheapest AI API June 2026 — all 34 models ranked by cost
- State of LLM Pricing Report — interactive June report