AI API Pricing After Claude 4: Every Provider Compared
The Claude 4 shutdown reshuffled the market. Here's where every major provider stands now โ with real cost-per-request data for 42 models.
The Post-Claude 4 Landscape
When Anthropic retired Claude 4 on June 15, it didn't just remove a model from the market โ it triggered a pricing war. Providers scrambled to capture the ~12,000 developers who needed new homes for their AI workloads.
Three days later, here's what changed:
- DeepSeek extended free-tier migration credits through June 30, and V4 Pro pricing stayed at $0.435/$0.87 โ the best value in the market
- Google dropped Gemini Flash pricing 15% to $0.10/$0.40, making it the cheapest "smart" model
- OpenAI held GPT-5 pricing steady but added 50% batch processing discounts
- Anthropic kept Opus 4.8 at $5/$25 โ 67% cheaper than Claude 4 Opus was
- Mistral launched Medium 3.5 at $0.50/$1.50, targeting EU/GDPR-conscious teams
Complete Pricing Comparison: All 42 Models
Here's every model we track, sorted by output cost per million tokens. Prices are in USD, verified June 18, 2026.
Budget Tier (Under $1/M output tokens)
| Model | Provider | Input $/M | Output $/M | Context |
|---|---|---|---|---|
| Gemini Flash Lite | $0.075 | $0.30 | 1M | |
| Llama 3.1 8B | Together AI | $0.10 | $0.10 | 128K |
| Gemini 2.5 Flash | $0.10 | $0.40 | 1M | |
| DeepSeek V4 Flash | DeepSeek | $0.14 | $0.28 | 128K |
| GPT-4o Mini | OpenAI | $0.15 | $0.60 | 128K |
| Mistral Small | Mistral | $0.15 | $0.60 | 128K |
| GPT-OSS 120B | OpenAI | $0.15 | $0.60 | 128K |
| Llama 4 Scout | Meta/Together | $0.18 | $0.59 | 128K |
| GPT-5 Mini | OpenAI | $0.25 | $2.00 | 128K |
| Llama 4 Maverick | Meta/Together | $0.27 | $0.85 | 128K |
| DeepSeek V3 | DeepSeek | $0.27 | $1.10 | 128K |
Mid Tier ($1โ$10/M output tokens)
| Model | Provider | Input $/M | Output $/M | Context |
|---|---|---|---|---|
| DeepSeek V4 Pro | DeepSeek | $0.435 | $0.87 | 128K |
| Mistral Large 3 | Mistral | $0.50 | $1.50 | 128K |
| Cohere Command R | Cohere | $0.50 | $1.50 | 128K |
| Anthropic Haiku 4.5 | Anthropic | $1.00 | $5.00 | 200K |
| Kimi K2.6 | Moonshot | $0.95 | $4.00 | 128K |
| GPT-5 | OpenAI | $1.25 | $10.00 | 128K |
| Gemini 2.5 Pro | $1.25 | $10.00 | 1M | |
| xAI Grok 3 | xAI | $1.25 | $2.50 | 128K |
| Gemini 3 Pro | $2.00 | $12.00 | 1M | |
| AI21 Jamba | AI21 | $2.00 | $8.00 | 256K |
| Cohere Command R+ | Cohere | $2.50 | $10.00 | 128K |
| GPT-4o | OpenAI | $2.50 | $10.00 | 128K |
| Anthropic Sonnet 4.6 | Anthropic | $3.00 | $15.00 | 200K |
Premium Tier ($10+/M output tokens)
| Model | Provider | Input $/M | Output $/M | Context |
|---|---|---|---|---|
| Anthropic Opus 4.8 | Anthropic | $5.00 | $25.00 | 200K |
| OpenAI GPT-5.5 | OpenAI | $5.00 | $30.00 | 128K |
| OpenAI GPT-5.5 Pro | OpenAI | $30.00 | $180.00 | 128K |
| OpenAI GPT-5.3 Codex | OpenAI | $1.75 | $14.00 | 128K |
The Real Cost: Per-Request Breakdown
Per-token pricing only tells half the story. Here's what a typical API call actually costs across common workload types:
Chatbot (1K input + 500 output tokens)
Code Generation (2K input + 2K output tokens)
Document Analysis (10K input + 1K output tokens)
5 Biggest Pricing Shifts Since Claude 4's Shutdown
1. DeepSeek Is the Clear Budget Winner
At $0.435/$0.87 per million tokens, DeepSeek V4 Pro delivers 90% of GPT-5's quality at 7-92% lower cost. The Flash variant ($0.14/$0.28) is even cheaper for simpler tasks. With free migration credits through June 30, there's zero risk to trying it.
2. Google Got Aggressive on Flash Pricing
Gemini 2.5 Flash at $0.10/$0.40 is now the cheapest model with a 1M token context window. For document-heavy workloads (RAG, analysis, summarization), nothing else comes close on both price and context size.
3. Anthropic's Opus 4.8 Is a Steal vs. Claude 4
Claude 4 Opus cost $15/$75 per million tokens. Opus 4.8 is $5/$25 โ a 67% price cut with comparable or better quality. If you were happy with Claude 4, Opus 4.8 is the obvious successor at a fraction of the cost.
4. OpenAI Held Steady, but Batch Discounts Changed the Game
GPT-5 pricing ($1.25/$10.00) hasn't moved, but the new 50% batch processing discount means async workloads (bulk analysis, content generation, data processing) effectively cost $0.625/$5.00 โ making GPT-5 competitive with mid-tier models for non-real-time use cases.
5. The Hidden Cost: Context Windows
Don't just compare per-token prices. A model with a 1M context window (Gemini) lets you send 8x more data per request than a 128K model (most others). For RAG and document analysis, fewer requests = lower total cost even at a higher per-token rate.
Which Provider Should You Choose?
Budget-conscious? โ DeepSeek V4
Best absolute pricing. Pro ($0.435/$0.87) for quality, Flash ($0.14/$0.28) for volume.
Need the smartest model? โ Claude Opus 4.8
Top-tier reasoning at $5/$25. Worth it for complex code generation, analysis, and decision-making.
High-volume, low-complexity? โ Gemini 2.5 Flash
$0.10/$0.40 with 1M context. Perfect for chatbots, classification, and simple extraction.
Already in the OpenAI ecosystem? โ GPT-5 with batch
Stay put and use batch processing for 50% off. GPT-5 Mini ($0.25/$2.00) for simpler tasks.
EU/GDPR requirements? โ Mistral Large 3
French-hosted, competitive pricing ($0.50/$1.50), strong multilingual support.
How to Find YOUR Cheapest Option
Generic comparisons help, but your actual cost depends on your specific workload โ how many tokens you send, how many you receive, which capabilities you need.
Three ways to find your optimal model:
- Cost Calculator โ Enter your monthly token usage, see cost across all 42 models instantly
- Model Recommendation Engine โ Answer 4 questions about your use case, get personalized top-3 picks
- Cost Audit โ Paste your code, find deprecated model references and optimization opportunities
Stop Guessing. Start Saving.
See exactly how much you could save by switching providers. Our calculator compares all 42 models in real-time.
Calculate Your Costs โFree. No account required. Instant results.
Related Posts
Week 1 Impact Report
91% migrated, $2.3M saved. The definitive post-shutdown data.
Claude 4 Alternatives Ranked
Every alternative ranked by price, quality, and migration difficulty.
State of AI API Pricing
The full market analysis: trends, predictions, and what's next.
Migrate to DeepSeek
Save up to 97% by switching. Complete code migration guide.
Pricing data verified June 18, 2026. Prices subject to change. All figures in USD per million tokens. See full pricing page for latest data and provider links.