AI API Pricing July 2026: Complete Guide to All 32 Models
32 models. 10 providers. Prices from $0.075/M to $180/M. Two legacy models gone. Here's the post-deprecation pricing landscape.
July 2026 is the first month without Claude 4 Opus and Claude Sonnet 4. Anthropic retired both on June 15, leaving the market with 32 models across 10 providers. The budget tier continues to compress — Gemini Flash Lite still leads at $0.075/M — while the mid-tier has never been more competitive.
This guide covers every AI API model's current pricing, the post-deprecation landscape, mid-year pricing trends, and where the best deals are right now.
📋 Post-Deprecation: What Changed on June 15
Claude 4 Opus ($15/$75 per 1M tokens, 200K context) and Claude Sonnet 4 ($3/$15 per 1M tokens, 200K context) were retired on June 15, 2026. The official replacements:
- Claude 4 Opus → Claude Opus 4.7 or 4.8 ($5/$25) — 67% cheaper, 5x context
- Claude Sonnet 4 → Claude Sonnet 4.6 ($3/$15) — same price, 5x context
Both replacements are strictly better: same or lower price, better quality, 1M context windows. If you haven't migrated yet, do it now.
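The "67% cheaper, 5x context" figures follow directly from the list prices above. Here is a minimal sketch of that arithmetic; the `monthly_cost` helper and the workload size (50M input and 10M output tokens per month) are illustrative assumptions, not part of any provider SDK or real usage data.

```python
def monthly_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost in dollars, given token counts and per-1M-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 50M input and 10M output tokens per month.
old = monthly_cost(50_000_000, 10_000_000, 15.00, 75.00)  # Claude 4 Opus
new = monthly_cost(50_000_000, 10_000_000, 5.00, 25.00)   # Claude Opus 4.7 / 4.8

savings_pct = (old - new) / old * 100       # relative price reduction
context_ratio = 1_000_000 / 200_000         # 1M window vs. 200K window

print(f"old=${old:,.2f} new=${new:,.2f} "
      f"savings={savings_pct:.0f}% context={context_ratio:.0f}x")
```

Because input and output prices both fell by the same factor, the percentage saving is the same regardless of the input/output mix of the workload.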
Complete Pricing: All 32 Models
Every major AI API model ranked by input price. All prices per 1 million tokens.
Budget Tier — Under $0.60/1M Input
Rock-bottom prices for everyday tasks. Chatbots, classifiers, content tools — start here.
| Model | Provider | Input / 1M | Output / 1M | Context |
|---|---|---|---|---|
| Gemini 2.0 Flash Lite | Google | $0.075 | $0.30 | 1M |
| GPT-oss 20B | OpenAI | $0.08 | $0.35 | 128K |
| Llama 3.1 8B | Meta (Together.ai) | $0.10 | $0.10 | 128K |
| Gemini 2.0 Flash | Google | $0.10 | $0.40 | 1M |
| Llama 4 Scout | Meta (Together.ai) | $0.11 | $0.34 | 10M |
| DeepSeek V4 Flash | DeepSeek | $0.14 | $0.28 | 1M |
| GPT-4o mini | OpenAI | $0.15 | $0.60 | 128K |
| GPT-oss 120B | OpenAI | $0.15 | $0.60 | 128K |
| Mistral Small 4 | Mistral | $0.15 | $0.60 | 128K |
| Llama 4 Maverick | Meta (Together.ai) | $0.20 | $0.60 | 1M |
| GPT-5 mini | OpenAI | $0.25 | $2.00 | 272K |
| DeepSeek V3 | DeepSeek | $0.27 | $1.10 | 128K |
| Grok Build 0.1 | xAI | $0.30 | $0.50 | 256K |
| DeepSeek V4 Pro | DeepSeek | $0.44 | $0.87 | 1M |
| Mistral Large 3 | Mistral | $0.50 | $1.50 | 128K |
| Command R | Cohere | $0.50 | $1.50 | 128K |
Mid Tier — $0.50–$3.00/1M Input
The sweet spot for production workloads. Strong reasoning at reasonable prices.
| Model | Provider | Input / 1M | Output / 1M | Context |
|---|---|---|---|---|
| Llama 3.1 70B | Meta (Together.ai) | $0.88 | $0.88 | 128K |
| Kimi K2.6 | Moonshot | $0.90 | $3.75 | 256K |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | 200K |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | 1M |
| GPT-5 | OpenAI | $1.25 | $10.00 | 272K |
| Grok 4.3 | xAI | $1.25 | $2.50 | 1M |
| GPT-5.3 Codex | OpenAI | $1.75 | $14.00 | 400K |
| Gemini 3.1 Pro | Google | $2.00 | $12.00 | 1M |
| Jamba 1.5 Large | AI21 | $2.00 | $8.00 | 256K |
| GPT-4o | OpenAI | $2.50 | $10.00 | 128K |
| Command R+ | Cohere | $2.50 | $10.00 | 128K |
| Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | 1M |
Premium Tier — $5.00+/1M Input
For complex reasoning, code generation, and high-stakes tasks where quality is non-negotiable.
| Model | Provider | Input / 1M | Output / 1M | Context |
|---|---|---|---|---|
| Claude Opus 4.8 | Anthropic | $5.00 | $25.00 | 1M |
| Claude Opus 4.7 | Anthropic | $5.00 | $25.00 | 1M |
| GPT-5.5 | OpenAI | $5.00 | $30.00 | 1M |
| GPT-5.5 Pro | OpenAI | $30.00 | $180.00 | 1M |
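Price alone rarely decides a model choice; context window is usually the first filter. The sketch below shows one way to encode a few rows from the tables above and pick the cheapest models that meet a context requirement. The `Model` class and `cheapest_with_context` function are hypothetical names for illustration, and the list holds only a sample of the 32 models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    provider: str
    input_price: float   # dollars per 1M input tokens
    output_price: float  # dollars per 1M output tokens
    context: int         # maximum context window, in tokens


# A handful of rows copied from the tables above (not the full list of 32).
MODELS = [
    Model("Gemini 2.0 Flash Lite", "Google", 0.075, 0.30, 1_000_000),
    Model("Llama 4 Scout", "Meta (Together.ai)", 0.11, 0.34, 10_000_000),
    Model("DeepSeek V4 Flash", "DeepSeek", 0.14, 0.28, 1_000_000),
    Model("GPT-4o mini", "OpenAI", 0.15, 0.60, 128_000),
    Model("Claude Haiku 4.5", "Anthropic", 1.00, 5.00, 200_000),
    Model("GPT-5", "OpenAI", 1.25, 10.00, 272_000),
    Model("Claude Sonnet 4.6", "Anthropic", 3.00, 15.00, 1_000_000),
    Model("Claude Opus 4.8", "Anthropic", 5.00, 25.00, 1_000_000),
]


def cheapest_with_context(models, min_context):
    """Models whose context window is at least min_context, cheapest input first."""
    eligible = [m for m in models if m.context >= min_context]
    return sorted(eligible, key=lambda m: m.input_price)


# Three cheapest sampled models that accept at least 1M tokens of context.
top_three = cheapest_with_context(MODELS, 1_000_000)[:3]
for m in top_three:
    print(f"{m.name}: ${m.input_price}/M input, ${m.output_price}/M output")
```

Sorting on input price mirrors how the tables are ranked; for output-heavy workloads, sorting on a blended price would likely be a better key.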
Mid-Year Pricing Trends
1. The Post-Deprecation Market
With Claude 4 Opus gone, the most expensive remaining model is GPT-5.5 Pro at $30/$180 per 1M tokens. But nobody should be paying $15/M for a 200K-context model anymore — Claude Opus 4.8 at $5/M with 1M context is strictly superior to the old Opus in every dimension.
2. Budget Tier Compression
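The gap between the cheapest and most capable budget models continues to shrink. Gemini 2.0 Flash Lite ($0.075/M) and Llama 3.1 8B ($0.10/M) are now priced within 2x of each other, although only Flash Lite pairs that price with a 1M context window; Llama 3.1 8B tops out at 128K. A year ago, $0.15/M was the floor. At $0.075/M, one cent now buys roughly 133,000 input tokens, twice what it bought at the old floor.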
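The tokens-per-penny figure is a direct conversion from the per-million price. A small sketch of that conversion, with `tokens_per_cent` as an illustrative helper name:

```python
def tokens_per_cent(price_per_million):
    """Tokens that one US cent buys at a given dollar price per 1M tokens."""
    # $0.01 buys (0.01 / price) million tokens, i.e. 10,000 / price tokens.
    return 10_000 / price_per_million


flash_lite = tokens_per_cent(0.075)  # Gemini 2.0 Flash Lite input price
old_floor = tokens_per_cent(0.15)    # the floor a year ago, per the text

print(f"{flash_lite:,.0f} tokens per cent now, {old_floor:,.0f} at the old floor")
```

At $0.075/M this comes to about 133,333 input tokens per cent, against about 66,667 at $0.15/M.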
3. The Mid-Tier Sweet Spot
Models in the $1-3/M range now dominate production use cases. Claude Haiku 4.5 ($1/M) and GPT-5 ($1.25/M) deliver near-premium quality at 4-5x lower input cost than the $5/M premium models, and 24-30x lower than GPT-5.5 Pro. For most teams, there's no reason to go premium unless you need the absolute best reasoning.
4. Open Source Catches Up
Meta's Llama 4 Scout ($0.11/M, 10M context) is the only model offering 10M token context at any price. Llama 4 Maverick ($0.20/M) handles general workloads well. Together.ai's managed hosting makes these zero-ops options for teams that want open-source economics without infrastructure burden.
Best Deals by Use Case
| Use Case | Best Model | Why |
|---|---|---|
| High-volume chatbot | Gemini 2.0 Flash Lite | Cheapest at $0.075/M, handles most chat tasks |
| Quality chatbot | Claude Haiku 4.5 | $1/M with Anthropic's quality |
| Code generation | Claude Sonnet 4.6 | Best code quality at $3/M, 1M context |
| Document analysis | Gemini 2.5 Pro | 1M context at $1.25/M |
| Classification / extraction | GPT-4o mini | $0.15/M, fast, reliable structured output |
| RAG / retrieval | DeepSeek V4 Flash | $0.14/M with 1M context |
| Content writing | GPT-5 mini | $0.25/M, strong writing at budget price |
| Complex reasoning | Claude Opus 4.7 | Best reasoning quality at $5/M |
| Agent / multi-step | GPT-5 | $1.25/M, strong tool use, 272K context |
| Long documents (up to 10M tokens) | Llama 4 Scout | Only model with 10M context at $0.11/M |
| Budget all-around | DeepSeek V4 Pro | $0.44/M with 1M context — best value pick |
What $100/Month Gets You in July 2026
Assuming 1,000 tokens per request with a 50/50 input/output split:
| Tier | Model | Requests for $100 | Daily Average |
|---|---|---|---|
| Budget | Gemini 2.0 Flash Lite | ~533,000 | ~17,800/day |
| Budget | DeepSeek V4 Flash | ~476,000 | ~15,900/day |
| Budget | Llama 4 Scout | ~444,000 | ~14,800/day |
| Mid | Claude Haiku 4.5 | ~33,300 | ~1,100/day |
| Mid | GPT-5 | ~17,800 | ~590/day |
| Mid | Claude Sonnet 4.6 | ~11,100 | ~370/day |
| Premium | Claude Opus 4.7 | ~6,700 | ~220/day |
| Premium | GPT-5.5 | ~5,700 | ~190/day |
The range: roughly 17,800 requests/day down to 190 requests/day for the same $100 budget, with daily averages based on a 30-day month. Model selection is the single biggest cost lever you have.
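The request counts come from a simple blended-cost calculation under the stated assumption of 1,000 tokens per request, split 50/50 between input and output. The following is a sketch of that calculation rather than the exact method behind the table; `requests_per_budget` and its parameter names are hypothetical, and the 30-day month is an assumption.

```python
def requests_per_budget(budget, input_price, output_price,
                        tokens_per_request=1_000, input_share=0.5):
    """Number of requests a dollar budget covers, given per-1M-token prices."""
    input_tokens = tokens_per_request * input_share
    output_tokens = tokens_per_request * (1 - input_share)
    cost_per_request = (input_tokens * input_price
                        + output_tokens * output_price) / 1_000_000
    return budget / cost_per_request


# DeepSeek V4 Flash: $0.14 input / $0.28 output per 1M tokens.
total = requests_per_budget(100, 0.14, 0.28)
per_day = total / 30  # assumes a 30-day month

print(f"{total:,.0f} requests, about {per_day:,.0f} per day")
```

For DeepSeek V4 Flash this gives about 476,190 requests, or roughly 15,900 per day. Raising `input_share` toward 1.0 models retrieval-heavy workloads, where most tokens are billed at the cheaper input rate.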
Provider Comparison at a Glance
| Provider | Models | Cheapest (Input) | Most Expensive (Output) | Best For |
|---|---|---|---|---|
| OpenAI | 9 | $0.08/M | $180/M | Widest range, agents |
| Anthropic | 4 | $1.00/M | $25/M | Code, reasoning |
| Google | 4 | $0.075/M | $12/M | Budget, long context |
| DeepSeek | 3 | $0.14/M | $1.10/M | Budget all-around |
| Meta (Together.ai) | 4 | $0.10/M | $0.88/M | Open source, 10M context |
| Mistral | 2 | $0.15/M | $1.50/M | European compliance |
| Cohere | 2 | $0.50/M | $10/M | RAG, enterprise search |
| Moonshot | 1 | $0.90/M | $3.75/M | Long context (256K) |
| xAI | 2 | $0.30/M | $2.50/M | Real-time data |
| AI21 | 1 | $2.00/M | $8/M | Long context (256K) |
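The "Cheapest" column appears to be each provider's lowest input price and the "Most Expensive" column its highest output price (DeepSeek's $0.14/M and $1.10/M match that reading). Under that assumption, a per-provider summary can be derived from the model rows as sketched below; `summarize` is an illustrative name and the rows are only a subset of the full tables.

```python
from collections import defaultdict

# (model, provider, input $/1M, output $/1M); a subset of the rows above.
ROWS = [
    ("GPT-oss 20B", "OpenAI", 0.08, 0.35),
    ("GPT-5.5 Pro", "OpenAI", 30.00, 180.00),
    ("Claude Haiku 4.5", "Anthropic", 1.00, 5.00),
    ("Claude Opus 4.8", "Anthropic", 5.00, 25.00),
    ("DeepSeek V4 Flash", "DeepSeek", 0.14, 0.28),
    ("DeepSeek V3", "DeepSeek", 0.27, 1.10),
    ("DeepSeek V4 Pro", "DeepSeek", 0.44, 0.87),
]


def summarize(rows):
    """Group rows by provider; report model count, lowest input, highest output."""
    by_provider = defaultdict(list)
    for _name, provider, input_price, output_price in rows:
        by_provider[provider].append((input_price, output_price))
    return {
        provider: {
            "models": len(prices),
            "cheapest_input": min(p[0] for p in prices),
            "priciest_output": max(p[1] for p in prices),
        }
        for provider, prices in by_provider.items()
    }


summary = summarize(ROWS)
for provider, stats in summary.items():
    print(provider, stats)
```

Note that the cheapest input and the priciest output can come from different models, as with DeepSeek, where V4 Flash sets the floor and V3 sets the ceiling.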
What to Watch in August 2026
- Anthropic post-deprecation pricing — will Opus 4.8 pricing drop now that the old Opus is gone and competition intensifies?
- DeepSeek V5 — V4 Pro at $0.44/M is aggressive; V5 could disrupt the budget tier further
- xAI pricing rebrand — Grok 4.3 at $1.25/$2.50 is now competitive; Grok Build 0.1 at $0.30/$0.50 joins the budget tier
- OpenAI's open-source response — GPT-oss models may see deeper cuts to compete with Llama 4
- New model launches — Q3 typically sees major releases from OpenAI and Google
Methodology
All pricing data comes from official provider pricing pages, verified as of May 29, 2026. We track 32 active models across 10 providers: OpenAI, Anthropic, Google, DeepSeek, Mistral, Cohere, Meta (via Together.ai), Moonshot, xAI, and AI21.
Prices are per 1 million tokens unless otherwise noted. Context window sizes reflect the maximum supported. Some providers offer batch pricing or committed-use discounts not reflected here.