This week's pricing landscape across 49 models from 10 providers. Find the best deals, track price changes, and optimize your AI spend.
The gap between the cheapest and most expensive models has never been wider. GPT-5.5 Pro costs 600× as much per output token as Gemini 2.0 Flash Lite ($180 vs. $0.30 per 1M). Yet for many tasks, such as chatbots, summarization, and code completion, budget models reach roughly 90% of premium-model performance. Most developers are overpaying by 60-90%.
| Model | Provider | Tier | Input / 1M | Output / 1M | Context |
|---|---|---|---|---|---|
| Gemini 2.0 Flash Lite | Google | Budget | $0.075 | $0.30 | 1M |
| GPT-oss 20B | OpenAI | Budget | $0.08 | $0.35 | 128K |
| Mistral Small 4 | Mistral | Budget | $0.10 | $0.30 | 128K |
| Llama 3.1 8B | Meta | Budget | $0.10 | $0.10 | 128K |
| Gemini 2.5 Flash-Lite | Google | Budget | $0.10 | $0.40 | 1M |
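
Per-1M prices are easier to compare once they are turned into a cost per request. The sketch below does that for the five budget models in the table above. The request size (2,000 input tokens, 500 output tokens) and the helper name `request_cost_usd` are illustrative assumptions, not figures from this report.

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_per_1m: float, output_per_1m: float) -> float:
    """Cost of one request in USD, given prices quoted per 1M tokens."""
    return (input_tokens * input_per_1m + output_tokens * output_per_1m) / 1_000_000


# (input price, output price) in USD per 1M tokens, copied from the table above.
PRICES = {
    "Gemini 2.0 Flash Lite": (0.075, 0.30),
    "GPT-oss 20B": (0.08, 0.35),
    "Mistral Small 4": (0.10, 0.30),
    "Llama 3.1 8B": (0.10, 0.10),
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
}

# Hypothetical request shape: 2,000 tokens in, 500 tokens out.
for model, (inp, out) in PRICES.items():
    cost = request_cost_usd(2_000, 500, inp, out)
    print(f"{model}: ${cost:.6f} per request")
```

Because output tokens usually cost several times more than input tokens, output-heavy workloads shift the ranking toward models with low output prices, such as Llama 3.1 8B here.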
Switch from expensive models to budget alternatives and save up to 99.8% on output tokens.
| Switch From | Switch To | Output Savings |
|---|---|---|
| GPT-5.5 Pro ($180/1M output) | DeepSeek V4 Flash ($0.28/1M output) | Save 99.8% |
| GPT-5.5 ($30/1M output) | DeepSeek V4 Flash ($0.28/1M output) | Save 99.1% |
| Claude Opus 4.8 ($25/1M output) | Gemini 2.5 Flash-Lite ($0.40/1M output) | Save 98.4% |
| Claude Sonnet 5 ($15/1M output) | Mistral Large 3 ($1.50/1M output) | Save 90% |
| GPT-5 ($10/1M output) | DeepSeek V4 Pro ($0.87/1M output) | Save 91.3% |

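
The savings figures above follow from a single formula: `1 - to_price / from_price`, expressed as a percentage. The sketch below recomputes each one from the listed output prices. The function name `output_savings_pct` is an illustrative choice, and the figures cover output tokens only, so real savings depend on your input/output mix.

```python
def output_savings_pct(from_price: float, to_price: float) -> float:
    """Percent saved on output tokens when switching models, to one decimal."""
    if from_price <= 0:
        raise ValueError("from_price must be positive")
    return round((1 - to_price / from_price) * 100, 1)


# Output prices in USD per 1M tokens, copied from the switch table above.
SWITCHES = [
    ("GPT-5.5 Pro", 180.00, "DeepSeek V4 Flash", 0.28),
    ("GPT-5.5", 30.00, "DeepSeek V4 Flash", 0.28),
    ("Claude Opus 4.8", 25.00, "Gemini 2.5 Flash-Lite", 0.40),
    ("Claude Sonnet 5", 15.00, "Mistral Large 3", 1.50),
    ("GPT-5", 10.00, "DeepSeek V4 Pro", 0.87),
]

for src, src_price, dst, dst_price in SWITCHES:
    pct = output_savings_pct(src_price, dst_price)
    print(f"{src} -> {dst}: save {pct}%")
```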
| Model | Status | Details | Replacement |
|---|---|---|---|
| Claude Sonnet 5 | ✨ New | Added Jul 2026. $3/$15 per 1M tokens. 1M context window. | — |
| Claude Sonnet 4.6 | ⚠ Deprecated | Deprecated Jun 30, 2026. Same pricing as successor. | Claude Sonnet 5 |
| Claude 4 Opus | ⚠ Deprecated | Deprecated Jun 15, 2026. Was $15/$75 per 1M tokens, 3× the price of its successor. | Claude Opus 4.8 |
| Gemini 2.0 Flash | ⚠ Deprecated | Replaced by newer Flash model with better performance. | Gemini 3 Flash |
| DeepSeek V3 | ⚠ Deprecated | Replaced by V4 Flash at lower cost. | DeepSeek V4 Flash |
Cheapest model per provider, sorted by input cost.
| Provider | Cheapest Model | Input / 1M | Output / 1M | Context |
|---|---|---|---|---|
| Google | Gemini 2.0 Flash Lite | $0.075 | $0.30 | 1M |
| OpenAI | GPT-oss 20B | $0.08 | $0.35 | 128K |
| Mistral | Mistral Small 4 | $0.10 | $0.30 | 128K |
| Meta | Llama 3.1 8B | $0.10 | $0.10 | 128K |
| DeepSeek | DeepSeek V4 Flash | $0.14 | $0.28 | 1M |
| Moonshot | Kimi K2.6 | $0.95 | $4.00 | 256K |
| Anthropic | Claude Haiku 4.5 | $1.00 | $5.00 | 200K |
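
If you keep your own price list, a per-provider view like the one above can be derived in a few lines. The sketch below is one possible approach, not how this report was built: it keeps the lowest output price per provider and sorts the winners by that price. The `Model` type and `cheapest_per_provider` helper are hypothetical names, and the sample uses only output prices quoted elsewhere in this report, so it covers a subset of models.

```python
from typing import NamedTuple


class Model(NamedTuple):
    provider: str
    name: str
    output_per_1m: float  # USD per 1M output tokens


# Sample of output prices quoted in this report (not the full 49-model list).
MODELS = [
    Model("Google", "Gemini 2.0 Flash Lite", 0.30),
    Model("Google", "Gemini 2.5 Flash-Lite", 0.40),
    Model("OpenAI", "GPT-oss 20B", 0.35),
    Model("OpenAI", "GPT-5", 10.00),
    Model("Mistral", "Mistral Small 4", 0.30),
    Model("Mistral", "Mistral Large 3", 1.50),
    Model("Meta", "Llama 3.1 8B", 0.10),
    Model("DeepSeek", "DeepSeek V4 Flash", 0.28),
    Model("DeepSeek", "DeepSeek V4 Pro", 0.87),
    Model("Anthropic", "Claude Haiku 4.5", 5.00),
    Model("Anthropic", "Claude Sonnet 5", 15.00),
]


def cheapest_per_provider(models: list[Model]) -> list[Model]:
    """Keep each provider's lowest-output-price model, sorted by that price."""
    best: dict[str, Model] = {}
    for m in models:
        current = best.get(m.provider)
        if current is None or m.output_per_1m < current.output_per_1m:
            best[m.provider] = m
    return sorted(best.values(), key=lambda m: m.output_per_1m)


for m in cheapest_per_provider(MODELS):
    print(f"{m.provider}: {m.name} (${m.output_per_1m:.2f}/1M output)")
```

Sorting by output price puts Llama 3.1 8B first, whereas the table above is ordered by input price. Pick the sort key that matches the token mix of your workload.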
Use premium models (GPT-5.5, Opus 4.8) for complex reasoning and code generation. Use budget models (Flash, DeepSeek, Mistral Small) for classification, extraction, and simple Q&A. Splitting workloads across tiers typically saves 60-80% with no quality loss on routine tasks.
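
To see how tier splitting plays out, here is a minimal sketch that routes each task type to a premium or budget model and compares the monthly output-token bill against sending everything to the premium model. The task labels, the monthly volumes, and the choice of GPT-5.5 and DeepSeek V4 Flash as the two tiers are assumptions for illustration. Only the output prices come from this report, so input-token costs are left out.

```python
# Task types that budget models handle well, per the guidance above.
BUDGET_TASKS = {"classification", "extraction", "simple_qa"}

# Output prices in USD per 1M tokens, as listed in this report.
OUTPUT_PRICE_PER_1M = {"GPT-5.5": 30.00, "DeepSeek V4 Flash": 0.28}


def pick_model(task_type: str) -> str:
    """Route routine tasks to the budget tier; everything else goes premium."""
    return "DeepSeek V4 Flash" if task_type in BUDGET_TASKS else "GPT-5.5"


def split_cost(output_tokens_m: dict[str, float]) -> float:
    """Monthly output cost (USD) with tiered routing; volumes in millions."""
    return sum(OUTPUT_PRICE_PER_1M[pick_model(task)] * millions
               for task, millions in output_tokens_m.items())


def premium_only_cost(output_tokens_m: dict[str, float]) -> float:
    """Monthly output cost (USD) if every task goes to the premium model."""
    return OUTPUT_PRICE_PER_1M["GPT-5.5"] * sum(output_tokens_m.values())


# Hypothetical monthly workload, in millions of output tokens.
workload = {
    "code_generation": 30,
    "classification": 40,
    "extraction": 20,
    "simple_qa": 10,
}

tiered = split_cost(workload)
baseline = premium_only_cost(workload)
savings_pct = (1 - tiered / baseline) * 100
print(f"Premium only: ${baseline:,.2f}")
print(f"Tiered:       ${tiered:,.2f}")
print(f"Savings:      {savings_pct:.1f}%")
```

With 70% of output tokens going to routine tasks, this hypothetical workload saves about 69%, inside the 60-80% range quoted above. The result is driven almost entirely by the share of traffic that can safely move to the budget tier.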
Use the free Cost Monitoring Dashboard to log your API costs, spot trends, and set budgets. Or get APIpulse Pro for price alerts on all 49 models, PDF reports, and migration code.