AI API Pricing 2026: Every Model Ranked by Cost
42 models. 10 providers. From $0.075 to $180 per million tokens. Here's the definitive ranking of every AI API you can buy right now.
AI API pricing has become a maze. OpenAI alone has 9 models. Google has 7. Every provider has multiple tiers, and the "obvious" choice is rarely the cheapest. We track every model daily at APIpulse. Here's what the data says.
The Complete Ranking: Cheapest to Most Expensive
All prices are per million tokens. Active (non-deprecated) models only. Data verified Jun 21, 2026.
| # | Model | Provider | Tier | Input/M | Output/M | Context |
|---|---|---|---|---|---|---|
| 1 | Mistral Small 4 | Mistral | Budget | $0.10 | $0.30 | 128K |
| 2 | Gemini 2.5 Flash-Lite | Google | Budget | $0.10 | $0.40 | 1M |
| 3 | DeepSeek V4 Flash | DeepSeek | Budget | $0.14 | $0.28 | 1M |
| 4 | GPT-oss 20B | OpenAI | Budget | $0.08 | $0.35 | 128K |
| 5 | Llama 4 Scout | Meta (Together.ai) | Budget | $0.18 | $0.59 | 1M |
| 6 | GPT-oss 120B | OpenAI | Budget | $0.15 | $0.60 | 128K |
| 7 | GPT-4o mini | OpenAI | Budget | $0.15 | $0.60 | 128K |
| 8 | DeepSeek V3.2 | DeepSeek | Budget | $0.23 | $0.34 | 128K |
| 9 | Llama 4 Maverick | Meta (Together.ai) | Budget | $0.27 | $0.85 | 1M |
| 10 | DeepSeek V4 Pro | DeepSeek | Budget | $0.435 | $0.87 | 1M |
| 11 | GPT-5 mini | OpenAI | Budget | $0.25 | $2.00 | 272K |
| 12 | Gemini 3.1 Flash-Lite | Google | Budget | $0.25 | $1.50 | 1M |
| 13 | Gemini 3 Flash | Google | Budget | $0.50 | $3.00 | 1M |
| 14 | Mistral Large 3 | Mistral | Budget | $0.50 | $1.50 | 262K |
| 15 | Command R | Cohere | Budget | $0.50 | $1.50 | 128K |
| 16 | Grok Build 0.1 | xAI | Budget | $0.30 | $0.50 | 256K |
| 17 | Kimi K2.6 | Moonshot | Budget | $0.95 | $4.00 | 256K |
| 18 | Claude Haiku 4.5 | Anthropic | Mid | $1.00 | $5.00 | 200K |
| 19 | Gemini 2.5 Pro | Google | Mid | $1.25 | $10.00 | 1M |
| 20 | GPT-5 | OpenAI | Premium | $1.25 | $10.00 | 272K |
| 21 | Grok 4.3 | xAI | Mid | $1.25 | $2.50 | 1M |
| 22 | Mistral Medium 3.5 | Mistral | Mid | $1.50 | $7.50 | 128K |
| 23 | Gemini 3.5 Flash | Google | Mid | $1.50 | $9.00 | 1M |
| 24 | GPT-5.3 Codex | OpenAI | Mid | $1.75 | $14.00 | 400K |
| 25 | Gemini 3.1 Pro | Google | Mid | $2.00 | $12.00 | 1M |
| 26 | Jamba 1.7 Large | AI21 | Mid | $2.00 | $8.00 | 256K |
| 27 | GPT-4o | OpenAI | Mid | $2.50 | $10.00 | 128K |
| 28 | Command A | Cohere | Mid | $2.50 | $10.00 | 128K |
| 29 | Command R+ | Cohere | Mid | $2.50 | $10.00 | 128K |
| 30 | Claude Sonnet 4.6 | Anthropic | Mid | $3.00 | $15.00 | 1M |
| 31 | GPT-5.5 | OpenAI | Premium | $5.00 | $30.00 | 1.05M |
| 32 | Claude Opus 4.7 | Anthropic | Premium | $5.00 | $25.00 | 1M |
| 33 | Claude Opus 4.8 | Anthropic | Premium | $5.00 | $25.00 | 1M |
| 34 | GPT-5.5 Pro | OpenAI | Premium | $30.00 | $180.00 | 1.05M |
34 active models. 8 deprecated models (Claude 4 Opus, Sonnet 4, DeepSeek V3, Gemini 2.0 Flash/Lite, Jamba 1.5, Llama 3.1 variants) excluded. See full live dashboard →
Key Takeaways
1. Output pricing varies more than 600× across models
The gap between the cheapest output in the ranking (DeepSeek V4 Flash at $0.28/M) and the most expensive (GPT-5.5 Pro at $180/M) is about 643×. Input pricing varies less: 375×, from $0.08 to $30.00. If your workload is output-heavy (chatbots, content generation, code completion), model choice matters enormously.
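To see what that spread means for a real bill, here is a minimal sketch of the per-model arithmetic. The prices are copied from the ranking table; the function name `monthly_cost` and the workload figures (200M input and 100M output tokens per month) are illustrative assumptions, not measurements.

```python
# Prices in USD per million tokens (input, output), copied from the ranking table.
PRICES = {
    "DeepSeek V4 Flash": (0.14, 0.28),
    "Mistral Small 4": (0.10, 0.30),
    "GPT-5": (1.25, 10.00),
    "GPT-5.5 Pro": (30.00, 180.00),
}


def monthly_cost(model, input_tokens, output_tokens):
    """Return the monthly cost in USD for the given token volumes."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 200M input tokens and 100M output tokens per month.
for name in PRICES:
    print(f"{name}: ${monthly_cost(name, 200_000_000, 100_000_000):,.2f}")
```

Swap in your own token counts to compare models; the more output-heavy the workload, the more the output price dominates the total.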
2. DeepSeek is the value king
DeepSeek V4 Pro ($0.435/$0.87) delivers 1M context with competitive quality at 11.5× cheaper output than GPT-5. Even DeepSeek V4 Flash ($0.14/$0.28) handles many tasks well. For startups and high-volume applications, DeepSeek is the default budget choice.
3. Google's Flash models are underrated
Gemini 3 Flash ($0.50/$3.00) with 1M context is an excellent mid-range option. Google also has the lowest input price among the 1M-context models in the ranking: Gemini 2.5 Flash-Lite at $0.10/$0.40. For long-document processing, Google's pricing is hard to beat.
4. Premium doesn't mean 10× better
Claude Opus 4.8 ($5/$25) and GPT-5.5 ($5/$30) are the premium reasoning models. But for most production workloads, Claude Sonnet 4.6 ($3/$15) or Gemini 2.5 Pro ($1.25/$10) deliver 90% of the quality at 40-60% of the cost. Reserve premium models for tasks that genuinely need them.
5. Context window is a hidden cost factor
A 1M context window (DeepSeek, Gemini, Claude) means you can process entire codebases or documents in one API call. Models with 128K context (GPT-4o, Mistral Medium) may require chunking, which adds cost with every extra chunk, since each call repeats the prompt and produces its own output.
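To estimate the chunking overhead for a given document, the sketch below counts the calls needed and the input tokens billed. It assumes a deliberately simple model: every call repeats the same prompt, reserves room for its response, and carries one non-overlapping slice of the document. The function name and all parameters are hypothetical.

```python
import math


def chunked_input_tokens(doc_tokens, context_window, prompt_tokens, reserved_output_tokens):
    """Estimate calls needed and total input tokens billed when a document
    must be split to fit a context window.

    Assumes each call carries the shared prompt plus one slice of the
    document and leaves room for the response (no overlap between slices).
    """
    room = context_window - prompt_tokens - reserved_output_tokens
    if room <= 0:
        raise ValueError("prompt and reserved output leave no room for the document")
    calls = math.ceil(doc_tokens / room)
    # The document is billed once in total; the prompt is billed on every call.
    total_input = doc_tokens + calls * prompt_tokens
    return calls, total_input


# Hypothetical 600K-token document, 2K-token prompt, 4K tokens reserved for output.
print(chunked_input_tokens(600_000, 128_000, 2_000, 4_000))    # 128K window
print(chunked_input_tokens(600_000, 1_000_000, 2_000, 4_000))  # 1M window
```

Under these assumptions the extra spend comes from the repeated prompt and the per-chunk responses, plus any merge step needed to combine the partial results; overlapping chunks would raise it further.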
Best Model by Use Case
| Use Case | Best Model | Output/M | Why |
|---|---|---|---|
| High-volume chatbot | DeepSeek V4 Flash | $0.28 | Cheapest 1M context model. Great for customer support, FAQ bots |
| Code generation | Claude Sonnet 4.6 | $15.00 | Best code quality/price ratio. 1M context for full codebase analysis |
| Long document analysis | Gemini 2.5 Flash-Lite | $0.40 | 1M context at $0.10 input. Process entire books or legal docs cheaply |
| Complex reasoning | Claude Opus 4.8 | $25.00 | Top-tier reasoning. Worth the premium for research, analysis, planning |
| Content generation at scale | Mistral Large 3 | $1.50 | Good quality at budget price. Great for marketing copy, product descriptions |
| Startups / prototyping | GPT-5 mini | $2.00 | Good enough quality, fast, OpenAI ecosystem compatibility |
| Enterprise RAG pipelines | Gemini 3 Flash | $3.00 | 1M context + budget pricing. Process large document stores efficiently |
Provider Comparison
How do the big providers stack up on pricing?
| Provider | Models | Cheapest/M (out) | Most Expensive/M (out) | Max Context |
|---|---|---|---|---|
| OpenAI | 9 | $0.35 | $180.00 | 1.05M |
| Anthropic | 5 | $5.00 | $25.00 | 1M |
| Google | 7 | $0.30 | $12.00 | 1M |
| DeepSeek | 4 | $0.28 | $0.87 | 1M |
| Mistral | 3 | $0.30 | $7.50 | 262K |
| xAI | 2 | $0.50 | $2.50 | 1M |
| Cohere | 3 | $1.50 | $10.00 | 128K |
| Meta (Together.ai) | 4 | $0.10 | $0.88 | 1M |
| Moonshot | 1 | $4.00 | $4.00 | 256K |
| AI21 | 1 | $8.00 | $8.00 | 256K |
💡 Want to calculate your exact costs? Use our free API cost calculator: enter your token usage and see monthly costs across all 42 models. Or check the live pricing dashboard for real-time data.
How to Save 50-90% on Your AI API Bill
- Audit your model usage. Most teams use GPT-5 or Claude Sonnet for tasks where a budget model would work fine. Run a cost audit to see where money goes.
- Route by complexity. Use cheap models (DeepSeek V4 Flash, Mistral Small) for simple tasks. Reserve premium models (Opus, GPT-5.5) for complex reasoning only.
- Batch non-urgent work. Process documents, generate reports, and run analysis during off-peak hours with budget models.
- Monitor output token usage. In the ranking above, output costs roughly 1.5-8× as much as input. Short, focused prompts and concise responses save money. Set max_tokens limits.
- Compare before committing. Use our 232 comparison pages to find the cheapest model that meets your quality needs.
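The routing and max_tokens tips above can be combined in a small lookup table. This is a minimal sketch, not a production router: the task categories, the token caps, and the function names are hypothetical, and only the output prices come from the ranking table.

```python
# Hypothetical routing table: task category -> (model name, max output tokens).
ROUTES = {
    "simple": ("DeepSeek V4 Flash", 256),
    "standard": ("Claude Sonnet 4.6", 1024),
    "complex": ("Claude Opus 4.8", 4096),
}

# USD per million output tokens, copied from the ranking table.
OUTPUT_PRICE = {
    "DeepSeek V4 Flash": 0.28,
    "Claude Sonnet 4.6": 15.00,
    "Claude Opus 4.8": 25.00,
}


def route(task_category):
    """Return (model, max_tokens) for a task; unknown categories use the standard route."""
    return ROUTES.get(task_category, ROUTES["standard"])


def worst_case_output_cost(task_category, requests):
    """Upper bound on output spend in USD if every response hits its max_tokens cap."""
    model, max_tokens = route(task_category)
    return requests * max_tokens * OUTPUT_PRICE[model] / 1_000_000


# Hypothetical volume: one million requests per category.
for category in ROUTES:
    print(category, route(category), f"${worst_case_output_cost(category, 1_000_000):,.2f}")
```

Because max_tokens caps the response length, the worst-case figure gives a ceiling for budgeting; actual spend depends on how long the responses really are.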
Find Your Cheapest Model
Enter your usage. See exact monthly costs across all 42 models. Free, no signup.
Try the Cost Calculator →
FAQ
What is the cheapest AI API in 2026?
Mistral Small 4 ($0.10/$0.30 per million tokens) and Gemini 2.5 Flash-Lite ($0.10/$0.40) are among the cheapest active models. For the absolute cheapest, Gemini 2.0 Flash Lite ($0.075/$0.30) still exists, but it is deprecated and will be shut down soon.
How much does GPT-5 cost per million tokens?
GPT-5 costs $1.25/M input and $10.00/M output. This puts it in the premium tier for output pricing. DeepSeek V4 Pro ($0.87/M output) offers similar capability at 11× lower cost for many tasks.
Which AI API offers the best value for money?
For budget workloads, DeepSeek V4 Pro ($0.435/$0.87) is unbeatable: 1M context, competitive quality, 11× cheaper than GPT-5 on output. For quality-sensitive work, Claude Sonnet 4.6 ($3/$15) offers the best quality-to-price ratio among mid-tier models.
How much can I save by switching from GPT-5?
Switching to DeepSeek V4 Pro saves 91% on output costs, and switching to Gemini 3 Flash saves 70%. Even switching to GPT-5 mini saves 80% on output for simpler tasks. Claude Sonnet 4.6 is not a saving: at $15/M output it costs 50% more than GPT-5's $10/M.
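The percentages come from a one-line formula applied to the output prices in the ranking table. A minimal sketch, with a hypothetical function name; a negative result means the alternative costs more per output token than GPT-5.

```python
def output_savings_pct(current_price, new_price):
    """Percent saved on output cost; negative means the new model costs more."""
    return (current_price - new_price) / current_price * 100


# USD per million output tokens, copied from the ranking table.
GPT5_OUTPUT = 10.00
ALTERNATIVES = {
    "DeepSeek V4 Pro": 0.87,
    "Gemini 3 Flash": 3.00,
    "GPT-5 mini": 2.00,
    "Claude Sonnet 4.6": 15.00,
}

for name, price in ALTERNATIVES.items():
    print(f"{name}: {output_savings_pct(GPT5_OUTPUT, price):+.1f}%")
```

This only compares output prices; a full comparison would also weigh input prices and whether the cheaper model meets your quality bar.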
Are expensive models worth the premium?
For complex reasoning, research, and critical decision-making, yes: premium models (Opus 4.8, GPT-5.5) are measurably better. For 80% of production tasks (summarization, extraction, simple Q&A, content generation), mid-tier and budget models are sufficient.
Last updated: Jun 21, 2026. Prices verified against provider documentation. See live data →