LLM API Pricing Cheat Sheet: Every Model, Every Provider (April 2026)
Stop jumping between pricing pages. Here's every major LLM API priced side by side — input costs, output costs, context windows, and real cost-per-use examples. Bookmark this page and check back when providers update their rates.
Complete Pricing Table
All prices are per 1M tokens.
| Provider | Model | Input | Output | Context | Tier |
|---|---|---|---|---|---|
| OpenAI | GPT-4o | $2.50 | $10.00 | 128K | Premium |
| OpenAI | GPT-4o mini | $0.15 | $0.60 | 128K | Budget |
| Anthropic | Claude Sonnet 4 | $3.00 | $15.00 | 200K | Premium |
| Anthropic | Claude Haiku 4.5 | $1.00 | $5.00 | 200K | Budget |
| Google | Gemini 2.5 Pro | $1.25 | $10.00 | 1M | Premium |
| Google | Gemini 2.0 Flash | $0.10 | $0.40 | 1M | Budget |
| Mistral | Large 3 | $2.00 | $6.00 | 128K | Premium |
| Mistral | Small 4 | $0.10 | $0.30 | 32K | Budget |
| Cohere | Command R+ | $2.50 | $10.00 | 128K | Premium |
| Cohere | Command R | $0.15 | $0.60 | 128K | Budget |
| Meta (Together.ai) | Llama 3.1 70B | $0.88 | $0.88 | 128K | Budget |
| Meta (Together.ai) | Llama 3.1 8B | $0.18 | $0.18 | 128K | Budget |
| AI21 | Jamba 1.5 Large | $2.00 | $8.00 | 256K | Premium |
Cheapest Models by Tier
Budget Tier ($1/M input or less)
Premium Tier (Over $1/M input)
Real-World Cost Examples
Here's what you'd actually pay for common workloads. Unless a heading says otherwise, each example assumes 1,000 requests/day with 500 input tokens and 200 output tokens per request.
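The arithmetic behind these estimates is simple enough to sketch. The snippet below is a minimal illustration, not the calculator's actual implementation: it takes a few rows from the pricing table and applies the 1,000 requests/day, 500 input / 200 output token workload. The `monthly_cost` helper and the 30-day month are assumptions made for the example.

```python
# Prices in USD per 1M tokens (input, output), copied from the pricing table.
PRICES = {
    "GPT-4o": (2.50, 10.00),
    "GPT-4o mini": (0.15, 0.60),
    "Claude Sonnet 4": (3.00, 15.00),
    "Gemini 2.0 Flash": (0.10, 0.40),
}


def monthly_cost(model, requests_per_day=1000, input_tokens=500,
                 output_tokens=200, days=30):
    """Estimated monthly cost in USD (hypothetical helper, 30-day month assumed)."""
    in_price, out_price = PRICES[model]
    # Cost of one request: tokens times the per-token price (price / 1M).
    per_request = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    return per_request * requests_per_day * days


for model in PRICES:
    print(f"{model}: ${monthly_cost(model):.2f}/month")
```

Under these assumptions GPT-4o comes to about $97.50 a month and GPT-4o mini to about $5.85. Changing `input_tokens`, `output_tokens`, or `requests_per_day` adapts the estimate to a different workload.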
Chatbot (1K requests/day)
Code Generation (1K requests/day)
Document Analysis (100 requests/day)
Context Window Comparison
| Context Window | Models | Best For |
|---|---|---|
| 32K | Mistral Small 4 | Short prompts, classification, simple Q&A |
| 128K | GPT-4o, GPT-4o mini, Mistral Large 3, Cohere Command R/R+, Llama 3.1 | Most use cases, multi-turn chat, code generation |
| 200K | Claude Sonnet 4, Claude Haiku 4.5 | Long documents, large codebases, book-length analysis |
| 256K | AI21 Jamba 1.5 Large | Very long documents, legal contracts, research papers |
| 1M | Gemini 2.5 Pro, Gemini 2.0 Flash | Entire codebases, video analysis, massive datasets |
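One way to use the context table is to filter out models that cannot hold your prompt and then rank the rest by price. The sketch below shows that idea with a handful of rows from this page. The `Model` type and `cheapest_fit` function are hypothetical names, "K" is treated as 1,000 tokens (an assumption, since exact limits vary by provider), and the ranking uses input price first and output price as a tie-breaker.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens
    context: int         # context window in tokens (K taken as 1,000)


# A subset of the models listed on this page.
MODELS = [
    Model("Mistral Small 4", 0.10, 0.30, 32_000),
    Model("GPT-4o mini", 0.15, 0.60, 128_000),
    Model("Claude Haiku 4.5", 1.00, 5.00, 200_000),
    Model("AI21 Jamba 1.5 Large", 2.00, 8.00, 256_000),
    Model("Gemini 2.0 Flash", 0.10, 0.40, 1_000_000),
]


def cheapest_fit(required_tokens: int) -> Optional[Model]:
    """Return the cheapest model whose window holds required_tokens, or None."""
    candidates = [m for m in MODELS if m.context >= required_tokens]
    # Rank by input price, then output price as a tie-breaker.
    return min(candidates, key=lambda m: (m.input_price, m.output_price),
               default=None)


print(cheapest_fit(20_000).name)
print(cheapest_fit(100_000).name)
```

`required_tokens` should cover the prompt plus the expected response, since both count against the window. Price is only one criterion here; quality requirements may rule out the cheapest fit.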
Quick Decision Guide
- Cheapest overall: Mistral Small 4 ($0.10/$0.30) — but only 32K context
- Cheapest with decent context: Gemini 2.0 Flash ($0.10/$0.40) — 1M context at budget price
- Best quality per dollar (premium): Gemini 2.5 Pro ($1.25/$10.00), the lowest input price in the premium tier and the only premium model here with a 1M context
- Best for code: Claude Sonnet 4 ($3.00/$15.00) — strongest coding benchmarks
- Best for chat: GPT-4o ($2.50/$10.00) — most natural conversation
- Best open-weight option: Llama 3.1 70B via Together.ai ($0.88/$0.88), with identical input and output prices
- Best for long documents: Gemini 2.5 Pro, whose 1M context window can hold most documents without chunking
How to Use This Data
Don't just pick the cheapest model. Use the APIpulse Calculator to model your specific usage pattern. The right model depends on your input/output ratio, request volume, and quality requirements.
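To see why the input/output ratio matters, compare two premium rows from the table: Gemini 2.5 Pro ($1.25/$10.00) and Mistral Large ($2.00/$6.00). The sketch below is illustrative only; the two token mixes are made-up workloads and `cost_per_request` is a hypothetical helper.

```python
def cost_per_request(in_price, out_price, input_tokens, output_tokens):
    """Cost in USD of one request, given prices per 1M tokens."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


GEMINI_25_PRO = (1.25, 10.00)  # (input, output) from the pricing table
MISTRAL_LARGE = (2.00, 6.00)

# Two made-up workloads: (input tokens, output tokens) per request.
workloads = {
    "input-heavy (2,000 in / 100 out)": (2000, 100),
    "output-heavy (200 in / 1,000 out)": (200, 1000),
}

for label, (tokens_in, tokens_out) in workloads.items():
    gemini = cost_per_request(*GEMINI_25_PRO, tokens_in, tokens_out)
    mistral = cost_per_request(*MISTRAL_LARGE, tokens_in, tokens_out)
    cheaper = "Gemini 2.5 Pro" if gemini < mistral else "Mistral Large"
    print(f"{label}: Gemini ${gemini:.5f}, Mistral ${mistral:.5f} -> {cheaper}")
```

With these numbers the ranking flips: Gemini 2.5 Pro is cheaper on the input-heavy mix, Mistral Large on the output-heavy one. Setting the two cost formulas equal puts the break-even at roughly 0.19 output tokens per input token.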
A model that costs 5x more but produces results that need no editing can actually be cheaper than a budget model that requires human review.
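A rough way to check this trade-off is to add the expected cost of human review to the API cost. Every number below is a hypothetical placeholder (the per-request API costs, the 10% review rate, and the $0.50 cost per review), and `effective_cost` is an illustrative helper; substitute your own measurements.

```python
def effective_cost(api_cost, review_rate, review_cost):
    """API cost per request plus the expected cost of human review.

    review_rate is the fraction of outputs that need a human pass (0.0 to 1.0);
    review_cost is the cost in USD of one such pass.
    """
    return api_cost + review_rate * review_cost


# Hypothetical inputs: the premium model costs 5x more per request,
# but its output is assumed to need no editing.
budget = effective_cost(api_cost=0.001, review_rate=0.10, review_cost=0.50)
premium = effective_cost(api_cost=0.005, review_rate=0.0, review_cost=0.50)

print(f"budget:  ${budget:.4f} per request")
print(f"premium: ${premium:.4f} per request")
```

With these placeholder values the budget model's effective cost is about ten times the premium model's, because review dominates the API bill. The comparison reverses if the review rate is low enough or reviews are cheap.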
Calculate your exact monthly cost with your real usage numbers.