LLM API Pricing Cheat Sheet: Every Model, Every Provider (April 2026)

Stop jumping between pricing pages. Here's every major LLM API priced side by side — input costs, output costs, context windows, and real cost-per-use examples. Bookmark this page and check back when providers update their rates.

Complete Pricing Table

All prices are per 1M tokens.

Provider | Model | Input (per 1M) | Output (per 1M) | Context | Tier
OpenAI | GPT-4o | $2.50 | $10.00 | 128K | Premium
OpenAI | GPT-4o mini | $0.15 | $0.60 | 128K | Budget
Anthropic | Claude Sonnet 4 | $3.00 | $15.00 | 200K | Premium
Anthropic | Claude Haiku 4.5 | $1.00 | $5.00 | 200K | Budget
Google | Gemini 2.5 Pro | $1.25 | $10.00 | 1M | Premium
Google | Gemini 2.0 Flash | $0.10 | $0.40 | 1M | Budget
Mistral | Large | $2.00 | $6.00 | 128K | Premium
Mistral | Small | $0.10 | $0.30 | 32K | Budget
Cohere | Command R+ | $2.50 | $10.00 | 128K | Premium
Cohere | Command R | $0.15 | $0.60 | 128K | Budget
Meta (Together.ai) | Llama 3.1 70B | $0.88 | $0.88 | 128K | Budget
Meta (Together.ai) | Llama 3.1 8B | $0.18 | $0.18 | 128K | Budget
AI21 | Jamba 1.5 Large | $2.00 | $8.00 | 256K | Premium

Cheapest Models by Tier

Budget Tier ($1/M input or less)

Ranked by total cost per 1M tokens (input + output), cheapest first
1. Llama 3.1 8B (Together) $0.36 total ($0.18 in / $0.18 out)
2. Mistral Small 4 $0.40 total ($0.10 in / $0.30 out)
3. Gemini 2.0 Flash $0.50 total ($0.10 in / $0.40 out)
4. GPT-4o mini $0.75 total ($0.15 in / $0.60 out)
5. Cohere Command R $0.75 total ($0.15 in / $0.60 out)
6. Llama 3.1 70B (Together) $1.76 total ($0.88 in / $0.88 out)
7. Claude Haiku 4.5 $6.00 total ($1.00 in / $5.00 out)

Premium Tier (Over $1/M input)

Ranked by total cost per 1M tokens (input + output), cheapest first
1. Mistral Large 3 $8.00 total ($2.00 in / $6.00 out)
2. AI21 Jamba 1.5 Large $10.00 total ($2.00 in / $8.00 out)
3. Gemini 2.5 Pro $11.25 total ($1.25 in / $10.00 out)
4. GPT-4o $12.50 total ($2.50 in / $10.00 out)
5. Cohere Command R+ $12.50 total ($2.50 in / $10.00 out)
6. Claude Sonnet 4 $18.00 total ($3.00 in / $15.00 out)
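
If you want to re-derive these rankings when prices change, a short script is enough. This is a minimal sketch, not an official tool: the `PRICES` dict and the `rank_by_total` helper are illustrative names, the numbers are copied by hand from the pricing table, and ties are broken alphabetically, which is an arbitrary choice.

```python
# Prices in USD per 1M tokens, copied by hand from the pricing table.
# Each value is (input price, output price, tier as labeled in the table).
PRICES = {
    "GPT-4o": (2.50, 10.00, "Premium"),
    "GPT-4o mini": (0.15, 0.60, "Budget"),
    "Claude Sonnet 4": (3.00, 15.00, "Premium"),
    "Claude Haiku 4.5": (1.00, 5.00, "Budget"),
    "Gemini 2.5 Pro": (1.25, 10.00, "Premium"),
    "Gemini 2.0 Flash": (0.10, 0.40, "Budget"),
    "Mistral Large 3": (2.00, 6.00, "Premium"),
    "Mistral Small 4": (0.10, 0.30, "Budget"),
    "Command R+": (2.50, 10.00, "Premium"),
    "Command R": (0.15, 0.60, "Budget"),
    "Llama 3.1 70B": (0.88, 0.88, "Budget"),
    "Llama 3.1 8B": (0.18, 0.18, "Budget"),
    "Jamba 1.5 Large": (2.00, 8.00, "Premium"),
}


def rank_by_total(tier):
    """Return (model, input + output price) pairs for one tier, cheapest first."""
    rows = [
        (name, round(inp + out, 2))
        for name, (inp, out, model_tier) in PRICES.items()
        if model_tier == tier
    ]
    # Sort by total price, then by name so that ties come out in a stable order.
    return sorted(rows, key=lambda row: (row[1], row[0]))


for tier in ("Budget", "Premium"):
    print(tier)
    for rank, (name, total) in enumerate(rank_by_total(tier), start=1):
        print(f"  {rank}. {name}: ${total:.2f}")
```

Summing input and output prices weights both equally, so treat the result as a rough ordering rather than a prediction of your bill.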

Real-World Cost Examples

Here's what you'd actually pay for common workloads. Each scenario below lists its own request volume and per-request token counts.

Chatbot (1K requests/day)

Monthly cost at 500 input + 200 output tokens per request (30-day month, prices from the table above)
Gemini 2.0 Flash $3.90/mo
GPT-4o mini $5.85/mo
Claude Haiku 4.5 $45.00/mo
GPT-4o $97.50/mo
Claude Sonnet 4 $135.00/mo
Budget pick: Gemini 2.0 Flash $3.90/mo
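
To check a monthly figure yourself, multiply out the tokens and apply the per-1M prices. The function below is a rough sketch under stated assumptions: `monthly_cost` is an illustrative name, the 30-day month is an assumption, and it ignores caching discounts, batch pricing, and free tiers.

```python
def monthly_cost(requests_per_day, input_tokens, output_tokens,
                 input_price, output_price, days=30):
    """Estimate monthly spend in USD.

    input_price and output_price are USD per 1M tokens.
    days=30 is an assumed month length; change it to match your billing period.
    """
    monthly_input = requests_per_day * input_tokens * days
    monthly_output = requests_per_day * output_tokens * days
    return (monthly_input * input_price + monthly_output * output_price) / 1_000_000


# Chatbot workload: 1,000 requests/day, 500 input + 200 output tokens each,
# priced at the table's Gemini 2.0 Flash rates ($0.10 in / $0.40 out).
cost = monthly_cost(1_000, 500, 200, input_price=0.10, output_price=0.40)
print(f"${cost:.2f}/mo")
```

Swap in any other row of the pricing table, or your own token counts, to compare models on the same workload.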

Code Generation (1K requests/day)

Monthly cost at 1,000 input + 500 output tokens per request (30-day month, prices from the table above)
Gemini 2.0 Flash $9.00/mo
Llama 3.1 70B $39.60/mo
GPT-4o $225.00/mo
Claude Sonnet 4 $315.00/mo
Budget pick: Gemini 2.0 Flash $9.00/mo

Document Analysis (100 requests/day)

Monthly cost at 10,000 input + 2,000 output tokens per request (30-day month, prices from the table above)
Gemini 2.0 Flash $5.40/mo
Gemini 2.5 Pro $97.50/mo
GPT-4o $135.00/mo
Claude Sonnet 4 $180.00/mo
Best value for long docs: Gemini 2.5 Pro $97.50/mo (1M context)

Context Window Comparison

Context Window | Models | Best For
32K | Mistral Small 4 | Short prompts, classification, simple Q&A
128K | GPT-4o, GPT-4o mini, Mistral Large 3, Cohere Command R/R+, Llama 3.1 | Most use cases, multi-turn chat, code generation
200K | Claude Sonnet 4, Claude Haiku 4.5 | Long documents, large codebases, book-length analysis
256K | AI21 Jamba 1.5 Large | Very long documents, legal contracts, research papers
1M | Gemini 2.5 Pro, Gemini 2.0 Flash | Entire codebases, video analysis, massive datasets
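
A quick way to use this comparison is to filter out models whose window is too small for your prompt. This is a minimal sketch with a few simplifications: `CONTEXT_WINDOWS` and `models_that_fit` are illustrative names, "K" is read as 1,000 tokens and "M" as 1,000,000, and it assumes the window must hold the prompt plus the expected output. Providers may also cap output length separately, which this does not model.

```python
# Context windows in tokens, copied by hand from the comparison above.
CONTEXT_WINDOWS = {
    "Mistral Small 4": 32_000,
    "GPT-4o": 128_000,
    "GPT-4o mini": 128_000,
    "Mistral Large 3": 128_000,
    "Cohere Command R": 128_000,
    "Cohere Command R+": 128_000,
    "Llama 3.1 70B": 128_000,
    "Llama 3.1 8B": 128_000,
    "Claude Sonnet 4": 200_000,
    "Claude Haiku 4.5": 200_000,
    "AI21 Jamba 1.5 Large": 256_000,
    "Gemini 2.5 Pro": 1_000_000,
    "Gemini 2.0 Flash": 1_000_000,
}


def models_that_fit(prompt_tokens, reserved_output_tokens=0):
    """Return, sorted by name, the models whose window holds prompt + output."""
    needed = prompt_tokens + reserved_output_tokens
    return sorted(
        name for name, window in CONTEXT_WINDOWS.items() if window >= needed
    )


# Example: a 150,000-token document with 2,000 tokens reserved for the answer.
for name in models_that_fit(150_000, reserved_output_tokens=2_000):
    print(name)
```

Token counts depend on each provider's tokenizer, so leave some headroom rather than sizing a prompt to the exact limit.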

Quick Decision Guide

How to Use This Data

Don't just pick the cheapest model. Use the APIpulse Calculator to model your specific usage pattern. The right model depends on your input/output ratio, request volume, and quality requirements.
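
To see why the input/output ratio matters, you can solve for the input share at which two models cost the same. The sketch below is illustrative only: `breakeven_input_share` is a made-up helper name, and it assumes flat per-token pricing with no volume tiers or caching discounts.

```python
def breakeven_input_share(in_a, out_a, in_b, out_b):
    """Input-token share (0.0 to 1.0) at which models A and B cost the same.

    Prices are USD per 1M tokens. The blended price at input share s is
    s * input_price + (1 - s) * output_price; setting A equal to B and
    solving for s gives the formula below.

    Returns None when there is no crossover between 0 and 1, meaning one
    model is cheaper (or both are equal) at every input/output mix.
    """
    denom = (out_a - in_a) - (out_b - in_b)
    if denom == 0:
        return None
    share = (out_a - out_b) / denom
    return share if 0 <= share <= 1 else None


# Gemini 2.5 Pro ($1.25 in / $10.00 out) vs. Mistral Large ($2.00 in / $6.00 out)
share = breakeven_input_share(1.25, 10.00, 2.00, 6.00)
print(f"Break-even input share: {share:.1%}")
```

With those two price pairs the break-even lands at roughly 84% input tokens: Gemini 2.5 Pro is the cheaper of the two only for workloads more input-heavy than that, and Mistral Large is cheaper below it.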

A model that costs 5x more but produces results that need no editing can actually be cheaper than a budget model that requires human review.
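
Here is one way to put numbers on that trade-off. Everything in this sketch is hypothetical: the function name, the per-request API costs, the review rate, and the cost of a human review are placeholders to replace with your own measurements.

```python
def cost_per_usable_result(api_cost_per_request, review_rate, review_cost):
    """Expected cost in USD of one usable output, including human review.

    review_rate: fraction of outputs that need a human pass (0.0 to 1.0).
    review_cost: cost in USD of one human review (an assumed figure).
    """
    return api_cost_per_request + review_rate * review_cost


# Hypothetical numbers, for illustration only.
# Budget model: cheap per request, but 20% of outputs need a $0.50 review.
budget = cost_per_usable_result(0.001, review_rate=0.20, review_cost=0.50)
# Premium model: 5x the API cost per request, no review needed.
premium = cost_per_usable_result(0.005, review_rate=0.00, review_cost=0.50)

print(f"budget:  ${budget:.4f} per usable result")
print(f"premium: ${premium:.4f} per usable result")
```

Under these made-up inputs the review cost dominates the API price, so the more expensive model comes out cheaper per usable result.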

Calculate your exact monthly cost with your real usage numbers.
