What changed in Q2 2026 AI API pricing?

Q2 2026 saw GPT-4o drop 67%, Mistral drop 75%, and new budget models from Google and OpenAI. Budget models are now production-viable.

Which AI API is cheapest in Q2 2026?

DeepSeek V4 Flash ($0.14/$0.28) and Gemini 2.5 Flash ($0.075/$0.30) are the cheapest options. Budget models have improved significantly in quality.

How do I find the cheapest model for my use case?

Use APIpulse's cost calculator to compare models based on your specific usage. Input your tokens per request and monthly volume for accurate comparisons.


Report April 25, 2026

LLM API Pricing Report Q2 2026: Every Model, Every Provider

⚠️ Deprecation alert: Claude 4 Opus and Claude Sonnet 4 retired on June 15, 2026. If you're using these models, see our migration guide for step-by-step instructions.


It's Q2 2026, and the LLM API market has never been more competitive. With 10 providers offering 42 models, prices have continued their downward trend while context windows have expanded dramatically. This report covers every model, every provider, and every price point — so you can make the right choice for your application.

Updated May 2, 2026: Prices have changed since this report was published. Grok 3 increased 10x, DeepSeek V4 Pro dropped 75%, Mistral Large 3 dropped 75%. See the May 2026 Pricing Shakeup and Pricing Changelog for the latest data.

42 Models Available · 10 Providers · 90% Avg. Price Drop Since 2023

The Complete Pricing Landscape

Here's every model available as of April 2026, organized by tier:

Premium Tier — Maximum Quality

Model	Provider	Input (per 1M tokens)	Output (per 1M tokens)	Context Window
Claude 4 Opus	Anthropic	$15.00	$75.00	200K
GPT-5	OpenAI	$1.25	$10.00	272K
Gemini 2.5 Pro	Google	$1.25	$10.00	1M
Claude Sonnet 4	Anthropic	$3.00	$15.00	200K
GPT-4o	OpenAI	$2.50	$10.00	128K
Mistral Large 3	Mistral	$2.00	$6.00	128K
Cohere Command R+	Cohere	$2.50	$10.00	128K
AI21 Jamba 1.5 Large	AI21	$2.00	$8.00	256K

Budget Tier — Maximum Value

Model	Provider	Input (per 1M tokens)	Output (per 1M tokens)	Context Window
Gemini 2.0 Flash	Google	$0.10	$0.40	1M
GPT-4o mini	OpenAI	$0.15	$0.60	128K
Claude Haiku 4.5	Anthropic	$1.00	$5.00	200K
Mistral Small 4	Mistral	$0.10	$0.30	32K
Cohere Command R	Cohere	$0.15	$0.60	128K
Llama 3.1 70B	Together.ai	$0.88	$0.88	128K
Llama 3.1 8B	Together.ai	$0.18	$0.18	128K
GPT-5 mini	OpenAI	$0.25	$2.00	272K

Key Changes Since Q1 2026

What's New This Quarter

GPT-5 launched: $1.25/$10 per 1M tokens, 272K context. Premium quality at a competitive price point.
GPT-5 mini launched: $0.25/$2.00 per 1M tokens, 272K context. Cheaper than GPT-4o with more context.
Claude 4 Opus: Anthropic's flagship at $15/$75. Best-in-class for complex reasoning.
Gemini 2.5 Pro: Google's premium model at $1.25/$10 with 1M context. Best value premium model.
Mistral Small 4: $0.10/$0.30, among the cheapest output tokens in this report (only Llama 3.1 8B, at $0.18, is lower).

Cheapest Model by Use Case

Chatbot (1K requests/day, 500 input + 200 output tokens each)

Monthly Cost Comparison

Model	Monthly Cost (30-day month)
Mistral Small 4	$3.30
Llama 3.1 8B	$3.78
Gemini 2.0 Flash	$3.90
GPT-4o mini	$5.85
Cohere Command R	$5.85
Claude Haiku 4.5	$45.00
GPT-4o	$97.50
Claude Sonnet 4	$135.00

Code Generation (100 requests/day, 1K input + 500 output tokens each)

Monthly Cost Comparison

Model	Monthly Cost (30-day month)
Gemini 2.0 Flash	$0.90
GPT-4o mini	$1.35
Claude Haiku 4.5	$10.50
GPT-5	$18.75
GPT-4o	$22.50
Claude Sonnet 4	$31.50
Claude 4 Opus	$157.50
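The monthly figures in these comparisons can be reproduced from three inputs: request volume, tokens per request, and the per-1M-token prices in the tables above. The sketch below is one way to do that arithmetic. The function name `monthly_cost` and the 30-day month are assumptions made for illustration, not APIpulse's actual calculator.

```python
def monthly_cost(requests_per_day, input_tokens, output_tokens,
                 input_price, output_price, days_per_month=30):
    """Estimate monthly spend in USD.

    input_price and output_price are USD per 1M tokens, as listed in the
    pricing tables. days_per_month=30 is an assumption of this sketch.
    """
    requests = requests_per_day * days_per_month
    input_cost = requests * input_tokens / 1_000_000 * input_price
    output_cost = requests * output_tokens / 1_000_000 * output_price
    return input_cost + output_cost


# Code generation workload: 100 requests/day, 1K input + 500 output tokens.
# GPT-5 is priced at $1.25 input / $10.00 output per 1M tokens.
gpt5_codegen = monthly_cost(100, 1_000, 500, 1.25, 10.00)
print(f"GPT-5, code generation: ${gpt5_codegen:.2f}/mo")  # $18.75/mo
```

Swapping in another model's prices, or another workload's volume and token counts, gives the corresponding monthly estimate.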

Document Analysis (50 requests/day, 10K input + 2K output tokens each)

Monthly Cost Comparison

Model	Monthly Cost (30-day month)
Gemini 2.0 Flash	$2.70
GPT-4o mini	$4.05
Gemini 2.5 Pro	$48.75
GPT-5	$48.75
GPT-4o	$67.50
Claude Sonnet 4	$90.00
Claude 4 Opus	$450.00

Provider Scorecard

Provider	Cheapest Model	Best Premium	Max Context	Best For
Google	Gemini 2.0 Flash ($0.10/$0.40)	Gemini 2.5 Pro ($1.25/$10)	1M tokens	Best value, longest context
OpenAI	GPT-4o mini ($0.15/$0.60)	GPT-5 ($1.25/$10)	272K tokens	Ecosystem, tool use, vision
Anthropic	Claude Haiku 4.5 ($1.00/$5)	Claude 4 Opus ($15/$75)	200K tokens	Code generation, reasoning
Mistral	Mistral Small 4 ($0.10/$0.30)	Mistral Large 3 ($2/$6)	128K tokens	Low-cost output, European provider
Cohere	Command R ($0.15/$0.60)	Command R+ ($2.50/$10)	128K tokens	RAG, enterprise
Together.ai	Llama 3.1 8B ($0.18/$0.18)	Llama 3.1 70B ($0.88/$0.88)	128K tokens	Open source, symmetric pricing
AI21	—	Jamba 1.5 Large ($2/$8)	256K tokens	Long context, hybrid architecture

Context Window Comparison

Context Window	Models	Best For
1M tokens	Gemini 2.5 Pro, Gemini 2.0 Flash	Full codebase analysis, book-length documents
272K tokens	GPT-5, GPT-5 mini	Large document analysis, long conversations
256K tokens	AI21 Jamba 1.5 Large	Large document analysis, long conversations
200K tokens	Claude 4 Opus, Claude Sonnet 4, Claude Haiku 4.5	Complex multi-file tasks, extended reasoning
128K tokens	GPT-4o, GPT-4o mini, Mistral Large 3, Cohere, Llama	Most applications, chatbots, code generation
32K tokens	Mistral Small 4	Short-form tasks, classification, simple Q&A

Recommendations

For Startups (Under $50/month budget)

Start with Gemini 2.0 Flash for most tasks. It ties for the lowest input price in this report ($0.10 per 1M tokens) and has the largest context window (1M tokens). Use GPT-4o mini as a fallback for tasks that depend on OpenAI's ecosystem (function calling, vision). Total cost: $2-10/month for typical startup usage.

For Growing Companies ($50-500/month budget)

Use a tiered approach: Gemini 2.0 Flash for high-volume simple tasks, Claude Sonnet 4 or GPT-4o for complex reasoning, and Claude 4 Opus only for the most demanding tasks. This hybrid strategy can save 60-80% compared to using a single premium model.
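To see where a savings figure like this can come from, here is a rough sketch that compares an all-premium setup with a tiered split. It assumes the chatbot workload from earlier (1K requests/day, 500 input + 200 output tokens), the listed prices for Gemini 2.0 Flash and Claude Sonnet 4, and a 30-day month. The premium traffic shares are hypothetical; your own split will depend on how many requests genuinely need the stronger model.

```python
def monthly_cost(requests_per_day, in_tok, out_tok, in_price, out_price, days=30):
    """Monthly spend in USD; prices are USD per 1M tokens."""
    requests = requests_per_day * days
    return requests * (in_tok * in_price + out_tok * out_price) / 1_000_000


def blended_savings(premium_share, cheap_cost, premium_cost):
    """Fraction saved versus sending 100% of traffic to the premium model.

    premium_share is the fraction of requests still routed to the premium
    model; the remainder goes to the cheap model.
    """
    blended = (1 - premium_share) * cheap_cost + premium_share * premium_cost
    return 1 - blended / premium_cost


# Chatbot workload priced on each model alone.
flash = monthly_cost(1_000, 500, 200, 0.10, 0.40)    # Gemini 2.0 Flash
sonnet = monthly_cost(1_000, 500, 200, 3.00, 15.00)  # Claude Sonnet 4

# Hypothetical premium shares: 20%, 25%, and 35% of requests.
for share in (0.20, 0.25, 0.35):
    saved = blended_savings(share, flash, sonnet)
    print(f"{share:.0%} of traffic on premium: {saved:.0%} saved")
```

Under these assumptions, savings fall in the 60-80% range when roughly 20-35% of traffic still goes to the premium model. The more requests the budget model can handle, the larger the saving.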

For Enterprise (500+ developers)

Negotiate volume discounts with multiple providers. Use Gemini 2.5 Pro for document-heavy workloads (1M context), Claude 4 Opus for code review and complex reasoning, and GPT-5 for tool-use-heavy workflows. Implement model routing to automatically select the cheapest model for each task.
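Model routing can start as a simple rule: among the models that meet a task's minimum tier and fit its context, pick the one with the lowest per-request cost. The sketch below illustrates that idea with a handful of models and prices from the tables in this report. The `Model` class, the `route` function, the tier labels as a routing signal, and the rule that input plus output tokens must fit in the context window are all assumptions of this example, not a description of any provider's or APIpulse's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens
    context: int         # context window in tokens
    tier: str            # "budget" or "premium", as grouped in this report


# A subset of the models listed above (prices as of April 2026).
CATALOG = [
    Model("GPT-5", 1.25, 10.00, 272_000, "premium"),
    Model("Gemini 2.5 Pro", 1.25, 10.00, 1_000_000, "premium"),
    Model("Mistral Large 3", 2.00, 6.00, 128_000, "premium"),
    Model("Gemini 2.0 Flash", 0.10, 0.40, 1_000_000, "budget"),
    Model("GPT-4o mini", 0.15, 0.60, 128_000, "budget"),
    Model("Mistral Small 4", 0.10, 0.30, 32_000, "budget"),
]

TIER_RANK = {"budget": 0, "premium": 1}


def request_cost(model, input_tokens, output_tokens):
    """Cost in USD of a single request on the given model."""
    return (input_tokens * model.input_price
            + output_tokens * model.output_price) / 1_000_000


def route(input_tokens, output_tokens, min_tier="budget"):
    """Return the cheapest model that meets the tier and fits the request."""
    candidates = [
        m for m in CATALOG
        if TIER_RANK[m.tier] >= TIER_RANK[min_tier]
        and m.context >= input_tokens + output_tokens  # assumed fit rule
    ]
    if not candidates:
        raise ValueError("no model in the catalog fits this request")
    return min(candidates, key=lambda m: request_cost(m, input_tokens, output_tokens))


print(route(500, 200).name)                 # short chatbot turn
print(route(50_000, 2_000).name)            # too long for a 32K window
print(route(10_000, 2_000, "premium").name) # premium-only task
```

In practice the hard part is deciding which tier a task needs. A real router would replace the `min_tier` argument with a classifier or per-endpoint configuration, and would re-check prices regularly because they change often.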

