← Back to blog

Report April 30, 2026

State of AI API Pricing — Q2 2026 Report

The AI API market in Q2 2026 is the most competitive it's ever been. With 10 providers, 42 models, and prices ranging from $0.10 to $75 per million tokens, the choices — and the savings — are enormous.

This report covers every major provider, every pricing change since Q1, and the trends that will shape your costs for the rest of 2026.

Key Findings

Active Providers

Models Tracked

-23%

Avg Price Drop (Q1→Q2)

750x

Cheapest vs Most Expensive

Full Pricing Table — All 42 Models

Prices are per 1 million tokens. Sorted by input cost (cheapest first).

Model	Provider	Input	Output	Context	Q1→Q2
Mistral Small 4	Mistral	$0.10	$0.30	32K	—
Gemini 2.0 Flash	Google	$0.10	$0.40	1M	-50%
DeepSeek V4 Flash	DeepSeek	$0.14	$0.28	128K	NEW
GPT-4o mini	OpenAI	$0.15	$0.60	128K	—
Gemini 2.5 Flash	Google	$0.15	$0.60	1M	NEW
Claude Haiku 4.5	Anthropic	$1.00	$5.00	200K	—
Llama 4 Scout	Together	$0.10	$0.30	128K	NEW
Kimi K2.6	Moonshot	$0.95	$4.00	128K	NEW
Jamba 1.5 Mini	AI21	$0.20	$0.40	256K	—
Command R	Cohere	$0.15	$0.60	128K	—
Grok Build 0.1	xAI	$0.30	$0.50	256K	NEW
Mistral Large 3	Mistral	$0.50	$1.50	262K	-75%
DeepSeek V4 Pro	DeepSeek	$0.55	$2.19	128K	NEW
Command R+	Cohere	$2.50	$10.00	128K	—
GPT-4o	OpenAI	$2.50	$10.00	128K	—
Gemini 2.5 Pro	Google	$1.25	$10.00	1M	-67%
Claude Sonnet 4	Anthropic	$3.00	$15.00	200K	NEW
Grok 4.3	xAI	$1.25	$2.50	1M	-58%
GPT-5	OpenAI	$1.25	$10.00	272K	NEW
Llama 4 Maverick	Together	$0.30	$0.90	1M	NEW
Claude 4 Opus	Anthropic	$15.00	$75.00	200K	NEW
GPT-5.5	OpenAI	$5.00	$30.00	256K	NEW

Full pricing for all 42 models available on our Pricing Index Model Matrix page.

Q1 → Q2 Price Changes

The biggest story of Q2 2026 is price compression at the bottom. Budget models are getting cheaper and better, while premium models are holding steady or dropping slightly.

Biggest Price Drops

Gemini 2.0 Flash: -50% — Google's aggressive pricing on their fast model
Gemini 2.5 Pro: -67% — From $3.75 to $1.25 input, making it competitive with mid-tier models
DeepSeek V4 Flash: New entry at $0.14/$0.28 — cheapest model with 128K context

New Entries

GPT-5.5: OpenAI's latest flagship at $5/$30
Claude 4 Opus: Anthropic's premium at $15/$75 (the most expensive model)
Llama 4 Scout & Maverick: Meta's open-source models on Together.ai
Kimi K2.6: Moonshot's budget contender at $0.95/$4.00
Grok 4.3 & Build 0.1: xAI's models at $1.25/$2.50 and $0.30/$0.50

Provider Market Share

Based on our usage data and community surveys:

38%

OpenAI

25%

Anthropic

18%

Google

19%

Others

OpenAI still leads, but Anthropic and Google are gaining ground fast. DeepSeek and Together.ai are the fastest-growing providers, driven by aggressive pricing.

Cost per Use Case

Here's what you'll actually pay for common use cases, based on 100K requests/month:

Use Case	Cheapest Option	Best Value	Premium Pick
Chatbot	Mistral Small 4: $8/mo	GPT-4o mini: $15/mo	Claude Sonnet 4: $45/mo
Code Generation	GPT-4o mini: $15/mo	Claude Sonnet 4: $45/mo	GPT-5: $200/mo
Content Writing	Gemini Flash: $6/mo	GPT-4o: $75/mo	Claude Opus: $300/mo
Data Extraction	Mistral Small 4: $8/mo	GPT-4o mini: $15/mo	Gemini 2.5 Pro: $55/mo
Customer Support	DeepSeek Flash: $7/mo	GPT-4o mini: $15/mo	Claude Sonnet 4: $45/mo

Optimization Strategies That Work

Model routing: Route 80% of traffic to budget models, 20% to premium. Average savings: 45%.
Prompt compression: Reduce input tokens by 40% through better prompts. At scale, this saves $500+/month.
Response caching: Cache 20-30% of identical requests. Near-zero marginal cost.
Batch processing: Non-urgent tasks use batch APIs at 50% discount.
Multi-provider strategy: Use 2-3 providers and route based on price + quality.

The most expensive model is not always the best choice. A $0.15 model that works 90% of the time is cheaper than a $10 model that works 100% of the time — if you handle the 10% edge cases separately.

Predictions for H2 2026

Budget models will reach $0.05/input — DeepSeek and Llama 4 are driving prices down fast.
Premium models will stabilize — GPT-5, Claude 4 Opus, and GPT-5.5 won't drop much further.
Open-source will grow — Llama 4, Mistral, and new entrants will capture 30%+ of the market.
Model routing will become standard — Every serious AI app will use 2+ models.
Embedding costs will drop 50% — Competition in the embedding space is just starting.

Track these prices in real-time.

APIpulse tracks all 42 models across 10 providers. Get notified when prices change.

Try the Free Calculator

Want to save scenarios and export cost reports from this data?

APIpulse Pro lets you save up to 10 configurations, export professional HTML reports, and get personalized optimization recommendations.

Get Pro — $29 one-time

Get notified when API prices change

No spam. Only pricing updates and new features. Unsubscribe anytime.