What is the cheapest AI API in June 2026?

DeepSeek V4 Flash is the cheapest flagship-tier model at $0.14/$0.28 per million input/output tokens. Google's Gemini Flash Lite is close at $0.075/$0.30: cheaper on input, slightly pricier on output. For pure cost-per-token, open-weight models like Llama 3.1 8B ($0.10/$0.10) via Together AI are the cheapest for most workloads, but with lower capability.

How much cheaper is DeepSeek V4 compared to GPT-5?

DeepSeek V4 Pro is 65-91% cheaper than GPT-5, depending on the input/output mix. Input tokens: $0.435 vs $1.25 per million (65% cheaper). Output tokens: $0.87 vs $10.00 per million (91% cheaper). For output-heavy workloads like content generation, the savings sit near the top of that range.
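The range follows from the list prices quoted above. As a minimal sketch (the function names and the three example token mixes are illustrative assumptions, not part of any provider SDK), the saving for a given input/output mix can be computed like this:

```python
# Prices in USD per million tokens (input, output), from the figures quoted above.
DEEPSEEK_V4_PRO = (0.435, 0.87)
GPT_5 = (1.25, 10.00)


def cost(prices, input_tokens, output_tokens):
    """Return the USD cost of a workload at the given per-million prices."""
    input_price, output_price = prices
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def savings_pct(cheaper, pricier, input_tokens, output_tokens):
    """Percentage saved by running one workload on `cheaper` instead of `pricier`."""
    return (1 - cost(cheaper, input_tokens, output_tokens)
            / cost(pricier, input_tokens, output_tokens)) * 100


# Hypothetical mixes: input-only, balanced, and output-only.
for label, tokens in [("input only", (1_000_000, 0)),
                      ("balanced", (1_000_000, 1_000_000)),
                      ("output only", (0, 1_000_000))]:
    print(f"{label}: {savings_pct(DEEPSEEK_V4_PRO, GPT_5, *tokens):.1f}% cheaper")
```

Real workloads fall somewhere between the input-only and output-only extremes, so the saving depends on how output-heavy your traffic is.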

What happened to AI API pricing after the Claude 4 shutdown?

After Claude 4's June 15 shutdown, competitive pressure intensified. DeepSeek extended free-tier migration credits through June 30. Google dropped Gemini Flash pricing 15%. OpenAI kept GPT-5 pricing stable but added batch processing discounts. The net effect: average cost-per-token across the industry dropped ~12% in the week following the shutdown.

Which AI API gives the best quality for the price?

For quality-per-dollar, Claude Opus 4.8 ($5/$25) and DeepSeek V4 Pro ($0.435/$0.87) lead their respective tiers. Opus 4.8 matches or exceeds Claude 4 quality at 67% lower cost. DeepSeek V4 Pro delivers 90% of GPT-5 quality at 65-91% lower cost. For budget workloads, Gemini 2.5 Flash ($0.10/$0.40) offers the best ratio.

How do I find the cheapest AI API for my specific use case?

Use the APIpulse Cost Calculator to model your exact token usage across all 42 models. Enter your monthly input/output tokens, and it calculates cost per provider. The Model Recommendation Engine asks 4 questions about your use case and recommends the top 3 models with projected monthly costs.

AI API Pricing After Claude 4: Every Provider Compared

The Claude 4 shutdown reshuffled the market. Here's where every major provider stands now — with real cost-per-request data for 42 models.

By APIpulse · June 18, 2026 · Updated with live pricing data

Key finding: The average cost-per-token across all providers dropped 12% in the week after Claude 4's shutdown. DeepSeek V4 Flash is now the cheapest flagship-tier model at $0.14/$0.28 per million tokens, about 99% cheaper than Claude 4 Opus was at $15/$75.

The Post-Claude 4 Landscape

When Anthropic retired Claude 4 on June 15, it didn't just remove a model from the market — it triggered a pricing war. Providers scrambled to capture the ~12,000 developers who needed new homes for their AI workloads.

Three days later, here's what changed:

DeepSeek extended free-tier migration credits through June 30, and V4 Pro pricing stayed at $0.435/$0.87 — the best value in the market
Google dropped Gemini Flash pricing 15% to $0.10/$0.40, making it the cheapest "smart" model
OpenAI held GPT-5 pricing steady but added 50% batch processing discounts
Anthropic kept Opus 4.8 at $5/$25 — 67% cheaper than Claude 4 Opus was
Mistral launched Medium 3.5 at $0.50/$1.50, targeting EU/GDPR-conscious teams

Complete Pricing Comparison: All 42 Models

Here are the models we track, grouped into three approximate tiers and listed roughly by input cost per million tokens within each tier. Prices are in USD, verified June 18, 2026.

Budget Tier (Under $1/M output tokens)

Model	Provider	Input $/M	Output $/M	Context
Gemini Flash Lite	Google	$0.075	$0.30	1M
Llama 3.1 8B	Together AI	$0.10	$0.10	128K
Gemini 2.5 Flash	Google	$0.10	$0.40	1M
DeepSeek V4 Flash	DeepSeek	$0.14	$0.28	128K
GPT-4o Mini	OpenAI	$0.15	$0.60	128K
Mistral Small	Mistral	$0.15	$0.60	128K
GPT-OSS 120B	OpenAI	$0.15	$0.60	128K
Llama 4 Scout	Meta/Together	$0.18	$0.59	128K
GPT-5 Mini	OpenAI	$0.25	$2.00	128K
Llama 4 Maverick	Meta/Together	$0.27	$0.85	128K
DeepSeek V3	DeepSeek	$0.27	$1.10	128K

Mid Tier ($1–$10/M output tokens)

Model	Provider	Input $/M	Output $/M	Context
DeepSeek V4 Pro	DeepSeek	$0.435	$0.87	128K
Mistral Large 3	Mistral	$0.50	$1.50	128K
Cohere Command R	Cohere	$0.50	$1.50	128K
Anthropic Haiku 4.5	Anthropic	$1.00	$5.00	200K
Kimi K2.6	Moonshot	$0.95	$4.00	128K
GPT-5	OpenAI	$1.25	$10.00	128K
Gemini 2.5 Pro	Google	$1.25	$10.00	1M
xAI Grok 3	xAI	$1.25	$2.50	128K
Gemini 3 Pro	Google	$2.00	$12.00	1M
AI21 Jamba	AI21	$2.00	$8.00	256K
Cohere Command R+	Cohere	$2.50	$10.00	128K
GPT-4o	OpenAI	$2.50	$10.00	128K
Anthropic Sonnet 4.6	Anthropic	$3.00	$15.00	200K

Premium Tier ($10+/M output tokens)

Model	Provider	Input $/M	Output $/M	Context
Anthropic Opus 4.8	Anthropic	$5.00	$25.00	200K
OpenAI GPT-5.5	OpenAI	$5.00	$30.00	128K
OpenAI GPT-5.5 Pro	OpenAI	$30.00	$180.00	128K
OpenAI GPT-5.3 Codex	OpenAI	$1.75	$14.00	128K

The Real Cost: Per-Request Breakdown

Per-token pricing only tells half the story. Here's what a typical API call actually costs across common workload types:

Chatbot (1K input + 500 output tokens)
Model	Cost per request
DeepSeek V4 Flash	$0.00028
Gemini 2.5 Flash	$0.00030
GPT-5 Mini	$0.00125
GPT-5	$0.00625
Claude Opus 4.8	$0.01750

Code Generation (2K input + 2K output tokens)
Model	Cost per request
DeepSeek V4 Pro	$0.00261
Mistral Large 3	$0.00400
GPT-5	$0.02250
Claude Sonnet 4.6	$0.03600
Claude Opus 4.8	$0.06000

Document Analysis (10K input + 1K output tokens)
Model	Cost per request
Gemini Flash Lite	$0.00105
Gemini 2.5 Flash	$0.00140
DeepSeek V4 Flash	$0.00168
GPT-5 Mini	$0.00450
Claude Opus 4.8	$0.07500
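The per-request figures in this section can be reproduced from the per-million prices in the tables. A minimal sketch, assuming a hypothetical `request_cost` helper and a hand-copied subset of the price table:

```python
# USD per million tokens (input, output), copied from the pricing tables above.
PRICES = {
    "DeepSeek V4 Flash": (0.14, 0.28),
    "Gemini 2.5 Flash": (0.10, 0.40),
    "GPT-5 Mini": (0.25, 2.00),
    "GPT-5": (1.25, 10.00),
    "Claude Opus 4.8": (5.00, 25.00),
}


def request_cost(model, input_tokens, output_tokens):
    """USD cost of one request for `model` at the listed per-million prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Chatbot workload from this section: 1K input + 500 output tokens.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 1_000, 500):.5f}")
```

Swapping in your own token counts (for example 2,000 and 2,000 for the code generation case) gives the matching figure for any model you add to `PRICES`.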

5 Biggest Pricing Shifts Since Claude 4's Shutdown

1. DeepSeek Is the Clear Budget Winner

At $0.435/$0.87 per million tokens, DeepSeek V4 Pro delivers 90% of GPT-5's quality at 65-91% lower cost. The Flash variant ($0.14/$0.28) is even cheaper for simpler tasks. With free migration credits through June 30, there's little cost to trying it.

2. Google Got Aggressive on Flash Pricing

Gemini 2.5 Flash at $0.10/$0.40 is now one of the cheapest models with a 1M token context window; only Gemini Flash Lite ($0.075/$0.30) undercuts it. For document-heavy workloads (RAG, analysis, summarization), few alternatives match it on both price and context size.

3. Anthropic's Opus 4.8 Is a Steal vs. Claude 4

Claude 4 Opus cost $15/$75 per million tokens. Opus 4.8 is $5/$25 — a 67% price cut with comparable or better quality. If you were happy with Claude 4, Opus 4.8 is the obvious successor at a fraction of the cost.

4. OpenAI Held Steady, but Batch Discounts Changed the Game

GPT-5 pricing ($1.25/$10.00) hasn't moved, but the new 50% batch processing discount means async workloads (bulk analysis, content generation, data processing) effectively cost $0.625/$5.00 — making GPT-5 competitive with mid-tier models for non-real-time use cases.
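Most teams can only move part of their traffic to batch, so the effective price lands somewhere between the list and batch rates. A rough sketch, assuming the 50% discount applies equally to input and output tokens (which the $0.625/$5.00 figures imply); `batch_share` and the monthly volumes are hypothetical:

```python
GPT_5 = (1.25, 10.00)  # USD per million tokens (input, output), from the article


def blended_cost(prices, input_tokens, output_tokens, batch_share, discount=0.50):
    """USD cost when `batch_share` (0..1) of traffic runs at the batch discount."""
    input_price, output_price = prices
    full = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    # The batched share of traffic pays (1 - discount); the rest pays list price.
    return full * (1 - batch_share * discount)


# Hypothetical month: 100M input and 20M output tokens.
for share in (0.0, 0.5, 1.0):
    print(f"{share:.0%} batched: ${blended_cost(GPT_5, 100e6, 20e6, share):,.2f}")
```

The function assumes batched and real-time traffic have the same input/output mix; if your async jobs are more output-heavy, compute the two portions separately.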

5. The Hidden Cost: Context Windows

Don't just compare per-token prices. A model with a 1M context window (Gemini) lets you send roughly 8x more data per request than a 128K model (most others). For RAG and document analysis, fewer requests means less repeated prompt overhead, which can lower total cost even at a higher per-token rate.
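How much the larger window helps depends on how much fixed overhead each request repeats. The sketch below is one way to estimate it, under stated assumptions: every request repeats `overhead_tokens` of instructions, output tokens count against the context window, and "1M" and "128K" are treated as 1,000,000 and 128,000 tokens. The corpus size and overhead values are hypothetical.

```python
import math


def corpus_cost(corpus_tokens, context_window, prices,
                overhead_tokens=2_000, output_tokens=1_000):
    """Estimate request count and USD cost to process a corpus in chunks.

    Assumes each request repeats `overhead_tokens` of instructions and
    returns `output_tokens`, and that both count against the context window.
    """
    input_price, output_price = prices
    usable = context_window - overhead_tokens - output_tokens
    requests = math.ceil(corpus_tokens / usable)
    total_input = corpus_tokens + requests * overhead_tokens
    total_output = requests * output_tokens
    cost = (total_input * input_price + total_output * output_price) / 1_000_000
    return requests, cost


# Hypothetical 5M-token corpus; prices are (input, output) USD per million tokens.
print(corpus_cost(5_000_000, 1_000_000, (0.10, 0.40)))  # Gemini 2.5 Flash, 1M context
print(corpus_cost(5_000_000, 128_000, (0.14, 0.28)))    # DeepSeek V4 Flash, 128K context
```

With small per-request overhead the per-token price still dominates the total; the request-count effect grows as the repeated instructions or per-request output get larger.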

Which Provider Should You Choose?

Budget-conscious? → DeepSeek V4
Best absolute pricing. Pro ($0.435/$0.87) for quality, Flash ($0.14/$0.28) for volume.

Need the smartest model? → Claude Opus 4.8
Top-tier reasoning at $5/$25. Worth it for complex code generation, analysis, and decision-making.

High-volume, low-complexity? → Gemini 2.5 Flash
$0.10/$0.40 with 1M context. Perfect for chatbots, classification, and simple extraction.

Already in the OpenAI ecosystem? → GPT-5 with batch
Stay put and use batch processing for 50% off. GPT-5 Mini ($0.25/$2.00) for simpler tasks.

EU/GDPR requirements? → Mistral Large 3
French-hosted, competitive pricing ($0.50/$1.50), strong multilingual support.

How to Find YOUR Cheapest Option

Generic comparisons help, but your actual cost depends on your specific workload — how many tokens you send, how many you receive, which capabilities you need.

Three ways to find your optimal model:

Cost Calculator — Enter your monthly token usage, see cost across all 42 models instantly
Model Recommendation Engine — Answer 4 questions about your use case, get personalized top-3 picks
Cost Audit — Paste your code, find deprecated model references and optimization opportunities
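For a quick estimate before reaching for a tool, the same comparison can be sketched in a few lines of Python. This is an illustrative calculation only, not how the APIpulse tools are implemented; the `rank_by_monthly_cost` helper, the model subset, and the monthly volumes are assumptions.

```python
# A few (input, output) USD-per-million prices copied from the tables above.
PRICES = {
    "Gemini Flash Lite": (0.075, 0.30),
    "DeepSeek V4 Flash": (0.14, 0.28),
    "DeepSeek V4 Pro": (0.435, 0.87),
    "GPT-5": (1.25, 10.00),
    "Claude Opus 4.8": (5.00, 25.00),
}


def rank_by_monthly_cost(input_tokens, output_tokens, prices=PRICES):
    """Return (model, monthly USD cost) pairs, cheapest first."""
    costs = {
        model: (input_tokens * inp + output_tokens * out) / 1_000_000
        for model, (inp, out) in prices.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


# Hypothetical workload: 50M input and 10M output tokens per month.
for model, usd in rank_by_monthly_cost(50_000_000, 10_000_000):
    print(f"{model}: ${usd:,.2f}/month")
```

This ranks purely on price; capability, latency, and context limits still have to be weighed separately.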

Stop Guessing. Start Saving.

See exactly how much you could save by switching providers. Our calculator compares all 42 models in real-time.

Calculate Your Costs →

Free. No account required. Instant results.
