📅 May 30, 2026 ⏱️ 9 min read 🏷️ Pricing Report

AI API Pricing June 2026: Complete Guide to All 34 Models

34 models. 10 providers. Prices from $0.075/M to $180/M. Two models retiring June 15. Here's everything you need to know about AI API pricing right now.

June 2026 marks a turning point in AI API pricing. Budget models now start under $0.10 per million tokens. Two legacy Anthropic models are being retired mid-month. And the gap between mid-tier and premium quality has narrowed to the point where most teams can save 60-80% by choosing wisely.

This guide covers every AI API model's current pricing, what's changing in June, the models you should migrate away from, and where the best deals are right now.

  • 34 models available
  • 10 providers
  • $0.075 per 1M input tokens at the cheapest
  • 2 models retiring June 15

⚠️ Deprecation Alert: 2 Models Retiring June 15, 2026

Claude 4 Opus ($15/$75 per 1M tokens) and Claude Sonnet 4 ($3/$15 per 1M tokens) are being retired on June 15, 2026. If you're using either model, migrate now:

  • Claude 4 Opus → Claude Opus 4.7 or 4.8 ($5/$25) — 67% cheaper, better quality, 1M context
  • Claude Sonnet 4 → Claude Sonnet 4.6 ($3/$15) — same price, better quality, 1M context

Complete Pricing: All 34 Models

Every major AI API model ranked by input price. All prices per 1 million tokens, verified May 29, 2026.

Budget Tier — Under $0.60/1M Input

These models handle most everyday tasks at rock-bottom prices. If you're building a chatbot, classifier, or content tool, start here.

| Model | Provider | Input / 1M | Output / 1M | Context |
| --- | --- | --- | --- | --- |
| Gemini 2.0 Flash Lite | Google | $0.075 | $0.30 | 1M |
| GPT-oss 20B | OpenAI | $0.08 | $0.35 | 128K |
| Llama 3.1 8B | Meta (Together.ai) | $0.10 | $0.10 | 128K |
| Gemini 2.0 Flash | Google | $0.10 | $0.40 | 1M |
| Llama 4 Scout | Meta (Together.ai) | $0.11 | $0.34 | 10M |
| DeepSeek V4 Flash | DeepSeek | $0.14 | $0.28 | 1M |
| GPT-4o mini | OpenAI | $0.15 | $0.60 | 128K |
| GPT-oss 120B | OpenAI | $0.15 | $0.60 | 128K |
| Mistral Small 4 | Mistral | $0.15 | $0.60 | 128K |
| Llama 4 Maverick | Meta (Together.ai) | $0.20 | $0.60 | 10M |
| GPT-5 mini | OpenAI | $0.25 | $2.00 | 272K |
| DeepSeek V3 | DeepSeek | $0.27 | $1.10 | 128K |
| Grok Build 0.1 | xAI | $0.30 | $0.50 | 256K |
| DeepSeek V4 Pro | DeepSeek | $0.44 | $0.87 | 1M |
| Mistral Large 3 | Mistral | $0.50 | $1.50 | 128K |
| Command R | Cohere | $0.50 | $1.50 | 128K |

Mid Tier — $0.60–$3.00/1M Input

The sweet spot for production workloads. Strong reasoning at reasonable prices.

| Model | Provider | Input / 1M | Output / 1M | Context |
| --- | --- | --- | --- | --- |
| Llama 3.1 70B | Meta (Together.ai) | $0.88 | $0.88 | 128K |
| Kimi K2.6 | Moonshot | $0.95 | $4.00 | 256K |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | 200K |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | 1M |
| GPT-5 | OpenAI | $1.25 | $10.00 | 272K |
| Grok 4.3 | xAI | $1.25 | $2.50 | 1M |
| GPT-5.3 Codex | OpenAI | $1.75 | $14.00 | 400K |
| Gemini 3.1 Pro | Google | $2.00 | $12.00 | 1M |
| Jamba 1.5 Large | AI21 | $2.00 | $8.00 | 256K |
| GPT-4o | OpenAI | $2.50 | $10.00 | 128K |
| Command R+ | Cohere | $2.50 | $10.00 | 128K |
| Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | 1M |
| Claude Sonnet 4 ⚠️ (retiring June 15) | Anthropic | $3.00 | $15.00 | 200K |

Premium Tier — $5.00+/1M Input

For complex reasoning, code generation, and high-stakes tasks where quality is non-negotiable.

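| Model | Provider | Input / 1M | Output / 1M | Context |
| --- | --- | --- | --- | --- |
| Claude Opus 4.8 | Anthropic | $5.00 | $25.00 | 1M |
| Claude Opus 4.7 | Anthropic | $5.00 | $25.00 | 1M |
| GPT-5.5 | OpenAI | $5.00 | $30.00 | 1M |
| Claude 4 Opus ⚠️ (retiring June 15) | Anthropic | $15.00 | $75.00 | 200K |
| GPT-5.5 Pro | OpenAI | $30.00 | $180.00 | 1M |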

What Changed Since May 2026

1. Anthropic's Deprecation Wave

The biggest change in June: two Anthropic models are being retired on June 15. Claude 4 Opus and Claude Sonnet 4 are being sunset in favor of the newer Opus 4.7/4.8 and Sonnet 4.6. The replacements are the same price or cheaper, with larger context windows. There's no reason to stay on the deprecated models.

2. Budget Floor Keeps Dropping

Gemini 2.0 Flash Lite at $0.075/M remains the cheapest production-ready AI API. That's 7.5 cents per million input tokens — you'd need to process 133,000 tokens to spend a single penny. A year ago, the cheapest comparable model was $0.15/M.
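The tokens-per-penny figure is a one-line conversion. A minimal sketch, assuming a flat per-token rate with no minimum charge (the function name `tokens_per_cent` is ours, not a provider API):

```python
def tokens_per_cent(price_per_million_usd: float) -> float:
    """Input tokens that one US cent buys at a given USD price per 1M tokens."""
    return 0.01 / price_per_million_usd * 1_000_000


# Gemini 2.0 Flash Lite input price from the budget table.
print(round(tokens_per_cent(0.075)))  # 133333
# The year-ago floor of $0.15/M quoted above.
print(round(tokens_per_cent(0.15)))   # 66667
```

Halving the price doubles the tokens per penny, which is why the drop from $0.15/M to $0.075/M matters most for high-volume workloads.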

3. Context Windows Expanded

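Long context is now common: 14 of the 34 models in the tables above list a context window of 1M tokens or more. Twelve list 1M, and Llama 4 Scout and Llama 4 Maverick list 10M.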

4. Open Source Is Production-Ready

Meta's Llama 4 models via Together.ai are now legitimate production options. Llama 4 Scout at $0.11/M with a 10M context window is unmatched for long-document tasks. Llama 4 Maverick at $0.20/M handles general-purpose workloads well.

Best Deals by Use Case

| Use Case | Best Model | Why |
| --- | --- | --- |
| High-volume chatbot | Gemini 2.0 Flash Lite | Cheapest at $0.075/M, handles most chat tasks |
| Quality chatbot | Claude Haiku 4.5 | $1/M with Anthropic's quality |
| Code generation | Claude Sonnet 4.6 | Best code quality at $3/M, 1M context |
| Document analysis | Gemini 2.5 Pro | 1M context at $1.25/M |
| Classification / extraction | GPT-4o mini | $0.15/M, fast, reliable structured output |
| RAG / retrieval | DeepSeek V4 Flash | $0.14/M with 1M context |
| Content writing | GPT-5 mini | $0.25/M, strong writing at budget price |
| Complex reasoning | Claude Opus 4.7 | Best reasoning quality at $5/M |
| Agent / multi-step | GPT-5 | $1.25/M, strong tool use, 272K context |
| Long documents (up to 10M tokens) | Llama 4 Scout | Cheapest 10M-context model at $0.11/M |
| Budget all-around | DeepSeek V4 Pro | $0.44/M with 1M context; best value pick |
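If you want to shortlist on price and context alone, the selection can be sketched in a few lines. This is an illustration under stated assumptions, not a recommendation engine: it ignores quality entirely, treats "1M" as 1,000,000 tokens, uses only a handful of rows copied from the pricing tables, and the `Model` class and `cheapest_for` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens
    context: int         # maximum context window, in tokens


# A subset of rows from the pricing tables above.
MODELS = [
    Model("Gemini 2.0 Flash Lite", 0.075, 0.30, 1_000_000),
    Model("Llama 4 Scout", 0.11, 0.34, 10_000_000),
    Model("DeepSeek V4 Flash", 0.14, 0.28, 1_000_000),
    Model("GPT-4o mini", 0.15, 0.60, 128_000),
    Model("Claude Haiku 4.5", 1.00, 5.00, 200_000),
    Model("Claude Sonnet 4.6", 3.00, 15.00, 1_000_000),
]


def cheapest_for(min_context, output_share=0.5, models=MODELS):
    """Return the model with the lowest blended price whose context window
    is at least min_context tokens, or None if no model qualifies.

    output_share is the fraction of each request's tokens that are output.
    """
    candidates = [m for m in models if m.context >= min_context]
    if not candidates:
        return None

    def blended(m):
        return (1 - output_share) * m.input_price + output_share * m.output_price

    return min(candidates, key=blended)


print(cheapest_for(100_000).name)                    # Gemini 2.0 Flash Lite
print(cheapest_for(100_000, output_share=0.9).name)  # DeepSeek V4 Flash
print(cheapest_for(5_000_000).name)                  # Llama 4 Scout
```

Note how the pick changes once output dominates: DeepSeek V4 Flash has a higher input price than Gemini 2.0 Flash Lite but a lower output price, so it wins for output-heavy workloads.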

What $100/Month Gets You in June 2026

Assuming 1,000 tokens per request with a 50/50 input/output split (500 input tokens and 500 output tokens) and a 30-day month:

| Tier | Model | Requests for $100 | Daily Average |
| --- | --- | --- | --- |
| Budget | Gemini 2.0 Flash Lite | ~533,000 | ~17,800/day |
| Budget | DeepSeek V4 Flash | ~476,000 | ~15,900/day |
| Budget | Llama 4 Scout | ~444,000 | ~14,800/day |
| Mid | Claude Haiku 4.5 | ~33,300 | ~1,110/day |
| Mid | GPT-5 | ~17,800 | ~590/day |
| Mid | Claude Sonnet 4.6 | ~11,100 | ~370/day |
| Premium | Claude Opus 4.7 | ~6,700 | ~220/day |
| Premium | GPT-5.5 | ~5,700 | ~190/day |

The range: about 17,800 requests/day down to about 190 requests/day for the same $100 budget, a spread of more than 90x. Model selection is the single biggest cost lever you have.
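To rerun this estimate for your own traffic, the arithmetic can be sketched as below. It assumes list prices per 1M tokens, a fixed token count per request, and no batch or caching discounts; the function names are illustrative.

```python
def cost_per_request(input_price, output_price, input_tokens=500, output_tokens=500):
    """USD cost of one request, given prices in USD per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def requests_for_budget(budget_usd, input_price, output_price, **token_counts):
    """Whole number of requests a budget covers at the given prices."""
    return int(budget_usd / cost_per_request(input_price, output_price, **token_counts))


# DeepSeek V4 Flash: $0.14 input / $0.28 output per 1M tokens.
monthly = requests_for_budget(100, 0.14, 0.28)
print(monthly)        # 476190
print(monthly // 30)  # 15873 requests/day over a 30-day month
```

Pass `input_tokens` and `output_tokens` to model a different request shape. A RAG workload with 4,000 input tokens and 300 output tokens, for example, weights the input price far more heavily than the 50/50 split used here.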

Migration Guide: Models Retiring June 15

Migrating from Claude 4 Opus

Current: Claude 4 Opus — $15/$75 per 1M tokens, 200K context

Replace with:

  • Claude Opus 4.7 ($5/$25, 1M context) — 67% cheaper, better quality
  • Claude Opus 4.8 ($5/$25, 1M context) — latest version, best reasoning
  • Claude Sonnet 4.6 ($3/$15, 1M context) — 80% cheaper, sufficient for most tasks

Migrating from Claude Sonnet 4

Current: Claude Sonnet 4 — $3/$15 per 1M tokens, 200K context

Replace with:

  • Claude Sonnet 4.6 ($3/$15, 1M context) — same price, better quality, 5x context
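The savings percentages in this guide follow directly from the listed prices. A small sketch that recomputes them, using display names rather than API model IDs (check Anthropic's documentation for the exact identifiers to put in your requests; the dictionaries and function names here are ours):

```python
# (input, output) prices in USD per 1M tokens, copied from the migration guide above.
RETIRING = {
    "Claude 4 Opus": (15.00, 75.00),
    "Claude Sonnet 4": (3.00, 15.00),
}

REPLACEMENTS = {
    "Claude 4 Opus": {
        "Claude Opus 4.7": (5.00, 25.00),
        "Claude Opus 4.8": (5.00, 25.00),
        "Claude Sonnet 4.6": (3.00, 15.00),
    },
    "Claude Sonnet 4": {
        "Claude Sonnet 4.6": (3.00, 15.00),
    },
}


def savings_pct(old_price: float, new_price: float) -> int:
    """Percentage saved by moving from old_price to new_price, rounded."""
    return round((1 - new_price / old_price) * 100)


def migration_report(old_model: str) -> dict:
    """Map each replacement to (input savings %, output savings %)."""
    old_in, old_out = RETIRING[old_model]
    return {
        new: (savings_pct(old_in, new_in), savings_pct(old_out, new_out))
        for new, (new_in, new_out) in REPLACEMENTS[old_model].items()
    }


for model in RETIRING:
    for new, (in_pct, out_pct) in migration_report(model).items():
        print(f"{model} -> {new}: input {in_pct}% cheaper, output {out_pct}% cheaper")
```

Because the input and output prices fall by the same ratio in each case, the savings hold regardless of your input/output mix.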

Provider Comparison at a Glance

| Provider | Models | Cheapest (input) | Most Expensive (output) | Best For |
| --- | --- | --- | --- | --- |
| OpenAI | 9 | $0.08/M | $180/M | Widest range, agents |
| Anthropic | 6 | $1.00/M | $75/M | Code, reasoning |
| Google | 4 | $0.075/M | $12/M | Budget, long context |
| DeepSeek | 3 | $0.14/M | $1.10/M | Budget all-around |
| Meta (Together.ai) | 4 | $0.10/M | $0.88/M | Open source, 10M context |
| Mistral | 2 | $0.15/M | $1.50/M | European compliance |
| Cohere | 2 | $0.50/M | $10/M | RAG, enterprise search |
| Moonshot | 1 | $0.95/M | $4.00/M | Long context (256K) |
| xAI | 2 | $0.30/M | $2.50/M | Real-time data |
| AI21 | 1 | $2.00/M | $8/M | Long context (256K) |
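The per-provider figures can be derived from the model tables. The sketch below assumes "Cheapest" means the lowest input price and "Most Expensive" means the highest output price across a provider's models, and uses three providers' rows as sample data; `provider_summary` is an illustrative helper.

```python
from collections import defaultdict

# (model, provider, input price, output price) in USD per 1M tokens,
# copied from the pricing tables above.
ROWS = [
    ("DeepSeek V4 Flash", "DeepSeek", 0.14, 0.28),
    ("DeepSeek V3", "DeepSeek", 0.27, 1.10),
    ("DeepSeek V4 Pro", "DeepSeek", 0.44, 0.87),
    ("Mistral Small 4", "Mistral", 0.15, 0.60),
    ("Mistral Large 3", "Mistral", 0.50, 1.50),
    ("Command R", "Cohere", 0.50, 1.50),
    ("Command R+", "Cohere", 2.50, 10.00),
]


def provider_summary(rows):
    """Group rows by provider and report model count and price extremes."""
    grouped = defaultdict(list)
    for _name, provider, input_price, output_price in rows:
        grouped[provider].append((input_price, output_price))
    return {
        provider: {
            "models": len(prices),
            "cheapest": min(i for i, _ in prices),
            "most_expensive": max(o for _, o in prices),
        }
        for provider, prices in grouped.items()
    }


for provider, stats in provider_summary(ROWS).items():
    print(provider, stats)
```

Keep in mind that the two columns mix input and output prices, so they show the span of a provider's price list rather than the cost of any single request.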

What to Watch in July 2026

Methodology

All pricing data comes from official provider pricing pages, verified as of May 29, 2026. We track 34 models across 10 providers: OpenAI, Anthropic, Google, DeepSeek, Mistral, Cohere, Meta (via Together.ai), Moonshot, xAI, and AI21.

Prices are per 1 million tokens unless otherwise noted. Context window sizes reflect the maximum supported. Some providers offer batch pricing or committed-use discounts not reflected here.
