LLM Pricing Trends 2026

How AI API costs are changing — price drops, new models, deprecations, and what it means for your budget.

-75%
DeepSeek V4 Pro price drop
-96%
Grok 3 → 4.3 rebrand savings
$0.075
Cheapest model (Gemini Flash Lite)
34
Models tracked across 10 providers
Budget Model Pricing — Input Cost per 1M Tokens (June 2026)
Gemini Flash Lite
$0.08
$0.075
GPT-oss 20B
$0.08
$0.08
Llama 3.1 8B
$0.10
$0.10
Gemini Flash
$0.10
$0.10
DeepSeek V4 Flash
$0.14
$0.14
GPT-4o mini
$0.15
$0.15
Mistral Small 4
$0.15
$0.15
GPT-oss 120B
$0.15
$0.15

2026 Timeline

June 15, 2026
Claude 4 Opus & Sonnet Deprecated
Anthropic retiring Claude 4 Opus ($15/$75) and Claude Sonnet 4 ($3/$15). Replacements: Opus 4.8 ($5/$25) and Sonnet 4.6 ($3/$15). Up to 67% savings by migrating.
Deprecation — June 15
June 2026
xAI Grok 3 → Grok 4.3 (96% Price Drop)
Grok 3 rebranded to Grok 4.3 at $1.25/$2.50 — down from $30/$150. Grok 3 Mini → Grok Build 0.1 at $0.30/$0.50. Context expanded to 1M tokens.
Rebrand + Major Drop
June 2026
DeepSeek V4 Pro — 75% Price Cut
DeepSeek dropped V4 Pro from $1.75/$7.00 to $0.44/$0.87 per 1M tokens. Now cheaper than GPT-4o mini for many workloads.
Price Drop
May 2026
Claude Opus 4.8 & Sonnet 4.6 Launched
Anthropic released Opus 4.8 ($5/$25, 1M context) and Sonnet 4.6 ($3/$15, 1M context). Major context window upgrade from 200K to 1M tokens.
New Model
May 2026
GPT-5, GPT-5 mini, GPT-oss Launched
OpenAI launched GPT-5 ($1.25/$10), GPT-5 mini ($0.25/$2), GPT-oss 120B ($0.15/$0.60), and GPT-oss 20B ($0.08/$0.35). Budget tier significantly expanded.
New Models
May 2026
Gemini 3.1 Pro Launched
Google released Gemini 3.1 Pro at $2.00/$12.00 with 1M context. Replaces Gemini 3 Pro at the same price point.
New Model
May 2026
Mistral Large 3 & Small 4 Launched
Mistral released Large 3 ($0.50/$1.50) and Small 4 ($0.15/$0.60). Major price drops from previous Mistral models. 262K context for Large.
New Models + Price Drop
May 2026
Llama 4 Scout & Maverick Launched
Meta released Llama 4 Scout ($0.18/$0.59) and Maverick ($0.27/$0.85) via Together.ai. 1M context window. Open-source alternatives at budget pricing.
New Models
June 2026
Kimi K2.6 Launched
Moonshot released Kimi K2.6 at $0.95/$4.00 with 256K context. New entrant in the mid-tier segment.
New Model

Key Trends

Budget Tier Explosion

The sub-$0.50/input tier has grown from 3 models in January to 12+ models in June. DeepSeek, OpenAI (GPT-oss), Meta (Llama 4), and Google (Flash Lite) are all competing for the cheapest model slot. Developers can now run production workloads for under $10/month.

1M Context Is Standard

Context windows have expanded dramatically. Gemini, Claude Opus 4.8, Sonnet 4.6, Grok 4.3, and Llama 4 all offer 1M+ token context. This enables full codebase analysis, long document processing, and multi-turn conversations without chunking.

Multi-Model Routing Saves 50-70%

Using one expensive model for everything is wasteful. Route simple tasks (chat, classification) to budget models and reserve premium models for complex reasoning. The AI Stack Cost Optimizer shows you exactly how much you can save.

Deprecation Speed Is Increasing

Models are being deprecated faster than ever. Claude 4 Opus and Sonnet 4 are retiring June 15 — just months after launch. Set up deprecation alerts to avoid breaking changes in production.

Notable Price Changes in 2026

Model Provider Previous Price (Input/Output) Current Price Change
Grok 4.3 xAI $30.00 / $150.00 (as Grok 3) $1.25 / $2.50 -96%
DeepSeek V4 Pro DeepSeek $1.75 / $7.00 $0.44 / $0.87 -75%
Claude Opus 4.8 Anthropic $15.00 / $75.00 (as Claude 4 Opus) $5.00 / $25.00 -67%
Mistral Large 3 Mistral $2.00 / $6.00 (as Large 2) $0.50 / $1.50 -75%
GPT-5 mini OpenAI New model $0.25 / $2.00 Launch
Gemini Flash Lite Google New model $0.075 / $0.30 Launch
Llama 4 Scout Meta New model $0.18 / $0.59 Launch

Track Pricing Changes in Real Time

Get notified when providers change their pricing. Never get surprised by a price hike or miss a deprecation deadline.

Related Reading