What is the cheapest AI API in May 2026?

The cheapest AI API in May 2026 is Gemini 2.5 Flash-Lite at $0.075/$0.30 per 1M tokens (input/output). DeepSeek V4 Flash ($0.14/$0.28) and GPT-4o mini ($0.15/$0.60) are also budget-friendly. For context-aware tasks, Gemini 2.5 Flash-Lite ($0.10/$0.40) with a 1M context window offers the best value.

How many AI API models are available in 2026?

As of July 2026, there are 67 major AI API models available across 10 providers: OpenAI (9 models), Anthropic (5), Google (7), DeepSeek (3), Mistral (3), Cohere (2), Meta/Together.ai (4), Moonshot (1), xAI (2), and AI21 (1). Prices range from $0.075 per 1M input tokens to $180 per 1M output tokens.

Are AI API prices going down in 2026?

Yes, AI API prices have dropped dramatically. Since 2023, prices have fallen by an average of 90%. Budget models that cost $0.15/M in 2024 now cost $0.075/M. Premium models have also gotten cheaper — GPT-4 class models went from $30/$60 to $2.50/$10 per 1M tokens. The trend continues in 2026 with new budget models undercutting existing prices.

Which AI provider has the most models?

OpenAI offers the most models with 9 options ranging from $0.08/M (GPT-oss 20B) to $30/$180/M (GPT-5.5 Pro). Meta/Together.ai has 4 Llama models, Anthropic has 5 Claude models, and Google has 4 Gemini models. The open-source ecosystem via Together.ai gives developers the most model variety.

AI API Pricing Report: May 2026 — Every Model, Every Provider

AI API pricing in 2026 looks nothing like it did a year ago. Budget models now cost less than $0.10 per million tokens. Premium models have halved in price. And the number of available models has exploded to 34 across 10 providers.

This report covers every major AI API model's current pricing, the trends driving prices down, the best deals in each tier, and what to watch for in the months ahead.

Updated Jul 9, 2026: xAI rebranded Grok 3 → Grok 4.3 ($1.25/$2.50, down from $30/$150) and Grok 3 Mini → Grok Build 0.1 ($1.00/$2.00). Pricing tables below reflect May 2026 data. See xAI pricing for current data.

34 models available · 10 providers · $0.075 cheapest price per 1M tokens · 90% average price drop since 2023

The Complete Pricing Landscape

Here's every major AI API model ranked by input price. All prices are per 1 million tokens.

Budget Tier (Under $0.60/1M input)

These models handle most everyday tasks — chatbots, classification, summarization, content generation — at rock-bottom prices.

Model	Provider	Input / 1M	Output / 1M	Context
Gemini 2.5 Flash-Lite	Google	$0.075	$0.30	1M
GPT-oss 20B	OpenAI	$0.08	$0.35	128K
Llama 3.1 8B	Meta (Together.ai)	$0.10	$0.10	128K
Gemini 2.5 Flash-Lite	Google	$0.10	$0.40	1M
Llama 4 Scout	Meta (Together.ai)	$0.11	$0.34	10M
DeepSeek V4 Flash	DeepSeek	$0.14	$0.28	1M
GPT-4o mini	OpenAI	$0.15	$0.60	128K
GPT-oss 120B	OpenAI	$0.15	$0.60	128K
Mistral Small 4	Mistral	$0.15	$0.60	128K
Llama 4 Maverick	Meta (Together.ai)	$0.20	$0.60	10M
GPT-5 mini	OpenAI	$0.25	$2.00	272K
DeepSeek V3	DeepSeek	$0.27	$1.10	128K
DeepSeek V4 Pro	DeepSeek	$0.44	$0.87	1M
Mistral Large 3	Mistral	$0.50	$1.50	128K
Command R	Cohere	$0.50	$1.50	128K

Mid Tier ($0.60–$3.00/1M input)

The sweet spot for production workloads. These models offer strong reasoning quality at reasonable prices.

Model	Provider	Input / 1M	Output / 1M	Context
Llama 3.1 70B	Meta (Together.ai)	$0.88	$0.88	128K
Kimi K2.6	Moonshot	$0.90	$3.75	256K
Claude Haiku 4.5	Anthropic	$1.00	$5.00	200K
Gemini 2.5 Pro	Google	$1.25	$10.00	1M
GPT-5	OpenAI	$1.25	$10.00	272K
GPT-5.3 Codex	OpenAI	$1.75	$14.00	400K
Gemini 3.1 Pro	Google	$2.00	$12.00	1M
Jamba 1.5 Large	AI21	$2.00	$8.00	256K
GPT-4o	OpenAI	$2.50	$10.00	128K
Command R+	Cohere	$2.50	$10.00	128K
Claude Sonnet 4.6	Anthropic	$3.00	$15.00	1M
Claude Sonnet 4.6	Anthropic	$3.00	$15.00	200K
Grok 3 Mini	xAI	$3.00	$5.00	128K

Premium Tier ($5.00+/1M input)

For complex reasoning, code generation, and high-stakes tasks where quality matters most.

Model	Provider	Input / 1M	Output / 1M	Context
Claude Opus 4.7	Anthropic	$5.00	$25.00	1M
GPT-5.5	OpenAI	$5.00	$30.00	1M
Claude 4 Opus	Anthropic	$15.00	$75.00	200K
Grok 3	xAI	$30.00	$150.00	128K
GPT-5.5 Pro	OpenAI	$30.00	$180.00	1M

Key Trends This Month

1. Budget Models Keep Getting Cheaper

The floor keeps dropping. Gemini 2.5 Flash-Lite at $0.075/M is now the cheapest production-ready AI API. That's 7.5 cents per million input tokens — less than a penny for 133,000 tokens of text. A year ago, the cheapest comparable model was $0.15/M.

What this means: if you're building a chatbot, classifier, or content tool that processes high volumes, your costs have halved in 12 months without changing anything.
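The arithmetic behind these figures is simple enough to check yourself. The sketch below is illustrative only: `token_cost_usd` and `percent_drop` are hypothetical helper names, and the prices are the ones quoted in this section.

```python
def token_cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost in USD of `tokens` tokens at a given per-1M-token price."""
    return tokens / 1_000_000 * price_per_million


def percent_drop(old_price: float, new_price: float) -> float:
    """Percentage decrease going from old_price to new_price."""
    return (old_price - new_price) / old_price * 100


if __name__ == "__main__":
    # Prices quoted above: $0.075/M input today, $0.15/M a year ago.
    cost = token_cost_usd(133_000, 0.075)
    print(f"133,000 input tokens cost ${cost:.6f}")
    print(f"Year-over-year drop: {percent_drop(0.15, 0.075):.0f}%")
```

Swapping in your own monthly token volume for `133_000` gives a quick estimate of input spend at any listed price.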

2. Context Windows Are the New Battleground

While price wars grab headlines, the real shift is in context windows. Many models now offer 1M+ token context, including:

Llama 4 Scout: 10M tokens, the largest context window available
Gemini 2.5 Flash-Lite: 1M, at budget prices
Gemini 2.5 Pro, Gemini 3.1 Pro: 1M
Claude Opus 4.7, Sonnet 4.6: 1M
DeepSeek V4 Pro/Flash: 1M

A 1M context window means you can feed an entire codebase, a full legal document, or hours of conversation history into a single API call. This changes what's possible — and it's available at budget prices.
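A large window is only useful if you can afford to fill it, so it helps to price a single full-context call. The following is a rough sketch, not a billing formula: `MODELS` and `full_context_input_cost` are hypothetical names, the prices and window sizes are copied from the tables in this report, and the calculation ignores output tokens and any long-context surcharge a provider may apply.

```python
# (input price per 1M tokens in USD, context window in tokens),
# copied from the pricing tables in this report.
MODELS = {
    "Gemini 2.5 Flash-Lite": (0.075, 1_000_000),
    "DeepSeek V4 Flash": (0.14, 1_000_000),
    "Gemini 2.5 Pro": (1.25, 1_000_000),
    "Claude Sonnet 4.6": (3.00, 1_000_000),
    "Claude Opus 4.7": (5.00, 1_000_000),
}


def full_context_input_cost(model: str, fill_ratio: float = 1.0) -> float:
    """Input cost in USD of one call that fills `fill_ratio` of the window."""
    price_per_million, context_tokens = MODELS[model]
    return context_tokens * fill_ratio / 1_000_000 * price_per_million


if __name__ == "__main__":
    for name in MODELS:
        print(f"{name}: ${full_context_input_cost(name):.3f} per full-window call")
```

Because the cost scales linearly with the tokens sent, a call that uses a tenth of the window (`fill_ratio=0.1`) costs a tenth as much on input.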

3. The Premium Tier Is Shrinking

Only 5 models cost $5+/M input. And the quality gap between mid-tier and premium is narrowing. Claude Sonnet 4.6 ($3/$15) and Gemini 3.1 Pro ($2/$12) now handle most tasks that required Opus or GPT-5.5 a few months ago.

The exception: complex multi-step reasoning, code generation in large codebases, and high-stakes analysis still benefit from premium models. But for 80% of production workloads, mid-tier is enough.
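One way to act on this split is to route most traffic to a mid-tier model and reserve a premium model for the hard cases. The sketch below estimates the blended cost per request under assumptions that are mine rather than the report's: an 80/20 routing split, 500 input and 500 output tokens per request, and hypothetical function names `request_cost` and `blended_cost`. Prices are the Claude Sonnet 4.6 and Claude Opus 4.7 rows from the tables.

```python
def request_cost(input_price: float, output_price: float,
                 input_tokens: int = 500, output_tokens: int = 500) -> float:
    """Cost in USD of one request, given per-1M-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def blended_cost(mid: tuple, premium: tuple, mid_share: float = 0.8) -> float:
    """Average cost per request when `mid_share` of traffic uses the mid-tier model."""
    return mid_share * request_cost(*mid) + (1 - mid_share) * request_cost(*premium)


if __name__ == "__main__":
    sonnet = (3.00, 15.00)   # Claude Sonnet 4.6 input/output price per 1M tokens
    opus = (5.00, 25.00)     # Claude Opus 4.7 input/output price per 1M tokens
    mixed = blended_cost(sonnet, opus)
    all_premium = request_cost(*opus)
    print(f"Blended: ${mixed:.4f} per request")
    print(f"Premium only: ${all_premium:.4f} per request")
    print(f"Savings: {(1 - mixed / all_premium) * 100:.0f}%")
```

The savings grow as the price gap between the two tiers widens, so the same routing split pays off more when the fallback model is more expensive.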

4. Open Source Is a Legitimate Option

Meta's Llama 4 models on Together.ai offer serious competition:

Llama 4 Scout ($0.18/$0.59): cheapest model with a 10M context window
Llama 4 Maverick ($0.20/$0.60): strong general-purpose model
Llama 3.1 70B ($0.88/$0.88): balanced price/quality with symmetric pricing

For cost-sensitive applications where you control the prompt engineering, open-source models via Together.ai are hard to beat.

Best Deals by Use Case

Use Case	Best Model	Why
Chatbot (high volume)	Gemini 2.5 Flash-Lite	Cheapest at $0.075/M, handles most chat tasks
Chatbot (quality)	Claude Haiku 4.5	$1/M with Anthropic's quality
Code Generation	Claude Sonnet 4.6	Best code quality at $3/M, 1M context
Document Analysis	Gemini 2.5 Pro	1M context window at $1.25/M
Classification	GPT-4o mini	$0.15/M, fast, reliable for structured output
RAG / Retrieval	DeepSeek V4 Flash	$0.14/M with 1M context for long retrieval
Content Writing	GPT-5 mini	$0.25/M input, strong writing at budget price
Complex Reasoning	Claude Opus 4.7	Best reasoning quality, worth the $5/M premium
Agent / Multi-step	GPT-5	$1.25/M, strong tool use, 272K context
Budget Everything	DeepSeek V4 Pro	$0.44/M with 1M context — best all-around budget pick
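Picks like these come down to two filters: does the model have enough context, and which remaining option is cheapest for your input/output mix. The sketch below shows that selection under stated assumptions. `CATALOG`, `Model`, and `cheapest` are hypothetical names, the catalog is a small subset of rows copied from the tables in this report, and quality is not modeled at all.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    input_price: float    # USD per 1M input tokens
    output_price: float   # USD per 1M output tokens
    context_tokens: int


# A subset of rows from the pricing tables in this report.
CATALOG = [
    Model("Gemini 2.5 Flash-Lite", 0.075, 0.30, 1_000_000),
    Model("GPT-oss 20B", 0.08, 0.35, 128_000),
    Model("DeepSeek V4 Flash", 0.14, 0.28, 1_000_000),
    Model("GPT-4o mini", 0.15, 0.60, 128_000),
    Model("GPT-5", 1.25, 10.00, 272_000),
    Model("Claude Sonnet 4.6", 3.00, 15.00, 1_000_000),
]


def cheapest(min_context: int, output_share: float = 0.5) -> Optional[Model]:
    """Cheapest catalog model with at least `min_context` tokens of context.

    `output_share` is the fraction of billed tokens that are output tokens.
    Returns None when no model has a large enough window.
    """
    def blended_price(m: Model) -> float:
        return (1 - output_share) * m.input_price + output_share * m.output_price

    candidates = [m for m in CATALOG if m.context_tokens >= min_context]
    return min(candidates, key=blended_price) if candidates else None


if __name__ == "__main__":
    print(cheapest(200_000).name)
    print(cheapest(200_000, output_share=0.9).name)
```

The second call illustrates why the token mix matters: an output-heavy workload can favor a model with a higher input price but a lower output price.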

Cost Comparison: What $100/Month Gets You

Here's how far $100 goes at different model tiers (assuming 1,000 tokens per request, 50/50 input/output split):

Tier	Model	Requests for $100	Daily Average
Budget	Gemini 2.5 Flash-Lite	~533,000	~17,800/day
Budget	DeepSeek V4 Flash	~476,000	~15,900/day
Budget	GPT-4o mini	~267,000	~8,900/day
Mid	Claude Haiku 4.5	~33,300	~1,110/day
Mid	GPT-5	~17,800	~590/day
Mid	Claude Sonnet 4.6	~11,100	~370/day
Premium	Claude Opus 4.7	~6,700	~220/day
Premium	GPT-5.5	~5,700	~190/day

The range is staggering: from roughly 17,800 requests/day to about 190 requests/day for the same $100 budget. Choosing the right model tier is the single biggest cost lever you have.
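The request counts follow directly from the stated assumptions: 1,000 tokens per request, split evenly between input and output. A minimal sketch of that calculation is below; `requests_per_budget` is a hypothetical helper name, and the prices passed to it are the per-1M-token figures from the tables in this report.

```python
def requests_per_budget(budget_usd: float, input_price: float, output_price: float,
                        tokens_per_request: int = 1_000,
                        input_share: float = 0.5) -> int:
    """Whole requests affordable for `budget_usd`, given per-1M-token prices."""
    input_tokens = tokens_per_request * input_share
    output_tokens = tokens_per_request - input_tokens
    cost_per_request = (input_tokens * input_price
                        + output_tokens * output_price) / 1_000_000
    return int(budget_usd / cost_per_request)


if __name__ == "__main__":
    # (input, output) prices per 1M tokens, from the tables in this report.
    prices = {
        "DeepSeek V4 Flash": (0.14, 0.28),
        "GPT-4o mini": (0.15, 0.60),
        "Claude Opus 4.7": (5.00, 25.00),
    }
    for name, (inp, out) in prices.items():
        total = requests_per_budget(100, inp, out)
        print(f"{name}: {total:,} requests, about {total // 30:,} per day")
```

Real workloads rarely split 50/50, so adjusting `input_share` and `tokens_per_request` to match your own traffic gives a more useful figure than the defaults.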

What to Watch in July 2026

Google I/O aftermath: new Gemini models or pricing changes could shift the budget tier
OpenAI's open-source push: GPT-oss models may see price cuts to compete with Llama 4
Anthropic's response: Claude Haiku 4.5 pricing may drop to match budget competitors
DeepSeek's next move: V4 Pro at $0.44/M is already aggressive; watch for V5 announcements
xAI's Grok 3 pricing: at $30/$150, Grok 3 ties GPT-5.5 Pro for the highest input price; cuts are likely

Update: See our June 2026 AI API Pricing Guide for the latest prices, deprecation alerts, and migration recommendations.

Methodology

All pricing data in this report comes from official provider pricing pages, verified as of Jun 3, 2026. We track 67 models across 10 providers: OpenAI, Anthropic, Google, DeepSeek, Mistral, Cohere, Meta (via Together.ai), Moonshot, xAI, and AI21.

Prices are per 1 million tokens unless otherwise noted. Context window sizes reflect the maximum supported by each model. Some providers offer batch pricing or committed-use discounts not reflected here.

