What is the cheapest AI API in September 2026?

The cheapest AI API in September 2026 is Gemini 2.5 Flash-Lite at $0.075 per 1M input tokens and $0.30 per 1M output tokens. Other budget options include GPT-oss 20B ($0.08/$0.35), Llama 3.1 8B ($0.10/$0.10), and DeepSeek V4 Flash ($0.14/$0.28).

How many AI API models are available in September 2026?

As of September 2026, there are 32 major AI API models available across 10 providers: OpenAI (9 models), Anthropic (4), Google (4), DeepSeek (3), Mistral (2), Cohere (2), Meta/Together.ai (4), Moonshot (1), xAI (2), and AI21 (1). Prices range from $0.075 per 1M input tokens at the low end to $180 per 1M output tokens at the high end.

Are AI API prices expected to change in Q4 2026?

Q4 2026 is likely to see major shifts. OpenAI typically launches new generations in Q4 (GPT-6 rumors), Google may release Gemini 3.0 Flash, and DeepSeek V5 could disrupt the mid-tier. Budget prices have dropped ~50% since mid-2025 and further compression is expected. Lock in current rates with committed-use discounts if your provider offers them.

What is the best AI API for code generation in September 2026?

Claude Sonnet 4.6 ($3/$15 per 1M tokens, 1M context) remains the best code generation model in September 2026. For lower-cost code tasks, GPT-5.3 Codex ($1.75/$14) with 400K context is a strong alternative. For open-source options, Llama 4 Scout (10M context) via Together.ai handles basic code generation.

AI API Pricing September 2026: Complete Guide to All 32 Models

September 2026 brings the third full month of the post-deprecation era. The 32-model market continues to stabilize, Q3 trends are now firmly established, and the big question is what Q4 will bring. OpenAI, Google, and Anthropic historically make their biggest announcements in Q4, and the pricing landscape could shift dramatically.

This guide covers every AI API model's current pricing, the established Q3 landscape, and what to watch as we head into the final quarter of 2026.

32 Models Available

10 Providers

$0.075 Cheapest / 1M tokens

0 Active Deprecations

📊 Market Stability: Three Months Post-Deprecation

Three months after Claude 4 Opus and Claude Sonnet 4 were retired, the market is firmly settled. Claude Opus 4.7/4.8 ($5/$25) and Claude Sonnet 4.6 ($3/$15) are established as the replacements. No major deprecations are expected through Q3. But Q4 is the traditional launch window — start planning your migration strategy now.

Complete Pricing: All 32 Models

Every major AI API model ranked by input price. All prices per 1 million tokens.

Budget Tier — Under $0.60/1M Input

Rock-bottom prices for everyday tasks. Chatbots, classifiers, content tools — start here.

Model	Provider	Input / 1M	Output / 1M	Context
Gemini 2.5 Flash-Lite	Google	$0.075	$0.30	1M
GPT-oss 20B	OpenAI	$0.08	$0.35	128K
Llama 3.1 8B	Meta (Together.ai)	$0.10	$0.10	128K
Gemini 2.5 Flash-Lite	Google	$0.10	$0.40	1M
DeepSeek V4 Flash	DeepSeek	$0.14	$0.28	1M
GPT-4o mini	OpenAI	$0.15	$0.60	128K
GPT-oss 120B	OpenAI	$0.15	$0.60	128K
Mistral Small 4	Mistral	$0.15	$0.60	128K
Llama 4 Scout	Meta (Together.ai)	$0.18	$0.34	10M
Llama 4 Maverick	Meta (Together.ai)	$0.20	$0.60	10M
GPT-5 mini	OpenAI	$0.25	$2.00	272K
Grok Build 0.1	xAI	$0.30	$0.50	256K
DeepSeek V4 Pro	DeepSeek	$0.44	$0.87	1M
Mistral Large 3	Mistral	$0.50	$1.50	128K
Command R	Cohere	$0.50	$1.50	128K

Mid Tier — $0.50–$3.00/1M Input

The sweet spot for production workloads. Strong reasoning at reasonable prices.

Model	Provider	Input / 1M	Output / 1M	Context
Llama 3.1 70B	Meta (Together.ai)	$0.88	$0.88	128K
Kimi K2.6	Moonshot	$0.90	$3.75	256K
Claude Haiku 4.5	Anthropic	$1.00	$5.00	200K
Gemini 2.5 Pro	Google	$1.25	$10.00	1M
GPT-5	OpenAI	$1.25	$10.00	272K
Grok 4.3	xAI	$1.25	$2.50	1M
GPT-5.3 Codex	OpenAI	$1.75	$14.00	400K
Gemini 3.1 Pro	Google	$2.00	$12.00	1M
Jamba 1.5 Large	AI21	$2.00	$8.00	256K
GPT-4o	OpenAI	$2.50	$10.00	128K
Command R+	Cohere	$2.50	$10.00	128K
Claude Sonnet 4.6	Anthropic	$3.00	$15.00	1M

Premium Tier — $5.00+/1M Input

For complex reasoning, code generation, and high-stakes tasks where quality is non-negotiable.

Model	Provider	Input / 1M	Output / 1M	Context
Claude Opus 4.8	Anthropic	$5.00	$25.00	1M
Claude Opus 4.7	Anthropic	$5.00	$25.00	1M
GPT-5.5	OpenAI	$5.00	$30.00	1M
GPT-5.5 Pro	OpenAI	$30.00	$180.00	1M

Q3 2026 Pricing Trends

1. The Market Has Fully Stabilized

Three months after the Claude 4 Opus and Claude Sonnet 4 deprecation, September shows a calm, predictable market. No major model retirements are expected through the rest of Q3. The 32-model landscape is the established order, but Q4 is historically when providers reset the curve.

2. Budget Tier Is Packed

The sub-$0.15/M tier remains crowded with 6 models competing for the budget crown. Gemini 2.5 Flash-Lite ($0.075/M) still holds the cheapest spot. The 133,000-tokens-per-penny threshold hasn't been broken since July — it may take a new generation model to push below $0.05/M.
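
The tokens-per-penny figure is a simple unit conversion, and a short sketch makes it easy to check. The function name `tokens_per_penny` is our own illustration (not part of any provider SDK), and it assumes prices are quoted in USD per 1M tokens, as everywhere in this guide.

```python
def tokens_per_penny(price_per_million):
    """Tokens that one US cent buys at a given USD price per 1M tokens."""
    return 0.01 / price_per_million * 1_000_000


# Gemini 2.5 Flash-Lite input price from the budget table
print(round(tokens_per_penny(0.075)))  # 133333

# The hypothetical $0.05/M level mentioned above
print(round(tokens_per_penny(0.05)))   # 200000
```

At $0.075/M a penny buys roughly 133,000 input tokens, so a model priced at $0.05/M would lift that to 200,000.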

3. Mid-Tier Continues to Deliver

The $1–3/M tier remains the sweet spot for production workloads. Claude Haiku 4.5 ($1/M) continues to handle 90% of tasks that used to require premium models. Claude Sonnet 4.6 ($3/M) with 1M context remains the top code generation model. Going premium ($5+/M) is only justified for the most complex reasoning tasks.

4. Open Source Steady, Awaiting Llama 5

Meta's Llama 4 Scout (10M context) still holds the longest-context crown. No new Llama releases arrived in Q3, and all eyes are on Q4/early 2027 for Llama 5. Together.ai's managed hosting continues to make open-source models zero-ops.

5. xAI: The Outlier

Grok 4.3 at $1.25/$2.50 per 1M tokens is now competitive after xAI's rebrand and repricing. Grok Build 0.1 ($0.30/$0.50) joins the budget tier. The old Grok 3 ($30/$150) has been retired.

Best Deals by Use Case

Use Case	Best Model	Why
High-volume chatbot	Gemini 2.5 Flash-Lite	Cheapest at $0.075/M, handles most chat tasks
Quality chatbot	Claude Haiku 4.5	$1/M with Anthropic's quality
Code generation	Claude Sonnet 4.6	Best code quality at $3/M, 1M context
Document analysis	Gemini 2.5 Pro	1M context at $1.25/M
Classification / extraction	GPT-4o mini	$0.15/M, fast, reliable structured output
RAG / retrieval	DeepSeek V4 Flash	$0.14/M with 1M context
Content writing	GPT-5 mini	$0.25/M, strong writing at budget price
Complex reasoning	Claude Opus 4.7	Best reasoning quality at $5/M
Agent / multi-step	GPT-5	$1.25/M, strong tool use, 272K context
Long documents (10M+)	Llama 4 Scout	Only model with 10M context at $0.18/M
Budget all-around	DeepSeek V4 Pro	$0.44/M with 1M context — best value pick
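
If you want to apply the same price-and-context reasoning to your own shortlist, the selection can be sketched as a small filter. This is an illustration under assumptions, not a recommendation engine: the `Model` class and `cheapest` function are hypothetical names, only a handful of rows from the tables above are included, and the ranking looks at price alone and ignores quality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens
    context: int         # maximum context window in tokens


# A few rows copied from the pricing tables above
MODELS = [
    Model("Gemini 2.5 Flash-Lite", 0.075, 0.30, 1_000_000),
    Model("DeepSeek V4 Flash", 0.14, 0.28, 1_000_000),
    Model("GPT-4o mini", 0.15, 0.60, 128_000),
    Model("Claude Haiku 4.5", 1.00, 5.00, 200_000),
    Model("GPT-5", 1.25, 10.00, 272_000),
]


def cheapest(models, min_context=0, input_share=0.5):
    """Return the model with the lowest blended price that fits min_context.

    input_share is the fraction of tokens that are input (0.5 = 50/50 mix).
    Returns None when no model has a large enough context window.
    """
    candidates = [m for m in models if m.context >= min_context]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda m: m.input_price * input_share
        + m.output_price * (1 - input_share),
    )


print(cheapest(MODELS, min_context=500_000).name)
# Gemini 2.5 Flash-Lite
print(cheapest(MODELS, min_context=500_000, input_share=0.1).name)
# DeepSeek V4 Flash
```

The second call shows why the input/output mix matters: for output-heavy workloads, the model with the lowest input price is not necessarily the cheapest overall.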

What $100/Month Gets You in September 2026

Assuming 1,000 tokens per request with a 50/50 input/output split:

Tier	Model	Requests for $100	Daily Average
Budget	Gemini 2.5 Flash-Lite	~533,000	~17,800/day
Budget	DeepSeek V4 Flash	~476,000	~15,900/day
Budget	Llama 4 Scout	~385,000	~12,800/day
Mid	Claude Haiku 4.5	~33,300	~1,110/day
Mid	GPT-5	~17,800	~590/day
Mid	Claude Sonnet 4.6	~11,100	~370/day
Premium	Claude Opus 4.7	~6,700	~220/day
Premium	GPT-5.5	~5,700	~190/day

The range: ~17,800 requests/day to ~190 requests/day for the same $100 budget. Model selection is the single biggest cost lever you have.
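
The arithmetic behind this kind of estimate can be sketched in a few lines of Python. This is a minimal sketch, not the site's calculator: the function name `requests_per_budget` and its parameters are our own illustration, and the daily figure assumes a 30-day month.

```python
def requests_per_budget(input_price, output_price, budget=100.0,
                        tokens_per_request=1000, input_share=0.5):
    """Estimate how many requests a budget covers.

    Prices are in USD per 1M tokens. input_share is the fraction of each
    request's tokens that are input (0.5 matches the 50/50 split above).
    """
    input_tokens = tokens_per_request * input_share
    output_tokens = tokens_per_request - input_tokens
    cost_per_request = (input_tokens * input_price
                        + output_tokens * output_price) / 1_000_000
    return budget / cost_per_request


# DeepSeek V4 Flash: $0.14 input / $0.28 output per 1M tokens
monthly = requests_per_budget(0.14, 0.28)
print(f"{monthly:,.0f} requests/month, {monthly / 30:,.0f} per day")
# 476,190 requests/month, 15,873 per day
```

Change `input_share` to match your own traffic. Because output tokens usually cost several times more than input tokens, an output-heavy workload will get noticeably fewer requests from the same budget.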

Provider Comparison at a Glance

Provider	Models	Cheapest	Most Expensive	Best For
OpenAI	9	$0.08/M	$180/M	Widest range, agents
Anthropic	4	$1.00/M	$25/M	Code, reasoning
Google	4	$0.075/M	$12/M	Budget, long context
DeepSeek	3	$0.14/M	$1.10/M	Budget all-around
Meta (Together.ai)	4	$0.10/M	$0.88/M	Open source, 10M context
Mistral	2	$0.15/M	$1.50/M	European compliance
Cohere	2	$0.50/M	$10/M	RAG, enterprise search
Moonshot	1	$0.90/M	$3.75/M	Long context (256K)
xAI	2	$0.30/M	$2.50/M	Real-time data
AI21	1	$2.00/M	$8/M	Long context (256K)

What to Watch in Q4 2026

⚠️ Q4 Is the Big One

Historically, Q4 is when OpenAI, Google, and Anthropic make their biggest announcements. If you're planning a major AI integration, consider locking in current pricing with committed-use discounts before potential disruption.

OpenAI GPT-6: The traditional Q4 launch window. A new generation could reset the entire pricing curve. If GPT-6 arrives, expect GPT-5 prices to drop.
Google Gemini 3.0 Flash: Gemini 2.5 Flash-Lite has held the budget crown for months. A successor could push below $0.05/M and break the 133K-tokens-per-penny threshold.
DeepSeek V5: V4 Pro at $0.44/M is already aggressive. V5 could disrupt the mid-tier and pressure Anthropic/OpenAI to respond.
xAI pricing rebrand: Grok 4.3 at $1.25/$2.50 is now competitive; Grok Build 0.1 at $0.30/$0.50 joins the budget tier.
Meta Llama 5: Llama 4 Scout (10M context) was a breakthrough. Llama 5 could push context further or improve quality to compete with mid-tier models.
Anthropic Claude 5: The Claude 4 family is mature. A new generation could arrive in late 2026, potentially reshuffling the premium tier.

💡 Cost Optimization Tip

If you're still using GPT-4o ($2.50/M input) for general tasks, switching to Claude Haiku 4.5 ($1/M input) saves 60% on input tokens and 50% on output tokens ($10/M vs. $5/M) with comparable quality. For classification/extraction, GPT-4o mini ($0.15/M) is about 17x cheaper than GPT-4o on both input and output. Use our Cost Optimizer to find savings in your current setup.
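
The savings figures in this tip come from list prices, and your real savings depend on your input/output mix. A rough sketch, using hypothetical helper names (`savings_pct`, `blended_price`) and the prices from the tables above:

```python
def savings_pct(old_price, new_price):
    """Percent saved when moving from old_price to new_price (same unit)."""
    return (old_price - new_price) / old_price * 100


def blended_price(input_price, output_price, input_share=0.5):
    """Average USD price per 1M tokens for a given input/output mix."""
    return input_price * input_share + output_price * (1 - input_share)


# (input, output) prices per 1M tokens from the tables above
gpt_4o = (2.50, 10.00)
haiku_45 = (1.00, 5.00)
gpt_4o_mini = (0.15, 0.60)

# Input tokens only
print(f"{savings_pct(gpt_4o[0], haiku_45[0]):.1f}%")  # 60.0%

# 50/50 input/output mix
print(f"{savings_pct(blended_price(*gpt_4o), blended_price(*haiku_45)):.1f}%")  # 52.0%

# Price ratio between GPT-4o and GPT-4o mini on input tokens
print(f"{gpt_4o[0] / gpt_4o_mini[0]:.1f}x")  # 16.7x
```

With a 50/50 mix, the GPT-4o to Claude Haiku 4.5 switch saves about 52% rather than 60%, because the output price drops by a smaller percentage than the input price.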

Methodology

All pricing data comes from official provider pricing pages, verified as of Jun 3, 2026. We track 32 active models across 10 providers: OpenAI, Anthropic, Google, DeepSeek, Mistral, Cohere, Meta (via Together.ai), Moonshot, xAI, and AI21.

Prices are per 1 million tokens unless otherwise noted. Context window sizes reflect the maximum supported. Some providers offer batch pricing or committed-use discounts not reflected here.

Related Tools

AI API Cost Calculator — estimate costs for any model
Cost Optimizer — find savings in your current setup
Cost Explorer — see all models ranked by cost
Model Compare — side-by-side model comparison
Pricing Index — complete sortable pricing database
Cheapest AI API Finder — find the lowest-cost option
AI API Pricing August 2026 — previous month's guide
State of LLM Pricing Report — interactive June report