AI API Cost per Request: Quick Reference Table
How much does a single API call actually cost? We calculated it for all 32 models across 10 providers at four common request sizes. Bookmark this page.
Assumption: Each request sends 3x more input tokens than output tokens (typical for chat, RAG, and code assistant workloads). Costs are per single request. All prices verified May 2026.
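The per-request figures below can be reproduced from a model's per-million-token prices. A minimal sketch in Python, using hypothetical prices of $0.10 per million tokens for both input and output (the prices are illustrative, not any specific model's):

```python
def cost_per_request(input_price_per_m, output_price_per_m,
                     total_tokens, input_output_ratio=3.0):
    """Blended cost of one request in dollars.

    Splits total_tokens into input/output using the given ratio
    (3:1 by default, matching this table's assumption), then prices
    each side at its per-million-token rate.
    """
    input_tokens = total_tokens * input_output_ratio / (input_output_ratio + 1)
    output_tokens = total_tokens / (input_output_ratio + 1)
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Illustrative: $0.10/M input, $0.10/M output, 1K-token request
print(f"${cost_per_request(0.10, 0.10, 1000):.6f}")  # → $0.000100
```

At 3:1, a 1,000-token request is 750 input tokens and 250 output tokens; every cost in the table is this weighted sum.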
All 32 Models — Cost per Request
Sorted cheapest to most expensive. At 1K tokens, costs range from $0.000100 (Llama 3.1 8B) to $0.067500 (GPT-5.5 Pro) — a 675x gap.
| Model | Tier | Provider | 100 tok | 500 tok | 1K tok | 5K tok |
|---|---|---|---|---|---|---|
| Llama 3.1 8B | Budget | Meta (Together.ai) | $0.000010 | $0.000050 | $0.000100 | $0.000500 |
| GPT-oss 20B | Budget | OpenAI | $0.000015 | $0.000074 | $0.000148 | $0.000737 |
| Llama 4 Scout | Budget | Meta (Together.ai) | $0.000017 | $0.000084 | $0.000168 | $0.000838 |
| Gemini 2.0 Flash | Budget | Google | $0.000017 | $0.000087 | $0.000175 | $0.000875 |
| DeepSeek V4 Flash | Budget | DeepSeek | $0.000018 | $0.000087 | $0.000175 | $0.000875 |
| Mistral Small 4 | Budget | Mistral | $0.000026 | $0.000131 | $0.000262 | $0.001313 |
| GPT-4o mini | Budget | OpenAI | $0.000026 | $0.000131 | $0.000262 | $0.001313 |
| GPT-oss 120B | Budget | OpenAI | $0.000026 | $0.000131 | $0.000262 | $0.001313 |
| Llama 4 Maverick | Budget | Meta (Together.ai) | $0.000030 | $0.000150 | $0.000300 | $0.001500 |
| DeepSeek V3 | Budget | DeepSeek | $0.000048 | $0.000239 | $0.000478 | $0.002387 |
| DeepSeek V4 Pro | Budget | DeepSeek | $0.000055 | $0.000274 | $0.000548 | $0.002737 |
| GPT-5 mini | Budget | OpenAI | $0.000070 | $0.000350 | $0.000700 | $0.003500 |
| Command R | Budget | Cohere | $0.000075 | $0.000375 | $0.000750 | $0.003750 |
| Mistral Large 3 | Budget | Mistral | $0.000075 | $0.000375 | $0.000750 | $0.003750 |
| Llama 3.1 70B | Mid | Meta (Together.ai) | $0.000088 | $0.000440 | $0.000880 | $0.004400 |
| Claude Haiku 4.5 | Budget | Anthropic | $0.000160 | $0.000800 | $0.001600 | $0.008000 |
| Kimi K2.6 | Budget | Moonshot | $0.000161 | $0.000806 | $0.001613 | $0.008063 |
| Gemini 2.5 Pro | Mid | Google | $0.000344 | $0.001719 | $0.003438 | $0.017188 |
| Grok 3 Mini | Mid | xAI | $0.000350 | $0.001750 | $0.003500 | $0.017500 |
| Jamba 1.5 Large | Mid | AI21 | $0.000350 | $0.001750 | $0.003500 | $0.017500 |
| Command R+ | Mid | Cohere | $0.000438 | $0.002188 | $0.004375 | $0.021875 |
| GPT-4o | Mid | OpenAI | $0.000438 | $0.002188 | $0.004375 | $0.021875 |
| Gemini 3.1 Pro | Mid | Google | $0.000450 | $0.002250 | $0.004500 | $0.022500 |
| GPT-5.3 Codex | Mid | OpenAI | $0.000481 | $0.002406 | $0.004812 | $0.024063 |
| Claude Sonnet 4 | Mid | Anthropic | $0.000600 | $0.003000 | $0.006000 | $0.030000 |
| Claude Sonnet 4.6 | Mid | Anthropic | $0.000600 | $0.003000 | $0.006000 | $0.030000 |
| Claude Opus 4.7 | Premium | Anthropic | $0.001000 | $0.005000 | $0.010000 | $0.050000 |
| GPT-5.5 | Premium | OpenAI | $0.001125 | $0.005625 | $0.011250 | $0.056250 |
| GPT-5 | Premium | OpenAI | $0.001500 | $0.007500 | $0.015000 | $0.075000 |
| Claude 4 Opus | Premium | Anthropic | $0.003000 | $0.015000 | $0.030000 | $0.150000 |
| Grok 3 | Premium | xAI | $0.006000 | $0.030000 | $0.060000 | $0.300000 |
| GPT-5.5 Pro | Premium | OpenAI | $0.006750 | $0.033750 | $0.067500 | $0.337500 |
Key Takeaways
The 675x Gap
The cheapest model (Llama 3.1 8B at $0.000100/request) costs 675x less than the most expensive (GPT-5.5 Pro at $0.067500/request) for a 1K-token request. The gap holds at 675x at 5K tokens too, since per-request cost scales linearly with token count.
- For high-volume chatbots: Llama 3.1 8B, GPT-oss 20B, or DeepSeek V4 Flash — all under $0.0002 per 1K-token request
- For production code assistants: GPT-4o or Claude Sonnet 4 — $0.004–$0.006 per request with strong reasoning
- For complex research: Claude 4 Opus or GPT-5 — $0.015–$0.03 per request, but best-in-class quality
- Best value in mid-tier: Llama 3.1 70B at $0.00088/request — 5x cheaper than GPT-4o with comparable quality
- Hidden winner: DeepSeek V4 Pro at $0.00055/request — mid-tier quality at budget prices (75% discount through May 2026)
Calculate your exact monthly costs across all 32 models
Open the Calculator — Free
How to Use This Table
These costs assume a 3:1 input-to-output token ratio. Your actual costs depend on your specific workload:
- Chatbots: Usually 2:1 to 4:1 ratio — this table is accurate
- Code generation: Often 1:3 or more output-heavy — since output tokens typically cost more than input tokens, expect higher per-request costs than shown
- Document analysis: Often 10:1 or higher — input-heavy, so actual per-request costs are lower than shown
- RAG pipelines: Usually 5:1 — input-heavy with retrieved context
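To see how the ratio moves the blended cost, here is a sketch re-pricing a 1K-token request at each of the ratios above. The $2.50/$10.00 per-million input/output prices are an assumption chosen to be consistent with the GPT-4o row in the table:

```python
def blended_cost(input_price_per_m, output_price_per_m, total_tokens, ratio):
    """Cost of one request, where ratio = input tokens per output token."""
    input_tokens = total_tokens * ratio / (ratio + 1)
    output_tokens = total_tokens / (ratio + 1)
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Assumed prices: $2.50/M input, $10.00/M output (matches the GPT-4o row at 3:1)
for label, ratio in [("chat (3:1)", 3), ("code gen (1:3)", 1 / 3),
                     ("doc analysis (10:1)", 10), ("RAG (5:1)", 5)]:
    print(f"{label:20s} ${blended_cost(2.50, 10.00, 1000, ratio):.6f}")
```

With these assumed prices, the same 1K-token request costs $0.004375 at 3:1 but $0.008125 at 1:3 — nearly double — while a 10:1 document-analysis workload drops to about $0.003182. The ratio matters almost as much as the model tier.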
For exact calculations with your token ratios, use our interactive calculator or token estimator.