Cheapest AI API in June 2026: All 34 Models Ranked by Cost
Looking for the cheapest AI API? We track pricing across all major providers and update monthly. Here's the definitive ranking of every AI model by cost, with real-world scenarios so you can pick the right model for your budget.
TL;DR: The cheapest AI API in June 2026 is Gemini 2.0 Flash Lite at $0.075/$0.30 per 1M tokens. For quality-sensitive tasks, DeepSeek V4 Pro ($0.44/$0.87) offers the best price-to-performance ratio.
All 34 AI Models Ranked by Input Cost
Try It Live — Instant Cost Calculator
See exactly what this model costs for your workload. No signup needed.
Prices are per 1M tokens. All data verified May 29, 2026. Full pricing table →
| # | Model | Provider | Input | Output | Context |
|---|---|---|---|---|---|
| 1 | Gemini 2.0 Flash Lite | $0.075 | $0.30 | 1M | |
| 2 | GPT-oss 20B | OpenAI | $0.08 | $0.35 | 128K |
| 3 | Gemini 2.0 Flash | $0.10 | $0.40 | 1M | |
| 4 | Llama 3.1 8B | Meta (Together.ai) | $0.10 | $0.10 | 128K |
| 5 | Llama 4 Scout | Meta (Together.ai) | $0.11 | $0.34 | 10M |
| 6 | DeepSeek V4 Flash | DeepSeek | $0.14 | $0.28 | 1M |
| 7 | GPT-4o mini | OpenAI | $0.15 | $0.60 | 128K |
| 8 | GPT-oss 120B | OpenAI | $0.15 | $0.60 | 128K |
| 9 | Mistral Small 4 | Mistral | $0.15 | $0.60 | 128K |
| 10 | Llama 4 Maverick | Meta (Together.ai) | $0.20 | $0.60 | 10M |
| 11 | GPT-5 mini | OpenAI | $0.25 | $2.00 | 272K |
| 12 | DeepSeek V3 | DeepSeek | $0.27 | $1.10 | 128K |
| 13 | Mistral Large 3 | Mistral | $0.50 | $1.50 | 128K |
| 14 | Command R | Cohere | $0.50 | $1.50 | 128K |
| 15 | DeepSeek V4 Pro | DeepSeek | $0.44 | $0.87 | 1M |
| 16 | Kimi K2.6 | Moonshot | $0.90 | $3.75 | 256K |
| 17 | Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | 200K |
| 18 | Gemini 2.5 Pro | $1.25 | $10.00 | 1M | |
| 19 | GPT-5 | OpenAI | $1.25 | $10.00 | 272K |
| 20 | GPT-5.3 Codex | OpenAI | $1.75 | $14.00 | 400K |
| 21 | Gemini 3.1 Pro | $2.00 | $12.00 | 1M | |
| 22 | Jamba 1.5 Large | AI21 | $2.00 | $8.00 | 256K |
| 23 | GPT-4o | OpenAI | $2.50 | $10.00 | 128K |
| 24 | Command R+ | Cohere | $2.50 | $10.00 | 128K |
| 25 | Claude Sonnet 4 | Anthropic | $3.00 | $15.00 | 200K |
| 26 | Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | 1M |
| 27 | Grok Build 0.1 | xAI | $0.30 | $0.50 | 256K |
| 28 | GPT-5.5 | OpenAI | $5.00 | $30.00 | 1M |
| 29 | Claude Opus 4.7 | Anthropic | $5.00 | $25.00 | 1M |
| 30 | Claude Opus 4.8 | Anthropic | $5.00 | $25.00 | 1M |
| 31 | Llama 3.1 70B | Meta (Together.ai) | $0.88 | $0.88 | 128K |
| 32 | Claude 4 Opus | Anthropic | $15.00 | $75.00 | 200K |
| 33 | Grok 4.3 | xAI | $1.25 | $2.50 | 1M |
| 34 | GPT-5.5 Pro | OpenAI | $30.00 | $180.00 | 1M |
Real Cost Scenarios
Let's see what these prices actually mean for real workloads. We'll model three common use cases.
Scenario 1: Simple Chatbot (10K requests/day)
A customer support chatbot with ~500 input tokens and ~300 output tokens per request.
Monthly Cost Comparison
Scenario 2: Code Assistant (5K requests/day)
A coding assistant with ~2,000 input tokens and ~4,000 output tokens per request (longer outputs).
Monthly Cost Comparison
Scenario 3: RAG Pipeline (1K requests/day)
A RAG system with ~4,000 input tokens (context + query) and ~1,000 output tokens.
Monthly Cost Comparison
Best Budget Models by Use Case
Best for Simple Tasks (Classification, Extraction, Formatting)
Gemini 2.0 Flash Lite ($0.075/$0.30) or Llama 3.1 8B ($0.10/$0.10). These models handle straightforward tasks with minimal quality loss at 90%+ savings vs premium models.
Best for Code Generation
DeepSeek V4 Pro ($0.44/$0.87). Best price-to-quality ratio for coding. 1M context window handles large codebases. For simpler completions, Mistral Small 4 ($0.15/$0.60) is a strong budget pick.
Best for Long Context
Llama 4 Scout ($0.11/$0.34) with 10M context or Gemini 2.0 Flash ($0.10/$0.40) with 1M context. Both are extremely affordable for long-document processing.
Best for RAG Pipelines
DeepSeek V4 Flash ($0.14/$0.28). 1M context window, very low output cost. For RAG with shorter contexts, GPT-4o mini ($0.15/$0.60) is also competitive.
Best Quality-per-Dollar (Mid-Tier)
DeepSeek V4 Pro ($0.44/$0.87) or Mistral Large 3 ($0.50/$1.50). Both offer strong reasoning at a fraction of GPT-5/Claude Sonnet pricing.
How to Choose the Right Model
The cheapest model isn't always the best choice. Here's a decision framework:
- Start with the task complexity. Simple tasks (classification, extraction, formatting) → budget models. Complex reasoning, code generation, creative writing → mid-tier or premium.
- Consider output length. If your outputs are long (code, analysis), output pricing matters more. DeepSeek and Gemini have the lowest output costs.
- Check context window needs. If you're processing long documents, you need 1M+ context. Gemini Flash and Llama 4 Scout are the cheapest with large contexts.
- Test before committing. Use our Prompt Cost Calculator to model your exact workload across all 34 models.
Calculate Your Exact Costs
Paste your actual prompt and see costs across all 34 models. Find the cheapest option for your specific workload.
Open Calculator — FreeKey Takeaways
- Gemini 2.0 Flash Lite is the cheapest AI API at $0.075/$0.30 per 1M tokens
- Budget models are 30-400x cheaper than premium models for most tasks
- DeepSeek V4 Pro offers the best price-to-quality ratio for complex tasks
- Output pricing matters more than input pricing for most use cases
- Test with real prompts — use our calculator to find the sweet spot for your workload
Methodology
All prices sourced directly from provider pricing pages, verified May 29, 2026. Prices are per 1M tokens. We track 34 models across 10 providers: OpenAI, Anthropic, Google, DeepSeek, Mistral, Cohere, Meta (Together.ai), Moonshot, xAI, and AI21. Data is updated monthly. See pricing changelog →
Compare models: APIpulse Model Comparison — side-by-side pricing for 34 models across 10 providers. Free tool.