AI just got incredibly cheap. In June 2026, there are 15 API models that cost under $1 per million input tokens, up from just 3 a year ago. The cheapest input price is $0.075/M, which works out to roughly 13 million input tokens for a single dollar. This is the complete, sorted list of every model under $1/M.
We track 15 models across 7 providers that fall under the $1/M threshold. All prices are per 1M tokens, verified as of June 10, 2026.
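The "tokens per dollar" figure is simple division, and a minimal sketch makes the rounding explicit. The function name `tokens_per_dollar` is hypothetical, chosen for illustration; the only input taken from the article is the $0.075/M price.

```python
def tokens_per_dollar(price_per_million: float) -> float:
    """Tokens you can buy for $1 at a given price in dollars per 1M tokens."""
    return 1_000_000 / price_per_million


# Cheapest input price quoted in this article: $0.075 per 1M input tokens.
millions = tokens_per_dollar(0.075) / 1_000_000
print(f"{millions:.1f}M input tokens per dollar")  # prints "13.3M input tokens per dollar"
```

Note that this covers input tokens only; output tokens are billed separately at the output price.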
Complete Ranking: All 15 Models Under $1/M
Every AI API model priced under $1/M input tokens, ranked by total cost (input + output) per 1M tokens. Sorted cheapest first.
| # | Model | Input | Output | Total | Context | Provider |
|---|---|---|---|---|---|---|
| 1 | Llama 3.1 8B | $0.10 | $0.10 | $0.20 | 128K | Meta |
| 2 | Gemini 2.0 Flash Lite | $0.075 | $0.30 | $0.375 | 1M | Google |
| 3 | DeepSeek V4 Flash | $0.14 | $0.28 | $0.42 | 1M | DeepSeek |
| 4 | GPT-oss 20B | $0.08 | $0.35 | $0.43 | 128K | OpenAI |
| 5 | Gemini 2.0 Flash | $0.10 | $0.40 | $0.50 | 1M | Google |
| 6 | DeepSeek V3.2 | $0.23 | $0.34 | $0.57 | 128K | DeepSeek |
| 7 | GPT-oss 120B | $0.15 | $0.60 | $0.75 | 128K | OpenAI |
| 8 | GPT-4o mini | $0.15 | $0.60 | $0.75 | 128K | OpenAI |
| 9 | Mistral Small 4 | $0.15 | $0.60 | $0.75 | 128K | Mistral |
| 10 | Llama 4 Scout | $0.18 | $0.59 | $0.77 | 1M | Meta |
| 11 | Llama 4 Maverick | $0.27 | $0.85 | $1.12 | 1M | Meta |
| 12 | DeepSeek V4 Pro | $0.435 | $0.87 | $1.305 | 1M | DeepSeek |
| 13 | Mistral Large 3 | $0.50 | $1.50 | $2.00 | 262K | Mistral |
| 14 | Command R | $0.50 | $1.50 | $2.00 | 128K | Cohere |
| 15 | Kimi K2.6 | $0.95 | $4.00 | $4.95 | 256K | Moonshot |
Notice something interesting: the cheapest model by input price isn't the cheapest by total cost. Llama 3.1 8B charges $0.10/M for both input and output, giving it the lowest total at $0.20. Gemini 2.0 Flash Lite has the lowest input price at $0.075/M, but its $0.30/M output price is three times Llama 3.1 8B's.
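To see how the two rankings diverge, here is a small sketch that sorts the table's prices both ways. The data is copied from the table above; the variable and function names are illustrative, not part of any APIpulse tooling.

```python
# (model, input $/1M, output $/1M), copied from the ranking table.
PRICES = [
    ("Llama 3.1 8B", 0.10, 0.10),
    ("Gemini 2.0 Flash Lite", 0.075, 0.30),
    ("GPT-oss 20B", 0.08, 0.35),
    ("DeepSeek V4 Flash", 0.14, 0.28),
    ("Gemini 2.0 Flash", 0.10, 0.40),
    ("DeepSeek V3.2", 0.23, 0.34),
    ("GPT-oss 120B", 0.15, 0.60),
    ("GPT-4o mini", 0.15, 0.60),
    ("Mistral Small 4", 0.15, 0.60),
    ("Llama 4 Scout", 0.18, 0.59),
    ("Llama 4 Maverick", 0.27, 0.85),
    ("DeepSeek V4 Pro", 0.435, 0.87),
    ("Mistral Large 3", 0.50, 1.50),
    ("Command R", 0.50, 1.50),
    ("Kimi K2.6", 0.95, 4.00),
]


def total(row):
    """Input plus output price, rounded to avoid float noise like 0.42000000000000004."""
    _, input_price, output_price = row
    return round(input_price + output_price, 3)


by_total = sorted(PRICES, key=total)
by_input = sorted(PRICES, key=lambda row: row[1])

cheapest_total = by_total[0][0]
cheapest_input = by_input[0][0]
print("Cheapest by total:", cheapest_total)  # Llama 3.1 8B
print("Cheapest by input:", cheapest_input)  # Gemini 2.0 Flash Lite
```

Which ranking matters depends on your workload: the "total" column implicitly assumes equal input and output volume, which few real workloads have.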
Calculate Your Exact Costs
Enter your token usage and see exactly how much each of these 15 models costs for your workload.
Open Cost Calculator →
Price Tier Breakdown
These 15 models split into three clear tiers based on input pricing. Here's what you get at each level.
Under $0.15/M input
- Gemini 2.0 Flash Lite — $0.075/$0.30 · 1M context · Google's cheapest model
- GPT-oss 20B — $0.08/$0.35 · 128K context · OpenAI's open-source entry
- Llama 3.1 8B — $0.10/$0.10 · 128K context · Open source, lowest total cost
- Gemini 2.0 Flash — $0.10/$0.40 · 1M context · Google's balanced budget option
- DeepSeek V4 Flash — $0.14/$0.28 · 1M context · Best budget coding model
Best for: high-volume classification, simple Q&A, data extraction, embedding pipelines
$0.15 — $0.30/M input
- GPT-oss 120B — $0.15/$0.60 · 128K context · Strong general-purpose
- GPT-4o mini — $0.15/$0.60 · 128K context · OpenAI's budget workhorse
- Mistral Small 4 — $0.15/$0.60 · 128K context · EU data sovereignty
- Llama 4 Scout — $0.18/$0.59 · 1M context · Open weights, Llama 4 Community License
- DeepSeek V3.2 — $0.23/$0.34 · 128K context · Proven production model
- Llama 4 Maverick — $0.27/$0.85 · 1M context · Open source flagship
Best for: chatbots, content generation, RAG pipelines, code assistance
$0.40 — $1.00/M input
- DeepSeek V4 Pro — $0.435/$0.87 · 1M context · Best value for complex tasks
- Mistral Large 3 — $0.50/$1.50 · 262K context · Strong at RAG and retrieval
- Command R — $0.50/$1.50 · 128K context · Cohere's enterprise RAG model
- Kimi K2.6 — $0.95/$4.00 · 256K context · Excellent reasoning capabilities
Best for: code generation, complex analysis, RAG, nuanced writing
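If you want to bucket models into these tiers programmatically, a sketch along the following lines would do it. The thresholds come from the tier headings above; the function name and tier labels are hypothetical. One assumption to flag: no listed model falls between $0.30 and $0.40, so this sketch folds that gap into the top tier.

```python
from collections import defaultdict

# Input prices ($ per 1M tokens), copied from the tier lists above.
INPUT_PRICES = {
    "Gemini 2.0 Flash Lite": 0.075, "GPT-oss 20B": 0.08, "Llama 3.1 8B": 0.10,
    "Gemini 2.0 Flash": 0.10, "DeepSeek V4 Flash": 0.14, "GPT-oss 120B": 0.15,
    "GPT-4o mini": 0.15, "Mistral Small 4": 0.15, "Llama 4 Scout": 0.18,
    "DeepSeek V3.2": 0.23, "Llama 4 Maverick": 0.27, "DeepSeek V4 Pro": 0.435,
    "Mistral Large 3": 0.50, "Command R": 0.50, "Kimi K2.6": 0.95,
}


def price_tier(input_price: float) -> str:
    """Map an input price to one of the article's three tiers."""
    if input_price < 0.15:
        return "under $0.15"
    if input_price <= 0.30:
        return "$0.15 to $0.30"
    if input_price < 1.00:
        # Assumption: prices in the empty $0.30-$0.40 gap land here too.
        return "$0.40 to $1.00"
    raise ValueError("not a sub-$1 model")


tiers = defaultdict(list)
for model, price in INPUT_PRICES.items():
    tiers[price_tier(price)].append(model)

for label, models in tiers.items():
    print(f"{label}: {len(models)} models")
```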
Use Case Recommendations
Different tasks need different models. Here's the best sub-$1 model for each major use case.
Chatbot
DeepSeek V4 Flash ($0.14/$0.28): cheapest model that handles multi-turn conversations naturally. Used in production by thousands of apps.
Code Generation
DeepSeek V4 Pro ($0.435/$0.87): outperforms GPT-4o on coding benchmarks at 80% less cost. Best value coding model under $1.
RAG Pipeline
Command R ($0.50/$1.50): purpose-built for retrieval-augmented generation with strong context following and factual accuracy.
Content Writing
Kimi K2.6 ($0.95/$4.00): excellent reasoning and long-form generation. The most capable writing model under $1/M input.
Data Extraction
Llama 3.1 8B ($0.10/$0.10): lowest total cost at $0.20/M. Perfect for high-volume structured extraction tasks.
Multilingual
Mistral Small 4 ($0.15/$0.60): strong multilingual support with EU data sovereignty. Handles 30+ languages well.
No Vendor Lock-in
Llama 4 Scout ($0.18/$0.59): open weights under the Llama 4 Community License, self-hostable, 1M context window. Full control over your stack.
Highest Volume
Gemini 2.0 Flash Lite ($0.075/$0.30): cheapest input price of any model. When you need to process millions of tokens daily.
Track Every Dollar with APIpulse Pro
Set cost alerts, compare models in real-time, and optimize your API spend across all 15 budget models. $29/month.
Get APIpulse Pro →
Provider Breakdown
Seven providers offer models under $1/M input in June 2026. Here's how they compare.
- Google — 2 models. Cheapest input price with Gemini Flash Lite ($0.075). Both models have 1M context windows.
- OpenAI — 3 models. GPT-oss 20B and 120B are open-source. GPT-4o mini is the industry standard budget model.
- Meta — 3 models. Llama 3.1 8B is the lowest total cost. Llama 4 Scout and Maverick both have 1M context and ship as open weights under the Llama 4 Community License.
- DeepSeek — 3 models. V4 Flash ($0.14) is best for coding. V3.2 is proven in production. V4 Pro is best value for complex tasks.
- Mistral — 2 models. EU-based with data sovereignty. Mistral Small 4 competes directly with GPT-4o mini.
- Cohere — 1 model. Command R at $0.50 is purpose-built for RAG and enterprise search.
- Moonshot — 1 model. Kimi K2.6 at $0.95 is the most capable reasoning model under $1/M input.
Compare All 53 Models Side by Side
Our comparison tool lets you filter by price, context window, provider, and capabilities across every tracked model.
Open Comparison Tool →
The Bottom Line
The era of expensive AI is over. With 15 models under $1/M input tokens, every startup and indie developer can afford production-quality AI. The cheapest input price, Gemini 2.0 Flash Lite at $0.075/M, lets you process about 13 million input tokens for a dollar.
Here's the quick decision tree for choosing among these 15 models:
- Lowest possible cost → Llama 3.1 8B ($0.10/$0.10, total $0.20)
- Cheapest input for high volume → Gemini 2.0 Flash Lite ($0.075 input)
- Best production chatbot → DeepSeek V4 Flash ($0.14/$0.28)
- Best budget coding → DeepSeek V4 Pro ($0.435/$0.87)
- No vendor lock-in → Llama 4 Scout ($0.18/$0.59, open weights)
- EU data sovereignty → Mistral Small 4 ($0.15/$0.60)
- Best RAG on a budget → Command R ($0.50/$1.50)
- Best reasoning under $1 → Kimi K2.6 ($0.95/$4.00)
Use the APIpulse cost calculator to model your exact usage and find the cheapest model that meets your quality bar.
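For a rough feel of the arithmetic behind any such calculator, here is a minimal sketch that prices a workload from its input and output token counts. It is not how the APIpulse calculator is implemented; the function names are hypothetical, the four models are a sample from the table, and only list prices are considered (no quality, latency, or discount modelling).

```python
# model -> (input $/1M, output $/1M), a sample copied from the ranking table.
PRICES = {
    "Llama 3.1 8B": (0.10, 0.10),
    "Gemini 2.0 Flash Lite": (0.075, 0.30),
    "DeepSeek V4 Flash": (0.14, 0.28),
    "GPT-4o mini": (0.15, 0.60),
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a workload at list price."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest(input_tokens: int, output_tokens: int) -> str:
    """Model with the lowest list-price cost for this token mix."""
    return min(PRICES, key=lambda m: workload_cost(m, input_tokens, output_tokens))


# Balanced workload: 10M tokens in, 10M tokens out.
print(cheapest(10_000_000, 10_000_000))   # Llama 3.1 8B
# Input-heavy workload: 100M tokens in, 5M tokens out.
print(cheapest(100_000_000, 5_000_000))   # Gemini 2.0 Flash Lite
```

The two results differ because of the input/output mix. Comparing these two models directly, Gemini 2.0 Flash Lite only beats Llama 3.1 8B once input volume exceeds eight times output volume (0.075i + 0.30o < 0.10i + 0.10o simplifies to i > 8o).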