What are the new Gemini models released in June 2026?

Google released 3 new models: Gemini 3.1 Flash-Lite ($0.25/$1.50 per 1M tokens, 1M context), Gemini 3 Flash ($0.50/$3.00 per 1M tokens, 1M context), and Gemini 2.5 Flash-Lite ($0.10/$0.40 per 1M tokens, 1M context). All three have 1M token context windows.

Which new Gemini model is cheapest?

Gemini 2.5 Flash-Lite is the cheapest at $0.10/$0.40 per 1M tokens — matching the deprecated Gemini 2.0 Flash pricing but with a 1M context window. Gemini 3.1 Flash-Lite at $0.25/$1.50 is the best budget pick for production workloads.

Are the old Gemini 2.0 models deprecated?

Yes. Google has marked Gemini 2.0 Flash and Gemini 2.0 Flash Lite as deprecated. Gemini 2.0 Flash is replaced by Gemini 3 Flash ($0.50/$3.00) and Gemini 2.0 Flash Lite is replaced by Gemini 3.1 Flash-Lite ($0.25/$1.50). Both new models have larger 1M context windows.

Blog · Jun 12, 2026

3 New Gemini Models: Flash-Lite, 3 Flash & 2.5 Flash-Lite — Pricing & Comparison

⚠️ Claude 4 Deprecation Alert: Claude 4 models retire on June 15, 2026 (). If you use Claude 4, see our last-chance migration guide or use the deprecation calculator.

Google dropped 3 new Gemini models this week — all with 1M context windows, all cheaper than you'd expect. Here's what each one costs and when to use it.

Google quietly released three new Gemini API models in June 2026, bringing the total Gemini lineup to 8 models. The big news: all three have 1M token context windows and aggressive budget pricing. If you're still using Gemini 2.0 Flash (now deprecated), it's time to upgrade.

The 3 New Models at a Glance

Model	Tier	Input	Output	Context	Replaces
Gemini 2.5 Flash-Lite	Budget	$0.10	$0.40	1M	Gemini 2.0 Flash-Lite
Gemini 3.1 Flash-Lite	Budget	$0.25	$1.50	1M	—
Gemini 3 Flash	Budget	$0.50	$3.00	1M	Gemini 2.0 Flash

All prices per 1M tokens. Verified Jun 12, 2026. Gemini 2.0 Flash and Gemini 2.0 Flash Lite are now deprecated by Google.

1. Gemini 2.5 Flash-Lite — The Cheapest Option

At $0.10/$0.40 per 1M tokens, this is the cheapest Gemini model ever — and one of the cheapest LLM APIs on the market. It matches the deprecated Gemini 2.0 Flash pricing but adds a massive 1M token context window (up from 1M on 2.0 Flash, same size but newer architecture).

Best for: High-volume classification, content moderation, chatbot backends, embeddings preprocessing — anything where you're processing millions of tokens and cost is the primary concern.

Trade-off: As a "Lite" model, it may not handle complex reasoning as well as Gemini 3 Flash or Gemini 2.5 Pro. Use it for simple, high-volume tasks.

2. Gemini 3.1 Flash-Lite — The Production Budget Pick

At $0.25/$1.50 per 1M tokens, Gemini 3.1 Flash-Lite sits in a sweet spot: 2.5x more expensive than 2.5 Flash-Lite on input, but with better quality output (1.50 vs 0.40 — that's 3.75x more, reflecting better generation quality). The 1M context window handles long documents and codebases with ease.

Best for: Production workloads that need good quality at low cost — chatbots, content generation, summarization, RAG pipelines. The go-to replacement for deprecated Gemini 2.0 Flash Lite.

Trade-off: Output at $1.50/M is 3.75x the input cost. For output-heavy workloads, Gemini 2.5 Flash-Lite ($0.40 output) is cheaper — but lower quality.

3. Gemini 3 Flash — The Speed Demon

At $0.50/$3.00 per 1M tokens, Gemini 3 Flash is the replacement for deprecated Gemini 2.0 Flash. It's 5x more expensive on input ($0.50 vs $0.10) but brings the Gemini 3 architecture: faster inference, better multimodal support, and the full 1M context window.

Best for: Applications that need speed + quality at a reasonable price. Great for real-time chat, code generation, multimodal tasks (image + text), and Google Cloud workloads.

Trade-off: 5x the cost of Gemini 2.5 Flash-Lite. If you don't need the Gemini 3 architecture improvements, the cheaper models are fine.

Monthly Cost at 1,000 Requests/Day

Assuming 1,000 input tokens and 500 output tokens per request:

2.5 Flash-Lite

$4.00

3.1 Flash-Lite

$22.50

3 Flash

$45.00

Claude Haiku 4.5

$35.00

GPT-4o mini

$9.00

How These Compare to the Competition

Here's how the new Gemini budget models stack up against other cheap options:

Gemini 2.5 Flash-Lite vs DeepSeek V4 Flash: Gemini is cheaper on input ($0.10 vs $0.14) but more expensive on output ($0.40 vs $0.28). DeepSeek wins on output-heavy tasks.
Gemini 3.1 Flash-Lite vs Claude Haiku 4.5: Gemini is 75% cheaper on input ($0.25 vs $1.00) and 70% cheaper on output ($1.50 vs $5.00). Haiku has better reasoning, but for most tasks Gemini is the better value.
Gemini 3 Flash vs GPT-4o mini: GPT-4o mini is cheaper at $0.15/$0.60 vs Gemini 3 Flash at $0.50/$3.00. But Gemini 3 Flash has 1M context vs GPT-4o mini's 128K — 8x more context for 3x the price.
All 3 vs Mistral Small 4: Mistral Small 4 at $0.10/$0.30 is cheaper on input than Gemini 3.1 Flash-Lite ($0.25/$1.50) but has only 128K context vs 1M. For long documents, the new Gemini models win.

Which One Should You Use?

Absolute cheapest

Gemini 2.5 Flash-Lite — $0.10/$0.40, can't beat it

Best quality per dollar

Gemini 3.1 Flash-Lite — $0.25/$1.50 with 1M context

Speed + quality balance

Gemini 3 Flash — $0.50/$3.00, fastest inference

Need 1M context on a budget

Any of these — all have 1M windows

The Bigger Picture

These 3 new models push the APIpulse database to 42 models across 10 providers. The Gemini lineup now covers every price point from $0.10 to $12.50 per 1M input tokens — the widest range of any single provider.

The trend is clear: 1M context windows are becoming the standard, even at budget price points. If you're still using models with 128K context, you're paying for less capability than you could get.

With Claude 4 shutting down June 15, now is the perfect time to evaluate these new Gemini options. Use our deprecation calculator to see how much you'd save by switching.

Compare all 42 models side by side

Open Cost Calculator

Pricing data verified Jun 12, 2026. Prices per 1M tokens. See our full pricing table for all 42 models.