DeepSeek V4 API Pricing: The Cheapest AI API?

DeepSeek just launched V4 with two tiers: a Pro model at $2.18/$8.72 and a Flash model at just $0.14/$0.28 per 1M tokens. That Flash price undercuts nearly every competitor. Let's break down what you actually get.

Update (May 2, 2026): DeepSeek V4 Pro has dropped by 75% — now just $0.44/$0.87 per 1M tokens. Read our May 2026 Pricing Shakeup analysis.

The Full DeepSeek V4 Lineup

V4 Pro
$2.18 / $8.72
Input / Output per 1M tokens

128K context

V4 Flash
$0.14 / $0.28
Input / Output per 1M tokens

128K context

V3 (Legacy)
$0.27 / $1.10
Input / Output per 1M tokens

128K context

DeepSeek V4 Flash at $0.14/$0.28 is aggressively cheap. The input price is competitive with Gemini 2.0 Flash ($0.10) and Mistral Small 4 ($0.10), but the output price of $0.28 is the lowest among all major budget models — 53% cheaper than GPT-4o mini's $0.60 output.

Budget Model Showdown

Here's how DeepSeek V4 Flash compares to every major budget-tier API:

ModelInput/1MOutput/1MContextBlended*
DeepSeek V4 Flash$0.14$0.28128K$0.19
Gemini 2.0 Flash$0.10$0.401M$0.20
Mistral Small 4$0.10$0.3032K$0.17
GPT-oss 20B$0.08$0.35128K$0.17
GPT-4o mini$0.15$0.60128K$0.30
Claude Haiku 4.5$1.00$5.00200K$1.90

*Blended cost assumes a 3:1 input-to-output ratio, typical for chat workloads.

The real story: output pricing

Input prices across budget models are clustered between $0.08 and $0.15 — the differences are small. The real gap is on the output side, where prices range from $0.28 (DeepSeek V4 Flash) to $4.00 (Claude Haiku). If your workload is output-heavy, DeepSeek V4 Flash can save you up to 93% compared to Haiku.

Cost Comparison by Use Case

1. Chatbot (500 requests/day, 1500 input + 800 output tokens)

ModelInput/moOutput/moTotal/mo
DeepSeek V4 Flash$10.50$33.60$44.10
Gemini 2.0 Flash$7.50$48.00$55.50
GPT-4o mini$11.25$72.00$83.25
Claude Haiku 4.5$75.00$600.00$675.00

Winner: DeepSeek V4 Flash — $44/month for a chatbot processing 15K requests. That's 20% cheaper than Gemini Flash and 94% cheaper than Claude Haiku.

2. Code Assistant (200 requests/day, 2000 input + 1500 output tokens)

ModelInput/moOutput/moTotal/mo
DeepSeek V4 Flash$5.60$25.20$30.80
Gemini 2.0 Flash$4.00$36.00$40.00
GPT-4o mini$6.00$54.00$60.00
Claude Haiku 4.5$40.00$450.00$490.00

Winner: DeepSeek V4 Flash — $31/month for a code assistant. Output-heavy workloads amplify the savings.

3. Document Classification (1000 requests/day, 500 input + 100 output tokens)

ModelInput/moOutput/moTotal/mo
GPT-oss 20B$1.20$1.05$2.25
DeepSeek V4 Flash$2.10$0.84$2.94
Gemini 2.0 Flash$1.50$1.20$2.70
GPT-4o mini$2.25$1.80$4.05

Winner: GPT-oss 20B — for input-heavy classification tasks, the cheapest input price ($0.08) wins. DeepSeek V4 Flash is a close second.

DeepSeek V4 Pro: The Mid-Tier Option

At $2.18/$8.72, DeepSeek V4 Pro sits in mid-tier territory. How does it compare?

ModelInput/1MOutput/1MContext
DeepSeek V4 Pro$2.18$8.72128K
Claude Sonnet 4$3.00$15.00200K
GPT-4o$2.50$10.00128K
Gemini 2.5 Pro$1.25$10.001M
Mistral Large 3$2.00$6.00128K

DeepSeek V4 Pro is cheaper than Claude Sonnet 4 and GPT-4o on both input and output. It's more expensive than Gemini 2.5 Pro on input but cheaper on output. For teams that need stronger reasoning than budget models but don't want to pay flagship prices, V4 Pro is a compelling middle ground.

When to Choose DeepSeek V4

When to Avoid DeepSeek

The Data Residency Question

Where does your data go?

DeepSeek is a Chinese company. While they offer an API hosted on international infrastructure, organizations with strict data residency requirements (GDPR, HIPAA, SOC 2) should verify where their data is processed. For sensitive workloads, consider US/EU-based alternatives even if they cost more.

Cost Optimization with DeepSeek

  1. Tier your models: Use V4 Flash for high-volume simple tasks, V4 Pro for complex reasoning
  2. Set max_tokens: At $0.28/1M output, runaway generation is still wasteful at scale
  3. Cache prompts: DeepSeek supports prompt caching — reuse system prompts to reduce input costs
  4. Batch processing: For offline workloads, batch requests to maximize throughput

Calculate your DeepSeek costs: Use our free calculator to see exactly what DeepSeek V4 would cost for your workload — and compare it side-by-side with every other provider.

Try the APIpulse Calculator