Gemini API Pricing History (2024–2026): How Token Costs Evolved + What You Pay Today

Lisa Ernst · 19.02.2026 · Artificial Intelligence · 9 min

Gemini’s pricing has changed meaningfully since the Gemini API went paid in 2024 — especially for high-volume “Flash” usage and long-context “Pro” workloads. If you build production features (agents, summarization pipelines, search grounding, coding assistants), understanding the price evolution matters as much as understanding today’s rate card.

Official current rates live here: Gemini Developer API pricing. This article focuses on the biggest price milestones (2024 → 2026) and then summarizes what you pay today.

On this page

Quick Summary
Pricing Timeline (2024–2026)
Major Price Drops (Before/After Tables)
Current Pricing Snapshot (Gemini 2.5 + Gemini 3 Preview)
Cost Examples (Real Numbers)
Cost-Saving Levers (Batch, Caching, Grounding)
FAQ

Quick Summary

2024 → 2026 = major repricing: Flash became dramatically cheaper (high volume), Pro got large cuts for <128K prompts in 2024, and modern 2.5 / 3 models introduced “thinking-token” accounting in output pricing.
Best “history view”: follow the timeline + the before/after tables below (official charts + numbers).
What you pay today: see the “Current Pricing Snapshot” table (2.5 Pro / 2.5 Flash / 2.5 Flash-Lite + Gemini 3 previews).
Optimization levers: Batch pricing can halve costs for non-urgent workloads; caching reduces repeated-context costs; grounding has separate per-request fees.

Tip: If your bill surprises you, it’s almost always because of output (and for “thinking” models, “thinking tokens” are counted in output pricing) or because long prompts cross a higher-tier threshold (historically: 128K; now commonly: 200K).

Pricing Timeline (2024–2026)

Google publishes current prices on the official pricing page and announces major changes on the Google Developers Blog. The table below highlights the most important milestones that affected real-world API spend.

Key pricing milestones
Effective	What changed	Impact	Official reference
May 2024	Paid pricing published	Baseline token prices for Gemini 1.5 Pro and Gemini 1.5 Flash were published for the paid tier. (Long prompts had a higher priced tier.)	Archived pricing page snapshot
Aug 12, 2024	Flash price drop	Massive reduction for Flash: input down to $0.075 / 1M, output to $0.30 / 1M (for prompts under 128K tokens), plus lower caching rates.	Google Developers Blog (Aug 8, 2024)
Oct 1, 2024	Pro price drop	Gemini 1.5 Pro got major reductions for prompts under 128K (and also lower long-context tier pricing). New under-128K: $1.25 input and $5.00 output per 1M tokens.	Google Developers Blog (Sep 24, 2024)
2026 (today)	2.5 + 3 preview pricing	Current pricing includes 2.5 family and Gemini 3 previews; some tables explicitly state output price (including thinking tokens).	Current pricing page

Major Price Drops (Before/After Tables)

These two charts are the cleanest way to “see” the price evolution. They come straight from official Google announcements and show the old vs. new rates for the same model.

Gemini 1.5 Flash pricing: before vs after (effective Aug 12, 2024)

Source: developers.googleblog.com

Gemini 1.5 Flash: large price drop effective Aug 12, 2024 (official chart). It shows input/output + caching changes for both ≤128K and >128K tiers.

Gemini 1.5 Pro pricing: before vs after (effective Oct 1, 2024)

Source: developers.googleblog.com

Gemini 1.5 Pro: price reduction effective Oct 1, 2024 (official chart). Under 128K prompts dropped from $3.50 → $1.25 input and $10.50 → $5.00 output per 1M tokens.

Gemini 1.5 price evolution (USD per 1M tokens)
Model	Effective	Tier	Input	Output	Source
Gemini 1.5 Flash	May 2024	≤128K tokens	$0.35	$0.53	Archived pricing page
Gemini 1.5 Flash	May 2024	>128K tokens	$0.70	$1.05	Archived pricing page
Gemini 1.5 Flash	Aug 12, 2024	≤128K tokens	$0.075	$0.30	Google Developers Blog
Gemini 1.5 Flash	Aug 12, 2024	>128K tokens	$0.15	$0.60	Google Developers Blog
Gemini 1.5 Pro	May 2024	≤128K tokens	$3.50	$10.50	Archived pricing page
Gemini 1.5 Pro	May 2024	>128K tokens	$7.00	$21.00	Archived pricing page
Gemini 1.5 Pro	Oct 1, 2024	≤128K tokens	$1.25	$5.00	Google Developers Blog
Gemini 1.5 Pro	Oct 1, 2024	>128K tokens	$2.50	$10.00	Google Developers Blog

Note: Gemini 1.5 models had an even larger context window (up to 2M tokens) and were frequently repriced as the product matured. For current production work, use the pricing page for today’s supported models.

Current Pricing Snapshot (Gemini 2.5 + Gemini 3 Preview)

Below is a “today view” (token prices per 1M tokens, USD) pulled from the official pricing page. Pay attention to the wording “output price (including thinking tokens)” on newer models.

Current token prices (USD per 1M tokens)
Model	Use case	Input price	Output price	Notes
Gemini 3 Pro Preview gemini-3-pro-preview	Most powerful multimodal + agentic workloads	$2.00 (≤200k) $4.00 (>200k)	$12.00 (≤200k) $18.00 (>200k) (incl. thinking tokens)	Preview model
Gemini 3 Flash Preview gemini-3-flash-preview	Fast “frontier” model with strong grounding/search	$0.50 (text/image/video) $1.00 (audio)	$3.00 (text) (incl. thinking tokens)	Preview model
Gemini 2.5 Pro gemini-2.5-pro	Best for coding + complex reasoning	$1.25 (≤200k) $2.50 (>200k)	$10.00 (≤200k) $15.00 (>200k) (incl. thinking tokens)	Batch is discounted
Gemini 2.5 Flash gemini-2.5-flash	High volume, low latency, strong price/perf	$0.30 (text/image/video) $1.00 (audio)	$2.50 (incl. thinking tokens)	Batch: $0.15 in / $1.25 out
Gemini 2.5 Flash-Lite gemini-2.5-flash-lite	Cheapest at scale (classification, extraction)	$0.10 (text/image/video) $0.30 (audio)	$0.40 (incl. thinking tokens)	Great for “utility” jobs

Source: Gemini Developer API pricing. (This pricing page also lists image token assumptions like 560 tokens per image for certain pricing calculations.)

Cost Examples (Real Numbers)

Costs scale linearly with tokens. A quick mental model: cost ≈ (input_tokens / 1,000,000) × input_rate + (output_tokens / 1,000,000) × output_rate.

Example costs (approx.)
Scenario	Model	Tokens	Estimated cost
Summarize documents at scale	gemini-2.5-flash	50k input + 10k output	≈ (0.05 × $0.30) + (0.01 × $2.50) = $0.04
Complex reasoning / coding assist	gemini-2.5-pro	20k input + 10k output	≈ (0.02 × $1.25) + (0.01 × $10.00) = $0.13
Extraction / tagging (cheap utility)	gemini-2.5-flash-lite	30k input + 3k output	≈ (0.03 × $0.10) + (0.003 × $0.40) = $0.0042

Cost-Saving Levers (Batch, Caching, Grounding)

Batch pricing: Many models offer a discounted batch tier for non-urgent workloads (great for nightly processing).
Context caching: If you reuse large system prompts or documents, caching can reduce repeated input cost; you pay caching + storage rates.
Grounding: Search/Maps grounding is billed separately (per grounded prompt / search query), so treat it as a distinct budget line item.
Token discipline: Shorten outputs (limits), reduce unnecessary context, and chunk long docs intelligently.

FAQ

Why do “output” costs dominate so often?

Output tokens are typically priced higher than input tokens. For newer “thinking” models, the pricing tables may explicitly state that output pricing includes “thinking tokens”, which can increase total output billing if you allow large reasoning budgets.

How do long prompts affect pricing?

Historically, Gemini applied higher prices above certain context thresholds (e.g., 128K for Gemini 1.5; 200K for several current models). If your prompt crosses the threshold, the request can be billed at the higher tier.

Where can I verify today’s prices?

Always verify against the official pricing page: Gemini Developer API pricing. Major changes are usually announced on the Google Developers Blog.

Conclusion

The most useful way to understand Gemini API costs is to combine history (what changed and when) with the current rate card (what you pay today). Flash saw huge price drops for high-volume workloads in 2024, Pro pricing was reduced significantly for typical prompt sizes, and modern 2.5/3 models provide clear pricing tables including “thinking token” output accounting. If you run Gemini at scale, focus on (1) output control, (2) batch for offline workloads, and (3) caching for repeated context.

Gemini API Pricing History (2024–2026): How Token Costs Evolved + What You Pay Today

Quick Summary

Pricing Timeline (2024–2026)

Major Price Drops (Before/After Tables)

Current Pricing Snapshot (Gemini 2.5 + Gemini 3 Preview)

Cost Examples (Real Numbers)

Cost-Saving Levers (Batch, Caching, Grounding)

FAQ

Conclusion

About Zerlo

Links

Social Media