Gemini API Pricing History (2024–2026): How Token Costs Evolved + What You Pay Today

Avatar
Lisa Ernst · 19.02.2026 · Artificial Intelligence · 9 min

Gemini’s pricing has changed meaningfully since the Gemini API went paid in 2024 — especially for high-volume “Flash” usage and long-context “Pro” workloads. If you build production features (agents, summarization pipelines, search grounding, coding assistants), understanding the price evolution matters as much as understanding today’s rate card.

Official current rates live here: Gemini Developer API pricing. This article focuses on the biggest price milestones (2024 → 2026) and then summarizes what you pay today.

Quick Summary

Tip: If your bill surprises you, it’s almost always because of output (and for “thinking” models, “thinking tokens” are counted in output pricing) or because long prompts cross a higher-tier threshold (historically: 128K; now commonly: 200K).

Pricing Timeline (2024–2026)

Google publishes current prices on the official pricing page and announces major changes on the Google Developers Blog. The table below highlights the most important milestones that affected real-world API spend.

Key pricing milestones
Effective What changed Impact Official reference
May 2024 Paid pricing published Baseline token prices for Gemini 1.5 Pro and Gemini 1.5 Flash were published for the paid tier. (Long prompts had a higher priced tier.) Archived pricing page snapshot
Aug 12, 2024 Flash price drop Massive reduction for Flash: input down to $0.075 / 1M, output to $0.30 / 1M (for prompts under 128K tokens), plus lower caching rates. Google Developers Blog (Aug 8, 2024)
Oct 1, 2024 Pro price drop Gemini 1.5 Pro got major reductions for prompts under 128K (and also lower long-context tier pricing). New under-128K: $1.25 input and $5.00 output per 1M tokens. Google Developers Blog (Sep 24, 2024)
2026 (today) 2.5 + 3 preview pricing Current pricing includes 2.5 family and Gemini 3 previews; some tables explicitly state output price (including thinking tokens). Current pricing page

Major Price Drops (Before/After Tables)

These two charts are the cleanest way to “see” the price evolution. They come straight from official Google announcements and show the old vs. new rates for the same model.

Gemini 1.5 Flash pricing: before vs after (effective Aug 12, 2024)

Source: developers.googleblog.com

Gemini 1.5 Flash: large price drop effective Aug 12, 2024 (official chart). It shows input/output + caching changes for both ≤128K and >128K tiers.

Gemini 1.5 Pro pricing: before vs after (effective Oct 1, 2024)

Source: developers.googleblog.com

Gemini 1.5 Pro: price reduction effective Oct 1, 2024 (official chart). Under 128K prompts dropped from $3.50 → $1.25 input and $10.50 → $5.00 output per 1M tokens.

Gemini 1.5 price evolution (USD per 1M tokens)
Model Effective Tier Input Output Source
Gemini 1.5 Flash May 2024 ≤128K tokens $0.35 $0.53 Archived pricing page
Gemini 1.5 Flash May 2024 >128K tokens $0.70 $1.05 Archived pricing page
Gemini 1.5 Flash Aug 12, 2024 ≤128K tokens $0.075 $0.30 Google Developers Blog
Gemini 1.5 Flash Aug 12, 2024 >128K tokens $0.15 $0.60 Google Developers Blog
Gemini 1.5 Pro May 2024 ≤128K tokens $3.50 $10.50 Archived pricing page
Gemini 1.5 Pro May 2024 >128K tokens $7.00 $21.00 Archived pricing page
Gemini 1.5 Pro Oct 1, 2024 ≤128K tokens $1.25 $5.00 Google Developers Blog
Gemini 1.5 Pro Oct 1, 2024 >128K tokens $2.50 $10.00 Google Developers Blog

Note: Gemini 1.5 models had an even larger context window (up to 2M tokens) and were frequently repriced as the product matured. For current production work, use the pricing page for today’s supported models.

Current Pricing Snapshot (Gemini 2.5 + Gemini 3 Preview)

Below is a “today view” (token prices per 1M tokens, USD) pulled from the official pricing page. Pay attention to the wording “output price (including thinking tokens)” on newer models.

Current token prices (USD per 1M tokens)
Model Use case Input price Output price Notes
Gemini 3 Pro Preview
gemini-3-pro-preview
Most powerful multimodal + agentic workloads $2.00 (≤200k)
$4.00 (>200k)
$12.00 (≤200k)
$18.00 (>200k)
(incl. thinking tokens)
Preview model
Gemini 3 Flash Preview
gemini-3-flash-preview
Fast “frontier” model with strong grounding/search $0.50 (text/image/video)
$1.00 (audio)
$3.00 (text)
(incl. thinking tokens)
Preview model
Gemini 2.5 Pro
gemini-2.5-pro
Best for coding + complex reasoning $1.25 (≤200k)
$2.50 (>200k)
$10.00 (≤200k)
$15.00 (>200k)
(incl. thinking tokens)
Batch is discounted
Gemini 2.5 Flash
gemini-2.5-flash
High volume, low latency, strong price/perf $0.30 (text/image/video)
$1.00 (audio)
$2.50
(incl. thinking tokens)
Batch: $0.15 in / $1.25 out
Gemini 2.5 Flash-Lite
gemini-2.5-flash-lite
Cheapest at scale (classification, extraction) $0.10 (text/image/video)
$0.30 (audio)
$0.40
(incl. thinking tokens)
Great for “utility” jobs

Source: Gemini Developer API pricing. (This pricing page also lists image token assumptions like 560 tokens per image for certain pricing calculations.)

Cost Examples (Real Numbers)

Costs scale linearly with tokens. A quick mental model: cost ≈ (input_tokens / 1,000,000) × input_rate + (output_tokens / 1,000,000) × output_rate.

Example costs (approx.)
Scenario Model Tokens Estimated cost
Summarize documents at scale gemini-2.5-flash 50k input + 10k output ≈ (0.05 × $0.30) + (0.01 × $2.50) = $0.04
Complex reasoning / coding assist gemini-2.5-pro 20k input + 10k output ≈ (0.02 × $1.25) + (0.01 × $10.00) = $0.13
Extraction / tagging (cheap utility) gemini-2.5-flash-lite 30k input + 3k output ≈ (0.03 × $0.10) + (0.003 × $0.40) = $0.0042

Cost-Saving Levers (Batch, Caching, Grounding)

FAQ

Why do “output” costs dominate so often?

Output tokens are typically priced higher than input tokens. For newer “thinking” models, the pricing tables may explicitly state that output pricing includes “thinking tokens”, which can increase total output billing if you allow large reasoning budgets.

How do long prompts affect pricing?

Historically, Gemini applied higher prices above certain context thresholds (e.g., 128K for Gemini 1.5; 200K for several current models). If your prompt crosses the threshold, the request can be billed at the higher tier.

Where can I verify today’s prices?

Always verify against the official pricing page: Gemini Developer API pricing. Major changes are usually announced on the Google Developers Blog.

Conclusion

The most useful way to understand Gemini API costs is to combine history (what changed and when) with the current rate card (what you pay today). Flash saw huge price drops for high-volume workloads in 2024, Pro pricing was reduced significantly for typical prompt sizes, and modern 2.5/3 models provide clear pricing tables including “thinking token” output accounting. If you run Gemini at scale, focus on (1) output control, (2) batch for offline workloads, and (3) caching for repeated context.

Share our post!
Sources