RAG chatbot cost calculator

Estimate retrieval-heavy chatbot costs across prompt context, generated answers, cache hit rate, and model choice.


Scenario calculator

Calculate and compare

Retrieval-heavy chatbot with cacheable context.

requests / month
input tokens / request
output tokens / request
Estimated monthly cost: $45.04

Gemini 3.5 Flash

Annual budget: $540.44

Based on 12 months

Cost per user: $0.0038

Per active user / month

Cost per task: $0.0005

Per modeled task

Cheapest alternative: Gemini 3.5 Flash

$45.04 / month

Best value candidate: Gemini 3.5 Flash

$45.04 / month
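The summary figures above follow from the monthly cost by simple arithmetic. The sketch below is a minimal illustration, not the calculator's actual code: `derived_metrics` is a hypothetical helper, and the 100,000 requests per month are assumed from the default preset. Cost per user is left out because the page does not state the active user count.

```python
def derived_metrics(monthly_cost: float, requests_per_month: int) -> dict:
    """Derive annual budget and per-task cost from a monthly estimate.

    Hypothetical helper for illustration; the page does not publish its formula.
    """
    return {
        # Annual budget is the monthly cost over 12 months.
        "annual_budget": round(monthly_cost * 12, 2),
        # One request is treated as one modeled task (assumption).
        "cost_per_task": round(monthly_cost / requests_per_month, 4),
    }


metrics = derived_metrics(45.04, 100_000)
print(metrics)
```

Multiplying the displayed $45.04 by 12 gives $540.48 rather than the $540.44 shown, which would be consistent with the page multiplying an unrounded monthly figure before rounding for display.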

Data last verified Jun 26, 2026. Source status is visible on every result. Watchlist prices are not procurement advice.

Gemini 3.5 Flash has watchlist status; verify sources before procurement. Cached input pricing is not confirmed for this model, so caching is excluded from its estimate. Llama 3.1 70B Instruct is aggregator-verified; verify sources before procurement.

Model cost comparison for this scenario
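| Model | Provider | Source status | Input / 1M tokens | Output / 1M tokens | Monthly cost | Cost / task |
| --- | --- | --- | --- | --- | --- | --- |
| Gemini 3.5 Flash | Google | Watchlist | $0.08 | $0.30 | $45.04 | $0.0005 |
| Llama 3.1 70B Instruct | Meta / Hosted APIs | Aggregator verified | $0.59 | $0.79 | $237.63 | $0.0024 |
| GPT-4o | OpenAI | Official stale | $2.50 | $10.00 | $1,550.15 | $0.0155 |
| Claude 3.5 Sonnet | Anthropic | Official stale | $3.00 | $15.00 | $1,967.09 | $0.0197 |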

Cache impact: 20% · selected models: 4
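One common way to model cache impact is to blend the regular and cached input prices by the cache hit rate. The sketch below assumes the 20% figure is a cache hit rate, which the page does not confirm. `blended_input_price` is a hypothetical helper, and the $1.25 cached price is an illustrative value, not a figure from this page. When no cached price is confirmed, the helper falls back to the regular price, mirroring the note that caching is excluded for such models.

```python
from typing import Optional


def blended_input_price(
    base_price: float,
    cached_price: Optional[float],
    cache_hit_rate: float,
) -> float:
    """Effective input price per 1M tokens under a given cache hit rate.

    Hypothetical helper; a sketch of one blending approach, not the page's formula.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    if cached_price is None:
        # No confirmed cached price: exclude caching from the estimate.
        return base_price
    return (1 - cache_hit_rate) * base_price + cache_hit_rate * cached_price


# $2.50 base price from the table; $1.25 cached price is an assumed value.
with_cache = blended_input_price(2.50, 1.25, 0.20)
without_cache = blended_input_price(0.08, None, 0.20)
print(with_cache, without_cache)
```

Under these assumptions, a 20% hit rate lowers the effective input price from $2.50 to about $2.25 per 1M tokens, while a model without a confirmed cached price stays at its regular rate.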

Hidden cost checklist

  • Retrieved context tokens
  • Embedding and reranking costs
  • Cache misses
  • Quality review loops

Related planning pages

Compare this scenario against the main AI model cost calculator, provider pages, and the pricing change log before locking a production budget.

RAG chatbot cost calculator FAQ

What assumptions does the RAG chatbot cost calculator use?

The default preset starts with 100,000 monthly requests, 3,800 input tokens, and 650 output tokens per request.
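As a rough illustration of how these preset values turn into a monthly figure, the sketch below computes an uncached baseline from token volume and per-1M-token prices. `monthly_token_cost` is a hypothetical helper, and the prices are the Gemini 3.5 Flash values from the comparison table. The calculator's exact formula is not stated, so this baseline is not expected to match the $45.04 shown on the page.

```python
def monthly_token_cost(
    requests: int,
    input_tokens: int,
    output_tokens: int,
    input_price_per_1m: float,
    output_price_per_1m: float,
) -> float:
    """Uncached monthly cost in dollars from token volume and per-1M prices.

    Hypothetical helper; ignores caching, embeddings, reranking, and retries.
    """
    input_cost = requests * input_tokens / 1_000_000 * input_price_per_1m
    output_cost = requests * output_tokens / 1_000_000 * output_price_per_1m
    return input_cost + output_cost


# Default preset: 100,000 requests, 3,800 input and 650 output tokens each.
cost = monthly_token_cost(100_000, 3_800, 650, 0.08, 0.30)
print(f"${cost:.2f}")  # prints $49.90
```

Here 380M input tokens at $0.08 contribute $30.40 and 65M output tokens at $0.30 contribute $19.50; any gap between this baseline and a displayed estimate reflects adjustments the page does not document.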

Why does source status matter?

AI model prices change quickly. ModelMeter keeps watchlist, stale, aggregator, and official records visibly separate so estimates do not look more certain than the sources allow.
