Calculator

RAG Inference Cost Calculator

Estimate monthly inference cost for a RAG system in production.

How we calibrated this

We use this model to estimate a client's RAG total cost of ownership (TCO) before architecture decisions are made.

Inputs

Tell us about your project.

This is a static reference card. For interactive calculators, talk to us; we tune the assumptions per client.

Queries per month

Range: 1,000–5,000,000 queries · Default: 50,000 queries

Avg input tokens per query

Range: 500–20,000 tokens · Default: 4,000 tokens

Avg output tokens per query

Range: 100–4,000 tokens · Default: 600 tokens

Model

How it's calculated

Tokens × per-token model price + retrieval costs
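As a rough illustration, the formula above can be sketched in Python. This is a minimal sketch under stated assumptions, not the calculator's actual implementation: the ModelPrice class, the split into separate input and output prices, the flat per-query retrieval cost, and every price figure are hypothetical placeholders, not real vendor quotes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical prices in USD per 1M tokens (placeholders, not real quotes)."""
    input_per_mtok: float
    output_per_mtok: float


def monthly_cost(
    queries: int,
    input_tokens: int,
    output_tokens: int,
    price: ModelPrice,
    retrieval_cost_per_query: float = 0.0,
) -> float:
    """Tokens x per-token model price + retrieval costs, for one month."""
    # API cost: tokens per query times the per-token price, summed over all queries.
    per_query_api = (
        input_tokens * price.input_per_mtok
        + output_tokens * price.output_per_mtok
    ) / 1_000_000
    api_cost = queries * per_query_api
    # Retrieval cost: assumed here to be a flat amount per query.
    retrieval_cost = queries * retrieval_cost_per_query
    return api_cost + retrieval_cost


# The card's default inputs, combined with made-up prices.
example_price = ModelPrice(input_per_mtok=1.0, output_per_mtok=4.0)
total = monthly_cost(
    queries=50_000,
    input_tokens=4_000,
    output_tokens=600,
    price=example_price,
    retrieval_cost_per_query=0.001,
)
print(f"Estimated monthly cost: ${total:.2f}")
```

With these placeholder prices the defaults come to $370.00 per month. Swapping in a real model's published rates and a measured retrieval cost changes the result but not the structure.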

Output

API + retrieval cost per month.

Output

Effective unit economics.

Output

12-month projection.
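The second and third outputs can be derived from a monthly cost figure. The sketch below is illustrative only: the function names, the $370 starting cost, and the constant month-over-month growth rate are assumptions, and it further assumes that cost scales linearly with query volume.

```python
def cost_per_query(monthly_cost_usd: float, queries_per_month: int) -> float:
    """Effective unit cost: total monthly spend divided by query volume."""
    if queries_per_month <= 0:
        raise ValueError("queries_per_month must be positive")
    return monthly_cost_usd / queries_per_month


def project_costs(
    first_month_cost_usd: float,
    monthly_growth: float = 0.0,
    months: int = 12,
) -> list[float]:
    """Project monthly cost forward, assuming volume (and hence cost)
    grows by a constant fraction each month. The growth rate is an
    assumption supplied by the caller, not something the card specifies."""
    return [
        first_month_cost_usd * (1.0 + monthly_growth) ** month
        for month in range(months)
    ]


# Hypothetical figures: $370 per month at 50,000 queries, 10% monthly growth.
unit = cost_per_query(370.0, 50_000)
projection = project_costs(370.0, monthly_growth=0.10)

print(f"Cost per query: ${unit:.4f}")
print(f"Cost per 1,000 queries: ${unit * 1_000:.2f}")
print(f"12-month total: ${sum(projection):.2f}")
```

Setting monthly_growth to 0.0 gives a flat projection, which is a useful baseline to compare growth scenarios against.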

Want a real estimate?

For an estimate calibrated to your specific project, brief us. We respond within two business days.
