Cache Components don't just improve UX; they can also lower operating costs by reducing work during request time.
What is Really Saved
In a granular architecture, the main savings usually come from:
- Fewer queries at request time
- Less render CPU per request
- Lower pressure on traffic spikes
- Better cache hit ratio on stable content
The goal is not "zero queries", but rather to move repetitive request-time
work into cached content controlled by cacheLife + tags.
Simple Mental Model
Think of two layers:
- Build/revalidation work: Amortized cost
- Per-request work: Variable cost (scales with traffic)
When traffic grows, variable cost dominates. That's why reducing per-request work usually has a greater impact on final cost.
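The two layers can be sketched as a toy cost model. Everything here (parameter names, unit prices) is an invented illustration, not a real pricing formula:

```typescript
// Toy cost model: total daily cost = variable (per-request) + amortized (revalidation) work.
// All parameter names and unit prices are hypothetical.
interface CostInputs {
  requestsPerDay: number
  revalidationsPerDay: number
  costPerRequestUnit: number      // DB + CPU cost of one unit of request-time work
  costPerRevalidationUnit: number // cost of regenerating cached content once
}

function dailyCost(c: CostInputs): number {
  const variable = c.requestsPerDay * c.costPerRequestUnit
  const amortized = c.revalidationsPerDay * c.costPerRevalidationUnit
  return variable + amortized
}

// 1,000,000 requests/day vs 24 revalidations/day:
// the variable term (~10) dwarfs the amortized term (~0.24)
dailyCost({
  requestsPerDay: 1_000_000,
  revalidationsPerDay: 24,
  costPerRequestUnit: 0.00001,
  costPerRevalidationUnit: 0.01,
})
```

Moving work from the variable term to the amortized term is exactly what caching with `'use cache'` aims for.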
Quick Comparison
Traditional approach:

```tsx
async function ProductPage({ id }) {
  // One large query resolves everything at request time
  const product = await db.query('SELECT * FROM products WHERE id = ?', [id])
  return <UI product={product} />
}
```

- 1 large query per request
- Full render per request
- Lower initial complexity
Granular approach with Cache Components:

```tsx
import { Suspense } from 'react'
import { cacheTag, cacheLife } from 'next/cache' // stable in Next 16; earlier canaries used unstable_ prefixes

export default function ProductPage({ params }) {
  return (
    <Suspense fallback={<PageSkeleton />}>
      {/* ProductContent composes ProductText, ProductPrice and ProductStock (definition omitted) */}
      <ProductContent params={params} />
    </Suspense>
  )
}

// Stable content: cached for weeks, invalidated by tag
async function ProductText({ id }) {
  'use cache'
  cacheTag(`product-text-${id}`)
  cacheLife('weeks')
  return db.getProductText(id)
}

// Semi-stable content: shorter-lived cache profile
async function ProductPrice({ id }) {
  'use cache'
  cacheTag(`product-price-${id}`)
  cacheLife('hours')
  return db.getProductPrice(id)
}

// Volatile content: always resolved at request time
async function ProductStock({ id }) {
  return db.getProductStock(id)
}
```

- Cached text and price
- Only stock stays fresh at request time
- More control over what invalidates and when
Useful Formulas for Estimating Savings
If you define:

- `R` = requests per day
- `Q_all` = queries/request in the traditional approach
- `Q_rt` = queries/request that remain uncached (request time)

Then:

- Traditional request-time queries = `R * Q_all`
- Granular request-time queries = `R * Q_rt`
- Request-time savings = `R * (Q_all - Q_rt)`
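The formulas above translate directly into a small helper you could drop into a script or spreadsheet export (the function name is illustrative):

```typescript
// Request-time savings per day: R * (Q_all - Q_rt)
function requestTimeSavings(
  requestsPerDay: number,     // R
  queriesTraditional: number, // Q_all
  queriesRemaining: number,   // Q_rt
): number {
  return requestsPerDay * (queriesTraditional - queriesRemaining)
}

// e.g. 1,000,000 requests/day, 4 queries before, 1 small query after:
requestTimeSavings(1_000_000, 4, 1) // 3,000,000 fewer request-time queries per day
```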
Numerical Example
- `R = 1,000,000`
- Traditional: `Q_all = 1` (one complete query)
- Granular: `Q_rt = 1` (stock only)
In this example, you don't reduce the number of queries/request, but you do:
- Reduce the size and cost of the request-time query
- Serve part of the HTML already resolved
- Improve perceived TTFB through static shell + streaming
If the traditional version ran several request-time queries and the granular version leaves only one small one, the savings grow significantly.
Business Benefit (Educational)
To explain savings in a demo, speak in terms of:
- Variable cost per request (DB + CPU)
- Amortized cost per revalidation
- Perceived latency (user sees useful content sooner)
For products with high traffic and mostly stable content, the granular pattern usually reduces variable cost and improves experience at the same time.
Where It Can Be More Expensive
Not everything should be cached granularly. Costs and complexity can worsen if:
- You fragment into too many components unnecessarily
- You design ambiguous tags or tags that are difficult to invalidate
- You revalidate excessively for very frequent events
Rule of Thumb
- Cache granularly only fields with different volatility
- Use the longest acceptable `cacheLife` profile
- Prefer invalidation by specific tags
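If none of the built-in `cacheLife` profiles fit a field's volatility, Next.js also supports custom profiles defined in your config. A sketch, assuming Next.js 16 with Cache Components enabled (the `productText` profile name and durations are invented for illustration; in earlier canary versions this configuration lived under `experimental`):

```typescript
// next.config.ts
import type { NextConfig } from 'next'

const nextConfig: NextConfig = {
  cacheComponents: true,
  cacheLife: {
    // Hypothetical profile: content may be served a day stale,
    // revalidates daily, and hard-expires after a week
    productText: {
      stale: 60 * 60 * 24,
      revalidate: 60 * 60 * 24,
      expire: 60 * 60 * 24 * 7,
    },
  },
}

export default nextConfig
```

Inside a `'use cache'` function, `cacheLife('productText')` would then select this profile instead of a built-in one.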
Implementation Checklist
- `cacheComponents: true` enabled
- Components cached with `'use cache'`
- `cacheTag` per functional unit (e.g., `product-price-${id}`)
- `revalidateTag`/`updateTag` based on required consistency
- Runtime data inside `Suspense`
- Measurement with logs and headers (`x-nextjs-cache`)
What to Measure in Production
Minimum recommended metrics:
- P95/P99 latency on critical pages
- HIT/MISS/STALE ratio per route
- Queries per request (average and percentiles)
- CPU per request in SSR
With these 4 metrics, you can justify cost savings with data, not just theory.
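The HIT/MISS/STALE ratio can be derived by counting `x-nextjs-cache` header values per route. A minimal aggregation sketch (the counter shape is an assumption about how you collect your logs):

```typescript
type CacheStatus = 'HIT' | 'MISS' | 'STALE'

// Fraction of requests for one route served from cache,
// given counts of x-nextjs-cache header values from your logs
function hitRatio(counts: Record<CacheStatus, number>): number {
  const total = counts.HIT + counts.MISS + counts.STALE
  return total === 0 ? 0 : counts.HIT / total
}

hitRatio({ HIT: 90, MISS: 8, STALE: 2 }) // 0.9
```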