Cache Components don't just improve UX; they can also lower operating costs by reducing work during request time.
What is Really Saved
In a granular architecture, the main savings usually come from:
- Fewer queries at request time
- Less render CPU per request
- Lower pressure on traffic spikes
- Better cache hit ratio on stable content
The goal is not "zero queries", but rather to move repetitive request-time
work into cached content controlled by cacheLife + tags.
Simple Mental Model
Think of two layers:
- Build/revalidation work: Amortized cost
- Per-request work: Variable cost (scales with traffic)
When traffic grows, variable cost dominates. That's why reducing per-request work usually has a greater impact on final cost.
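The two layers can be sketched as a toy cost model. Everything here (parameter names, unit prices) is an invented illustration, not a real pricing formula:

```typescript
// Toy cost model: total daily cost = variable (per-request) + amortized (revalidation) work.
// All parameter names and unit prices are hypothetical.
interface CostInputs {
  requestsPerDay: number
  revalidationsPerDay: number
  costPerRequestUnit: number      // DB + CPU cost of one unit of request-time work
  costPerRevalidationUnit: number // cost of regenerating cached content once
}

function dailyCost(c: CostInputs): number {
  const variable = c.requestsPerDay * c.costPerRequestUnit
  const amortized = c.revalidationsPerDay * c.costPerRevalidationUnit
  return variable + amortized
}

// 1,000,000 requests/day vs 24 revalidations/day:
// the variable term (~10) dwarfs the amortized term (~0.24)
dailyCost({
  requestsPerDay: 1_000_000,
  revalidationsPerDay: 24,
  costPerRequestUnit: 0.00001,
  costPerRevalidationUnit: 0.01,
})
```

Moving work from the variable term to the amortized term is exactly what caching with `'use cache'` aims for.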
Quick Comparison
Traditional approach:

```tsx
async function ProductPage({ id }) {
  // One large query resolves everything at request time
  const product = await db.query('SELECT * FROM products WHERE id = ?', [id])
  return <UI product={product} />
}
```

- 1 large query per request
- Full render per request
- Lower initial complexity
Granular approach with Cache Components:

```tsx
import { Suspense } from 'react'
import { cacheTag, cacheLife } from 'next/cache' // stable in Next 16; earlier canaries used unstable_ prefixes

export default function ProductPage({ params }) {
  return (
    <Suspense fallback={<PageSkeleton />}>
      {/* ProductContent composes ProductText, ProductPrice and ProductStock (definition omitted) */}
      <ProductContent params={params} />
    </Suspense>
  )
}

// Stable content: cached for weeks, invalidated by tag
async function ProductText({ id }) {
  'use cache'
  cacheTag(`product-text-${id}`)
  cacheLife('weeks')
  return db.getProductText(id)
}

// Semi-stable content: shorter-lived cache profile
async function ProductPrice({ id }) {
  'use cache'
  cacheTag(`product-price-${id}`)
  cacheLife('hours')
  return db.getProductPrice(id)
}

// Volatile content: always resolved at request time
async function ProductStock({ id }) {
  return db.getProductStock(id)
}
```

- Cached text and price
- Only stock stays fresh at request time
- More control over what invalidates and when
Useful Formulas for Estimating Savings
If you define:

- `R` = requests per day
- `Q_all` = queries/request in the traditional approach
- `Q_rt` = queries/request that remain uncached (request time)

Then:

- Traditional request-time queries = `R * Q_all`
- Granular request-time queries = `R * Q_rt`
- Request-time savings = `R * (Q_all - Q_rt)`
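The formulas above translate directly into a small helper you could drop into a script or spreadsheet export (the function name is illustrative):

```typescript
// Request-time savings per day: R * (Q_all - Q_rt)
function requestTimeSavings(
  requestsPerDay: number,     // R
  queriesTraditional: number, // Q_all
  queriesRemaining: number,   // Q_rt
): number {
  return requestsPerDay * (queriesTraditional - queriesRemaining)
}

// e.g. 1,000,000 requests/day, 4 queries before, 1 small query after:
requestTimeSavings(1_000_000, 4, 1) // 3,000,000 fewer request-time queries per day
```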
Numerical Example
- `R = 1,000,000`
- Traditional: `Q_all = 1` (one complete query)
- Granular: `Q_rt = 1` (stock only)
In this example, you don't reduce the number of queries/request, but you do:
- Reduce the size and cost of the request-time query
- Serve part of the HTML already resolved
- Improve perceived TTFB through static shell + streaming
If the traditional version ran several request-time queries and the granular version leaves only one small one, the savings grow significantly.
Business Benefit (Educational)
To explain savings in a demo, speak in terms of:
- Variable cost per request (DB + CPU)
- Amortized cost per revalidation
- Perceived latency (user sees useful content sooner)
For products with high traffic and mostly stable content, the granular pattern usually reduces variable cost and improves experience at the same time.
Where It Can Be More Expensive
Not everything should be cached granularly. Costs and complexity can worsen if:
- You fragment into too many components unnecessarily
- You design ambiguous tags or tags that are difficult to invalidate
- You revalidate excessively for very frequent events
Rule of Thumb
- Cache granularly only fields with different volatility
- Use the longest acceptable `cacheLife` profile
- Prefer invalidation by specific tags
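If none of the built-in `cacheLife` profiles fit a field's volatility, Next.js also supports custom profiles defined in your config. A sketch, assuming Next.js 16 with Cache Components enabled (the `productText` profile name and durations are invented for illustration; in earlier canary versions this configuration lived under `experimental`):

```typescript
// next.config.ts
import type { NextConfig } from 'next'

const nextConfig: NextConfig = {
  cacheComponents: true,
  cacheLife: {
    // Hypothetical profile: content may be served a day stale,
    // revalidates daily, and hard-expires after a week
    productText: {
      stale: 60 * 60 * 24,
      revalidate: 60 * 60 * 24,
      expire: 60 * 60 * 24 * 7,
    },
  },
}

export default nextConfig
```

Inside a `'use cache'` function, `cacheLife('productText')` would then select this profile instead of a built-in one.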
Implementation Checklist
- `cacheComponents: true` enabled
- Components cached with `'use cache'`
- `cacheTag` per functional unit (e.g., `product-price-${id}`)
- `revalidateTag`/`updateTag` based on required consistency
- Runtime data inside `Suspense`
- Measurement with logs and headers (`x-nextjs-cache`)
What to Measure in Production
Minimum recommended metrics:
- P95/P99 latency on critical pages
- HIT/MISS/STALE ratio per route
- Queries per request (average and percentiles)
- CPU per request in SSR
With these 4 metrics, you can justify cost savings with data, not just theory.
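The HIT/MISS/STALE ratio can be derived by counting `x-nextjs-cache` header values per route. A minimal aggregation sketch (the counter shape is an assumption about how you collect your logs):

```typescript
type CacheStatus = 'HIT' | 'MISS' | 'STALE'

// Fraction of requests for one route served from cache,
// given counts of x-nextjs-cache header values from your logs
function hitRatio(counts: Record<CacheStatus, number>): number {
  const total = counts.HIT + counts.MISS + counts.STALE
  return total === 0 ? 0 : counts.HIT / total
}

hitRatio({ HIT: 90, MISS: 8, STALE: 2 }) // 0.9
```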