AT4 Precompute vs On-Demand: The Cache Question

A team is building a leaderboard for a multiplayer game. Two designs are on the table.

Design A: Every time a player's score changes, rebuild the leaderboard and store it. Reads are O(1) — just return the stored top-100. Writes are expensive — every score update triggers a re-sort across millions of players.

Design B: Store individual player scores. When the leaderboard is requested, compute the top-100 on demand. Writes are cheap — just update one row. Reads are expensive — every leaderboard request sorts millions of rows.
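The two designs can be sketched side by side. This is a minimal in-memory illustration, not a production leaderboard: the class names, the dict-backed score store, and the `top_n` parameter are all assumptions made for the example.

```python
import heapq


class PrecomputedLeaderboard:
    """Design A: pay at write time; reads return a stored list."""

    def __init__(self, top_n=100):
        self.top_n = top_n
        self.scores = {}  # player_id -> score
        self.top = []     # stored leaderboard, rebuilt on every write

    def update_score(self, player_id, score):
        self.scores[player_id] = score
        # Expensive write: re-sort every player on every update.
        ranked = sorted(self.scores.items(), key=lambda kv: kv[1], reverse=True)
        self.top = ranked[: self.top_n]

    def leaderboard(self):
        # Cheap read: return the stored result.
        return self.top


class OnDemandLeaderboard:
    """Design B: pay at read time; writes touch one entry."""

    def __init__(self, top_n=100):
        self.top_n = top_n
        self.scores = {}  # player_id -> score

    def update_score(self, player_id, score):
        # Cheap write: update one entry.
        self.scores[player_id] = score

    def leaderboard(self):
        # Expensive read: scan every player to select the top N.
        return heapq.nlargest(self.top_n, self.scores.items(), key=lambda kv: kv[1])
```

Both classes return the same leaderboard for the same sequence of updates. The only difference is which method carries the cost of ranking.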

Neither is "correct." Each is a point on a dial. The dial is AT4 — Precomputation vs On-Demand, and it is the question behind every cache, every search index, every materialised view, and every news feed in production.

The Dial

Pay the cost of computation at write time (precomputation) or at read time (on-demand).

Precomputation makes reads fast at the price of more expensive writes and more storage.
On-demand makes writes cheap at the cost of slower reads.

That is the whole tradeoff. The art is in deciding which side to land on for your workload.

Where AT4 Shows Up

AT4 is one of the most pervasive tradeoffs in distributed systems because it shows up at every layer:

News feed fan-out. A feed that precomputes each user's feed on every post (fan-out-on-write) gives fast reads, expensive writes. A feed that assembles the feed from followed accounts at read time (fan-out-on-read) gives cheap writes, slow reads. The hybrid model — fan-out-on-write for regular users, fan-out-on-read for celebrities — is AT4 applied per-user based on the cost calculus.
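A rough sketch of the hybrid model, assuming an in-memory store and a follower-count cut-off for deciding who counts as a celebrity. `HybridFeed`, `CELEBRITY_THRESHOLD`, and the tuple layout are hypothetical names chosen for illustration; a real system would pick the cut-off from measured fan-out cost.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 10_000  # hypothetical cut-off; tune per workload


class HybridFeed:
    def __init__(self, celebrity_threshold=CELEBRITY_THRESHOLD):
        self.celebrity_threshold = celebrity_threshold
        self.followers = defaultdict(set)  # author -> users following them
        self.following = defaultdict(set)  # user -> authors they follow
        self.posts = defaultdict(list)     # author -> [(ts, text)]
        self.inbox = defaultdict(list)     # user -> precomputed [(ts, author, text)]

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= self.celebrity_threshold

    def post(self, author, ts, text):
        self.posts[author].append((ts, text))
        if not self.is_celebrity(author):
            # Fan-out-on-write: pay now, once per follower.
            for user in self.followers[author]:
                self.inbox[user].append((ts, author, text))

    def read_feed(self, user, limit=50):
        # Start from the precomputed inbox. A set removes duplicates when an
        # author crossed the threshold after earlier posts were fanned out.
        items = set(self.inbox[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                # Fan-out-on-read: pay now, once per followed celebrity.
                items.update((ts, author, text) for ts, text in self.posts[author])
        return sorted(items, reverse=True)[:limit]
```

The write path for a celebrity stays constant regardless of follower count, and the read path only pays extra for the celebrities a given user actually follows.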

Search infrastructure. A precomputed search index (Elasticsearch, Lucene) gives fast searches at the cost of indexing latency and storage. A full-table scan gives zero indexing overhead and slow searches. Every search system is a point on this dial. The decision of which fields to index is AT4 applied at field granularity.

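Database materialised views. A materialised view stores the result of a query, so reads against it skip the query's joins and aggregations and cost only a scan or index lookup on the stored rows. The price is paid when the underlying tables change: the view must be refreshed or it goes stale. PostgreSQL's MATERIALIZED VIEW is the explicit version, and it is refreshed on request with REFRESH MATERIALIZED VIEW rather than on every write. A caching layer rebuilt by Kafka consumers is the same pattern at infrastructure scale.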

Caching. Every cache is precomputation. The cache stores the result of a computation that was originally on-demand. Reads from the cache are fast; the original on-demand cost is paid only on cache miss. The cache TTL is a knob that mediates between freshness (more on-demand) and read latency (more precomputed).
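The TTL knob can be made concrete with a small read-through cache. This is a sketch under stated assumptions: `TTLCache`, the `compute` callable, and the injectable `clock` are illustrative names, and the cache is single-process with no eviction beyond expiry.

```python
import time


class TTLCache:
    def __init__(self, compute, ttl_seconds, clock=time.monotonic):
        self.compute = compute  # the original on-demand computation
        self.ttl = ttl_seconds  # shorter = fresher, longer = fewer recomputations
        self.clock = clock      # injectable so tests can control time
        self.entries = {}       # key -> (value, expires_at)

    def get(self, key):
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]  # hit: serve the precomputed value
        # Miss or expired: pay the on-demand cost and store the result.
        value = self.compute(key)
        self.entries[key] = (value, now + self.ttl)
        return value
```

With `ttl_seconds` near zero the cache behaves like pure on-demand computation. As the TTL grows, reads get cheaper and the value served can be up to one TTL out of date.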

Image/video transcoding. There are three options: precompute all variants at upload (fast streaming, large storage); transcode on first request (slow first request, small storage); or land somewhere in between (a small set of common variants precomputed, rare variants generated on demand).
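The in-between option might look like the following sketch. `VariantStore`, `COMMON_VARIANTS`, and the `transcode` callable are hypothetical; the variant names are placeholders, and a real pipeline would run transcoding asynchronously rather than inline.

```python
COMMON_VARIANTS = ("480p", "720p")  # hypothetical set of variants worth precomputing


class VariantStore:
    def __init__(self, transcode, precomputed=COMMON_VARIANTS):
        self.transcode = transcode           # callable(video_id, variant) -> encoded data
        self.precomputed = tuple(precomputed)
        self.stored = {}                     # (video_id, variant) -> encoded data

    def upload(self, video_id):
        # Write-time cost: transcode only the common variants.
        for variant in self.precomputed:
            self.stored[(video_id, variant)] = self.transcode(video_id, variant)

    def get(self, video_id, variant):
        key = (video_id, variant)
        if key not in self.stored:
            # Read-time cost: a rare variant is slow on first request, then stored.
            self.stored[key] = self.transcode(video_id, variant)
        return self.stored[key]
```

The size of `precomputed` is the dial setting: an empty tuple is pure on-demand, and listing every variant is full precomputation.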

The pattern repeats at every layer because the underlying constraint — that computation costs money and must be paid either at write or at read — is universal.

Setting the Dial

The decision is governed by the read-to-write ratio for the specific data.

Read-heavy systems with expensive computation → toward precomputation. The cost is paid once per write and amortised across many reads. A feed that is loaded 100 times per day but updated 5 times per day is a clear candidate. The amortisation math wins.

Write-heavy systems with infrequent reads → toward on-demand. Paying the precomputation cost for data that is rarely read wastes resources. An audit log that is written constantly but queried only during incidents is a clear candidate for on-demand.

The boundary case is where read-to-write is close to 1:1. Here precomputation rarely pays off. Compute on demand, cache the result with a short TTL, and accept that the cache is a tactical optimisation rather than a structural decision.

A useful heuristic: if reads outnumber writes by 10× or more, precomputation almost certainly pays. If reads outnumber writes by less than 2×, on-demand is the default. The 2–10× zone is where measurement matters.
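The heuristic is simple enough to write down as a function. The sketch below uses the 10× and 2× thresholds from the text; the function names, the return labels, the handling of zero writes, and the assumption that one rebuild costs the same as one on-demand computation are simplifications made for illustration.

```python
def suggest_strategy(reads, writes):
    """Apply the read-to-write heuristic to counts taken over the same period."""
    if writes == 0:
        # Assumption: data that never changes is worth computing once if it is read at all.
        return "precompute" if reads > 0 else "on-demand"
    ratio = reads / writes
    if ratio >= 10:
        return "precompute"
    if ratio < 2:
        return "on-demand"
    return "measure"  # the 2-10x zone


def compute_units(reads, writes):
    """Units of computation per period, assuming one unit per rebuild
    and one unit per on-demand read (a deliberate simplification)."""
    return {"precompute": writes, "on-demand": reads}
```

For the feed loaded 100 times and updated 5 times per day, `suggest_strategy(100, 5)` lands on precomputation, and `compute_units(100, 5)` shows why: 5 units of work instead of 100. Real rebuild and read costs are rarely equal, which is why the middle zone calls for measurement.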

The Storage Consequence

Precomputation always increases storage. This is the part teams forget.

The precomputed search index takes space on top of the raw documents. The precomputed feed takes space on top of the posts themselves. The materialised view takes space on top of the source tables. The cache takes space on top of the database.

For a system at small scale, this is negligible. For a system at large scale, it can dominate the operating cost. A precomputed feed for 100M users at 1KB per feed is 100GB. A precomputed feed for 1B users at 1KB per feed is 1TB, and it has to live in memory (for example, in Redis) if reads are to stay at in-memory latency. The infrastructure cost of that storage is not optional once you commit to precomputation.
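The arithmetic behind those figures is worth keeping as a small function so it can be rerun with real numbers. The sketch assumes decimal units (1KB = 1,000 bytes); `feed_storage_bytes` and the `replicas` parameter are illustrative additions, not figures from the text.

```python
KB = 10**3
GB = 10**9
TB = 10**12


def feed_storage_bytes(users, bytes_per_feed=1 * KB, replicas=1):
    """Raw bytes needed to hold one precomputed feed per user.

    `replicas` is an assumption added for illustration: in-memory stores
    are usually replicated, which multiplies the bill.
    """
    return users * bytes_per_feed * replicas


small = feed_storage_bytes(100_000_000)    # 100M users
large = feed_storage_bytes(1_000_000_000)  # 1B users
print(small / GB, "GB")
print(large / TB, "TB")
```

The calculation covers payload only. Per-key overhead in the store and any replication push the real figure higher.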

Storage cost is part of the AT4 tradeoff. Cost the storage before deciding.

Where AT4 Fails

Staleness: the precomputation failure mode. Precomputed data is a snapshot. Between the moment of precomputation and the next refresh, the data is stale. If the underlying state changes during that window, the precomputed view diverges from reality. For a leaderboard that updates every 30 seconds, the staleness window is bounded and tolerable. For an inventory count that drives purchase decisions, even one second of staleness can cause orders to be accepted for items that no longer exist. The staleness budget must be designed in, not hoped for.

Cold cache: the on-demand failure mode under load. A system designed for on-demand reads with a "caching layer for hot keys" assumes the cache absorbs most queries. When the cache is cold (after a restart, after eviction, after a deploy that flushed it), every request misses and hits the slow on-demand path at once. This is FM7 (Thundering Herd): the cold cache produces the very failure the cache was meant to prevent.

Cache invalidation: the cross-cutting failure mode. When the underlying data changes, precomputed views must be invalidated or refreshed. Invalidation is famously one of the two hard problems in computer science. Three patterns are common, each with a different failure mode:

TTL-based: simple, but stale data lives until the TTL expires.
Event-driven: fast, but every write must publish to every interested cache.
Write-through: consistent, but every write blocks on the cache update.
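The three invalidation patterns can be compared in a single-process sketch. Everything here is an assumption made for illustration: `Source` stands in for the database of record, the subscriber list stands in for a message bus, and the class names are hypothetical.

```python
import time


class Source:
    """Stand-in for the system of record."""

    def __init__(self):
        self.rows = {}
        self.subscribers = []  # callbacks used for event-driven invalidation

    def write(self, key, value):
        self.rows[key] = value
        for notify in self.subscribers:
            notify(key)  # every write publishes to every interested cache


class TTLView:
    """TTL-based: simple, but serves stale data until the entry expires."""

    def __init__(self, source, ttl, clock=time.monotonic):
        self.source, self.ttl, self.clock = source, ttl, clock
        self.cache = {}  # key -> (value, expires_at)

    def read(self, key):
        hit = self.cache.get(key)
        if hit is not None and self.clock() < hit[1]:
            return hit[0]
        value = self.source.rows[key]
        self.cache[key] = (value, self.clock() + self.ttl)
        return value


class EventDrivenView:
    """Event-driven: entries are dropped as soon as the source announces a change."""

    def __init__(self, source):
        self.source = source
        self.cache = {}
        source.subscribers.append(self.invalidate)

    def invalidate(self, key):
        self.cache.pop(key, None)

    def read(self, key):
        if key not in self.cache:
            self.cache[key] = self.source.rows[key]
        return self.cache[key]


class WriteThroughView:
    """Write-through: the write path updates the cache before returning."""

    def __init__(self, source):
        self.source = source
        self.cache = {}

    def write(self, key, value):
        self.source.write(key, value)
        self.cache[key] = value  # the caller waits for this step too

    def read(self, key):
        return self.cache[key] if key in self.cache else self.source.rows[key]
```

After a write, the TTL view keeps returning the old value until expiry, the event-driven view reloads on its next read, and the write-through view is current immediately, provided every write goes through it.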

The Question Before Adding a Cache

When someone proposes "let's cache this" — which they will, often — the AT4-aware response is to ask:

What is the read-to-write ratio? (If reads do not dominate writes by at least 5×, the cache is probably not worth the invalidation complexity.)
What is the staleness budget? (If "must always be current," precomputation is wrong — find a different optimisation.)
Who invalidates the cache when the source changes? (If the answer is "we'll figure it out later," see FM4.)
What is the cold-start behaviour? (If the cache is on the critical path and the system cannot serve cold, that is a deployment-time outage waiting to happen.)
Have you costed the storage? (Most teams under-cost cache storage by an order of magnitude.)

Caching is not a feature you add. It is a structural choice on the AT4 dial. The cost is paid in writes, storage, invalidation complexity, and staleness. Pay it when the math works, not because dashboards are slow.

The Related Tradeoff Most Teams Conflate

AT4 is sometimes conflated with AT2 (Latency vs Throughput). They are related but not the same. Precomputation improves latency for the read but reduces write throughput. On-demand improves write throughput but increases read latency. They are different axes of the same cost equation.

When you face a system that is both read-slow and write-throughput-constrained, AT4 does not save you. You have to do harder architectural work — sharding (Chapter 4), replication (Chapter 8), event-driven materialisation (Chapter 14). AT4 is the lens for the simpler decision: "where do we pay for this computation?"

The AT4 dial is defined in The Engineer's Map, Framework 4 (Book 0, Chapter 8). This article extracts it in the context where it bites hardest: Book 4, Chapter 4, Distributed Cache. That chapter walks through cache placement, write-through vs write-behind, the eviction policy decision, the cold-start problem, and the cache-invalidation patterns that determine whether your AT4 choice survives production.
