The Computing Series

Exercises

Level 1 — Understand

  1. State the one structural difference between a DP memo table and an infrastructure cache. What engineering problem does this difference create?

  2. What is the thundering herd problem? At what moment does it occur? Name two techniques that prevent it.

  3. Why does LRU eviction work well for social media profiles but poorly for batch data processing? What access pattern assumption does LRU rely on?

  4. A memoized API response is stale. The user sees yesterday’s data. Name the FM code and the AT code that governs the tradeoff. (Hint: FM1 = Single Point of Failure; AT1 = Consistency vs. Availability.)
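As a concrete reference for exercise 3, the recency assumption behind LRU can be sketched in a few lines. This is a minimal in-process sketch, not a production design; the class name and the `OrderedDict` representation are illustrative choices.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: evicts the least-recently-used entry when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # iteration order tracks recency

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
```

A hot profile that keeps being re-read keeps moving to the recently-used end and survives. A batch job that scans more distinct keys than the capacity evicts every entry before any is reused, which is exactly the access pattern on which LRU’s recency assumption fails.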

Level 2 — Apply

A web application serves user timelines. Each timeline is computed by querying the 50 most recent posts from all the accounts the user follows. The computation takes 200ms. The timeline is cached with a 5-minute TTL.
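The 5-minute-TTL pattern in this scenario can be sketched as a small decorator. This is an illustrative sketch: `cached_with_ttl` is not a library function, and a real deployment would use a shared cache such as Redis rather than an in-process dict.

```python
import time

def cached_with_ttl(ttl_seconds):
    """Decorator: memoize a function's result, expiring entries after ttl_seconds."""
    def decorator(fn):
        store = {}  # args -> (value, expiry timestamp)
        def wrapper(*args):
            now = time.monotonic()
            hit = store.get(args)
            if hit is not None and hit[1] > now:
                return hit[0]               # fresh hit: skip the 200ms computation
            value = fn(*args)               # miss or expired: recompute
            store[args] = (value, now + ttl_seconds)
            return value
        return wrapper
    return decorator
```

Note what the TTL buys and costs: within the window every request is served from the cache, but a new post is invisible until the entry expires, which is the staleness the questions below probe.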

  1. User Alice has 10,000 followers. She posts a new update. In the worst case, how long until her followers’ cached timelines reflect her post?

  2. Alice’s post goes viral. 50,000 requests for her followers’ timelines arrive within 30 seconds of her post. The cache expires at this moment. Describe the thundering herd event that occurs.

  3. Propose three modifications to the caching strategy that address the thundering herd while maintaining < 10-second staleness for new posts.
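One standard herd-prevention technique relevant to questions 2 and 3 is request coalescing: concurrent misses for the same key wait on a single recomputation. A hedged single-process sketch follows (names are illustrative, and a distributed deployment would need a distributed lock or a coalescing layer rather than `threading.Lock`):

```python
import threading

class SingleFlightCache:
    """Cache in which concurrent misses on a key trigger only one recomputation."""

    def __init__(self, compute):
        self.compute = compute
        self.values = {}
        self.locks = {}                    # one lock per key
        self.meta_lock = threading.Lock()  # guards the locks dict

    def get(self, key):
        value = self.values.get(key)
        if value is not None:
            return value                   # fast path: cache hit
        with self.meta_lock:
            lock = self.locks.setdefault(key, threading.Lock())
        with lock:
            value = self.values.get(key)   # re-check: another thread may have filled it
            if value is None:
                value = self.compute(key)  # exactly one caller computes
                self.values[key] = value
            return value
```

In the viral-post scenario, the first of the 50,000 requests performs the 200ms recomputation while the rest wait and then read the fresh entry, so the database sees one query instead of tens of thousands.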

Level 3 — Design

A recommendation service generates personalised product recommendations for 50 million users. Each recommendation list is computed by a machine learning model that takes 2 seconds per user. Users browse the platform 3 times per day on average; their interests change slowly (days to weeks).
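Before designing, it helps to put numbers on the scenario. The following is plain arithmetic from the figures given above:

```python
# Back-of-envelope sizing from the scenario's own figures.
USERS = 50_000_000
SECONDS_PER_USER = 2    # ML model time per recommendation list
VISITS_PER_DAY = 3      # average browsing sessions per user

full_recompute_cpu_seconds = USERS * SECONDS_PER_USER   # 100,000,000 CPU-seconds
full_recompute_cpu_days = full_recompute_cpu_seconds / 86_400
requests_per_day = USERS * VISITS_PER_DAY               # 150,000,000

print(f"Full recompute: {full_recompute_cpu_days:,.0f} CPU-days of model time")
print(f"Requests/day:   {requests_per_day:,}")
```

Recomputing every list costs on the order of 1,157 CPU-days of model time, while the platform serves 150 million timeline views per day, which is why computing on every request is off the table and the caching strategy carries the design.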

  1. Design the caching strategy: what to cache, what TTL to use, what invalidation strategy. Justify each choice.

  2. A user makes a purchase. Their recommendations should update within 60 seconds to reflect the purchase. How does this change your invalidation strategy?

  3. The ML model is retrained weekly with new data. On retraining, all 50 million cached recommendations become stale simultaneously. How would you handle the cache warm-up after retraining without causing a thundering herd? Name every tradeoff using AT notation.

A complete answer will:

  1. Design the base caching strategy with a TTL justified by the “interests change slowly” property and an invalidation mechanism triggered by purchases.

  2. Identify FM7 (thundering herd) as the failure mode when 50 million cache entries expire simultaneously after model retraining.

  3. Address the AT4 tradeoff between pre-warming the cache proactively (avoids the herd, costs compute) and lazy population on request (low upfront cost, risks the herd).

  4. Propose a concrete warm-up strategy — such as staggered pre-computation by user segment with a jitter window — that keeps database load below capacity during the transition.
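The staggered warm-up mentioned in point (4) can be sketched as a jittered schedule. This is a sketch under the assumption that warm-up jobs are dispatched centrally; `warmup_schedule` is an illustrative name, not a standard API.

```python
import random

def warmup_schedule(user_ids, window_seconds, seed=None):
    """Assign each user a uniformly random offset inside a jitter window,
    so pre-computation load is spread flat instead of spiking at t = 0."""
    rng = random.Random(seed)  # seeded for reproducible scheduling
    return sorted((rng.uniform(0.0, window_seconds), uid) for uid in user_ids)
```

Spreading 50 million recomputations over, say, a six-hour window caps the rate at roughly 2,300 jobs per second rather than an instantaneous spike; running separate windows per user segment lets the AT4 tradeoff (upfront compute versus herd risk) be tuned segment by segment.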
