Exercises

Level 1 — Understand

Name the five phases of the disciplined system design process and the primary output of each phase.
What is the difference between functional and non-functional requirements? Give one example of each for a URL shortener.
What FM code describes the risk of a system that cannot answer “how many requests failed in the last five minutes” without a code deployment, and what does that code stand for?

Level 2 — Apply

A product manager gives you this requirement: “Users should be able to share photos and see photos from people they follow, with a home feed updating in near-real-time.” Write out the functional requirements and non-functional requirements as separate lists. Then estimate: if there are 10M DAU and each user views 20 photos per day, what is the read QPS? If each user posts one photo per week, what is the daily storage requirement assuming 2 MB average photo size?
For a simple URL shortener at 1M QPS reads and 1,000 QPS writes, identify: (a) the primary system archetype from F6, (b) which component will be the first bottleneck, (c) what AT code describes the decision to use eventual consistency for click-count statistics.

Level 3 — Design

You are designing a real-time leaderboard for a mobile game. 10M concurrent players, score updates every 30 seconds per player, leaderboard must show top 100 globally and top 100 among friends. Walk through all five phases of the design process. Name every tradeoff with an AT code and every failure risk with an FM code. What is the single most important architectural decision, and what is its cost?

A complete answer will: (1) correctly walk through all five design phases — requirements, estimation, high-level design, detailed design, and tradeoff analysis — with the leaderboard problem as the thread, producing a concrete write rate estimate (10M players / 30s = 333,000 score updates/sec), (2) name at least three AT codes and three FM codes that arise in the design — at minimum AT5 (centralised sorted set vs distributed shards), FM6 (hotspot on the global top-100 rank), and FM3 (resource exhaustion on the score update queue at 333K/sec), (3) identify the single most important architectural decision (pre-computed top-100 list refreshed periodically vs real-time sorted set) with its latency/accuracy cost stated, and (4) address the friend-graph leaderboard separately — how it differs from the global leaderboard in data access patterns — and propose a concrete mechanism (e.g., social graph traversal at query time vs pre-computed per-user friend leaderboard) with its AT tradeoff code.

Read in the book →

← Real-World Variants