The Computing Series

Exercises

Level 1 — Understand

  1. What formula is used to convert events per day into QPS, and what standard approximation for seconds-per-day enables back-of-envelope estimation?
  2. Name the three read/write ratio classifications (read-heavy, write-heavy, balanced) and identify one architectural optimisation appropriate to each.
  3. What is the difference between strong consistency and eventual consistency? Give one example of a system requirement that demands strong consistency and one that can accept eventual consistency.
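The conversion asked about in question 1 is small enough to sketch. A minimal version, assuming the usual convention of rounding 86,400 seconds/day to ~100,000 for mental math:

```python
SECONDS_PER_DAY = 86_400  # often rounded to ~100,000 for back-of-envelope work


def events_per_day_to_qps(events_per_day: float, rounded: bool = False) -> float:
    """Convert a daily event count to queries per second.

    With rounded=True, uses the ~100,000 s/day approximation that makes
    the division easy to do in your head.
    """
    divisor = 100_000 if rounded else SECONDS_PER_DAY
    return events_per_day / divisor


# 500M events/day: ~5,787 QPS exactly, ~5,000 QPS by the rounded rule
print(round(events_per_day_to_qps(500e6)))        # 5787
print(round(events_per_day_to_qps(500e6, True)))  # 5000
```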

Level 2 — Apply

  1. A social photo-sharing app has 20M DAU. Each user views 50 photos per day and posts 1 photo per week. Average photo size is 3 MB. Calculate: (a) read QPS for photo serving, (b) write QPS for photo uploads, (c) daily storage added, (d) 3-year total storage. State whether the read/write ratio suggests a read-heavy or write-heavy architecture.

  2. A bank’s transaction service requires that account balances are always correct — a user must never see a balance that does not reflect all committed transactions. Classify this as strong or eventual consistency. What AT code describes the tradeoff the database must make to provide this guarantee? What FM code describes the failure if this guarantee is violated?
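One way to check your estimates for question 1 above. All figures come from the problem statement; the sketch assumes a 7-day week for the posting rate and decimal units (1 MB = 10^6 bytes), and leaves the final classification to the reader:

```python
DAU = 20_000_000
SECONDS_PER_DAY = 86_400

read_qps = DAU * 50 / SECONDS_PER_DAY          # (a) 50 photo views/user/day
write_qps = DAU * (1 / 7) / SECONDS_PER_DAY    # (b) 1 upload/user/week
daily_storage_tb = DAU * (1 / 7) * 3e6 / 1e12  # (c) 3 MB per photo
three_year_tb = daily_storage_tb * 3 * 365     # (d)

print(f"read QPS   ≈ {read_qps:,.0f}")                 # ≈ 11,574
print(f"write QPS  ≈ {write_qps:,.0f}")                # ≈ 33
print(f"daily      ≈ {daily_storage_tb:.1f} TB")       # ≈ 8.6 TB
print(f"3-year     ≈ {three_year_tb:,.0f} TB")         # ≈ 9,386 TB
print(f"read:write ≈ {read_qps / write_qps:.0f}:1")    # ≈ 350:1
```

A ratio of roughly 350 reads per write is the kind of asymmetry the classification in (e) should turn on.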

Level 3 — Design

  1. You are designing a real-time analytics dashboard for an e-commerce platform. Requirements: 500M events per day ingested (clicks, page views, purchases), dashboard queries must return in under 2 seconds, data must be accurate within 5 minutes of the event, retention is 2 years. Perform a full back-of-envelope estimation. Characterise the read/write profile. State the consistency requirement with AT code. Identify the two most significant architectural constraints that the numbers impose, and state one design decision that each constraint forces.

A complete answer will:

  1. Produce a correct back-of-envelope estimate — 500M events/day ≈ 5,800 writes/sec on average, 2-year retention at ~1 KB/event ≈ 365 TB — and classify the workload as write-heavy ingest with read-heavy query (high read/write asymmetry).
  2. State AT1 (Consistency/Availability) as the governing tradeoff and justify accepting eventual consistency (the 5-minute accuracy window) over strong consistency to avoid blocking ingest at 5,800 writes/sec.
  3. Name at least two FM codes that arise — FM3 (resource exhaustion on the ingest pipeline at a sustained 5,800 writes/sec with traffic spikes) and FM6 (hotspot on time-series data, where the current hour’s partition absorbs the bursty event load).
  4. Map each constraint to a concrete design decision: the write-rate constraint forces a streaming ingest layer (e.g., Kafka) decoupled from the query store, and the 365 TB retention constraint forces tiered storage (a hot store for recent data, a cold object store for archival) with a stated partition or compaction strategy.
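The rubric’s headline numbers can be reproduced in a few lines. Note that 365 TB is the figure you get at ~1 KB/event; a ~500 bytes/event assumption would instead give roughly 180 TB:

```python
EVENTS_PER_DAY = 500_000_000
SECONDS_PER_DAY = 86_400
BYTES_PER_EVENT = 1_000        # ~1 KB/event assumption
RETENTION_DAYS = 2 * 365       # 2-year retention

avg_write_qps = EVENTS_PER_DAY / SECONDS_PER_DAY                  # average, not peak
total_tb = EVENTS_PER_DAY * RETENTION_DAYS * BYTES_PER_EVENT / 1e12

print(round(avg_write_qps))  # 5787
print(round(total_tb))       # 365
```

A sustained ~5,800 writes/sec is an average; a full answer should note that peak traffic is typically a small multiple of this, which is part of what motivates the decoupled ingest layer.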
