The interviewer slides a blank whiteboard marker across the table and says: “Design Twitter.” You have forty-five minutes. Most engineers reach for the marker immediately and start drawing boxes. They draw a server, a database, a load balancer — the things they know. Twenty minutes in, the interviewer asks what happens when a celebrity with fifty million followers posts. The boxes stop making sense.
System design is not a drawing exercise. It is a reasoning exercise that happens to produce drawings. The failure mode is not drawing the wrong boxes — it is drawing boxes before understanding what the system must do, at what scale, under which failure conditions, and how it will change over time. This chapter establishes a discipline for approaching any system design problem from first principles, using the frameworks built throughout this series.
Given a product requirement — “build a URL shortener”, “design a notification system”, “scale the checkout service” — produce an architecture that satisfies functional requirements at stated scale, degrades gracefully under failure, and can evolve as requirements change. The deliverable is not a single diagram but a reasoned sequence of decisions, each with explicit tradeoffs.
The constraints are: limited time, incomplete information, and the impossibility of optimising all dimensions simultaneously.
The naive approach is pattern matching. “This sounds like Uber, so I’ll draw the Uber architecture.” Pattern matching produces architectures that look correct but collapse under questioning because the designer cannot explain why each component exists.
Consider a caching layer added because “caching makes things fast.” When asked whether it uses write-through or write-behind, what the eviction policy is, how it behaves when the cache node fails, or whether consistency is required — the pattern-matched answer cannot respond. At scale, unjustified decisions become production incidents.
The deeper failure: pattern matching treats scale as an afterthought. Real systems grow in non-uniform ways. Read load and write load grow at different rates. User distribution shifts geographically. One feature drives ten times more traffic than expected. An architecture that was not designed for evolution requires a rewrite to accommodate growth.
The disciplined approach has five phases, executed in order.
Phase 1: Clarify Requirements
Distinguish functional requirements (what the system does) from non-functional requirements (how well it does it). Functional: users can post messages, and followers see those messages in a feed. Non-functional: feed load latency under 100 ms at the 99th percentile, 99.99% availability, five-year data retention.
Write both down explicitly before drawing anything. The F5 Review Questions framework provides a checklist: Who are the users? What are the read and write patterns? What consistency level is required? What are the latency targets? What are the failure tolerance requirements? What is the data model? What are the scale targets?
Phase 2: Estimate Scale
Back-of-envelope estimation converts product requirements into engineering constraints.
DAU = 50M users
Average posts per user per day = 2
Write QPS = (50M × 2) / 86400 ≈ 1,200 writes/sec
Read/write ratio = 100:1 (typical feed system)
Read QPS = 120,000 reads/sec
Average post size = 280 chars + metadata ≈ 1 KB
Daily write volume = 50M × 2 × 1 KB = 100 GB/day
5-year storage = 100 GB × 365 × 5 ≈ 180 TB
Scale estimates determine which architectural decisions are forced. 120,000 read QPS cannot be served by a single database. 180 TB cannot fit on a single server. Estimation is not about precision — it is about revealing constraints that eliminate naive solutions.
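The arithmetic above can be sketched as a small script so the inputs are easy to vary. This is a minimal illustration, not part of the original framework: the function name `estimate_scale` and its parameter names are assumptions, and 1 KB is treated as 1,000 bytes. Because it does not round the write rate before multiplying, its read QPS comes out slightly below the rounded 120,000 used in the text.

```python
SECONDS_PER_DAY = 86_400


def estimate_scale(dau, posts_per_user_per_day, read_write_ratio,
                   post_size_bytes, retention_years):
    """Return back-of-envelope figures for a feed-style system.

    All names here are illustrative; the inputs mirror the worked
    example in the text.
    """
    posts_per_day = dau * posts_per_user_per_day
    write_qps = posts_per_day / SECONDS_PER_DAY
    read_qps = write_qps * read_write_ratio
    daily_bytes = posts_per_day * post_size_bytes
    total_bytes = daily_bytes * 365 * retention_years
    return {
        "write_qps": write_qps,
        "read_qps": read_qps,
        "daily_gb": daily_bytes / 1e9,   # decimal gigabytes
        "total_tb": total_bytes / 1e12,  # decimal terabytes
    }


estimate = estimate_scale(
    dau=50_000_000,
    posts_per_user_per_day=2,
    read_write_ratio=100,
    post_size_bytes=1_000,  # assumption: 1 KB = 1,000 bytes
    retention_years=5,
)
for name, value in estimate.items():
    print(f"{name}: {value:,.1f}")
```

Changing a single input, such as the read/write ratio, shows immediately which constraints move and which do not.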
Phase 3: Design Components
Map requirements to the seven system archetypes (F6): Read-Heavy Store, Write-Heavy Ingest, Search & Discovery, Real-Time Messaging, Media Delivery, Marketplace, Data Intelligence. Most real systems combine two or three archetypes. Identify the primary archetype first, then the secondary.
Use the five Architecture Diagram types (F7) in sequence: Data Model (what is stored and how), Component Diagram (which services exist), Sequence Diagram (how a request flows), Data Flow Diagram (where data moves and transforms), Deployment Diagram (how services map to infrastructure).
Phase 4: Evaluate Tradeoffs
For each major design decision, name the tradeoff explicitly using AT codes. “We use eventual consistency here” is incomplete. “We resolve AT1 (Consistency/Availability) in favour of availability because reads must not block on cross-region replication” is a decision.
Every decision has a cost. The cost of eventual consistency is FM4 (Data Consistency Failure) — the risk that a user sees stale data. Name the failure mode and state the mitigation.
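One way to keep such decisions explicit is to record them in a fixed shape. The sketch below is a hypothetical record format, not something the framework prescribes: the class `TradeoffDecision`, its field names, and the example mitigation are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TradeoffDecision:
    """A single documented design decision (illustrative structure)."""
    decision: str
    tradeoff_code: str   # AT code naming the tension
    favoured: str        # which side of the tradeoff was chosen
    reason: str
    failure_mode: str    # FM code naming the cost
    mitigation: str

    def summary(self) -> str:
        return (
            f"{self.decision}: resolves {self.tradeoff_code} in favour of "
            f"{self.favoured} because {self.reason}. "
            f"Risk: {self.failure_mode}. Mitigation: {self.mitigation}."
        )


feed_reads = TradeoffDecision(
    decision="Eventually consistent feed reads",
    tradeoff_code="AT1 (Consistency/Availability)",
    favoured="availability",
    reason="reads must not block on cross-region replication",
    failure_mode="FM4 (Data Consistency Failure): a user sees stale data",
    # Assumption: this mitigation is an example, not taken from the text.
    mitigation="route a user's reads of their own posts to the primary",
)
print(feed_reads.summary())
```

A record with an empty failure mode or mitigation is a visible gap, which is the point of writing decisions down this way.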
Phase 5: Plan Evolution
Every system will be different in two years. Design for the next 10× of growth, not the next 1,000×. Identify the components most likely to become bottlenecks first, and sketch what changes at 10× scale. The components that will need replacing are the ones that need clean boundaries today.
Current scale        →  Change at 10× scale
Single DB            →  Read replicas + write sharding
In-process cache     →  Distributed cache cluster
Synchronous calls    →  Async queue-based processing
Single region        →  Multi-region with geo-routing
AT3 — Simplicity/Flexibility: Start with the simplest architecture that satisfies current requirements. Every layer of abstraction added pre-emptively is complexity that must be maintained before it is needed. Add flexibility at the boundaries most likely to change.
AT10 — Tradeoffs as first-class outputs: The deliverable of system design is not an architecture diagram but a set of documented tradeoff decisions. An architecture without documented decisions cannot be maintained because the team cannot know what will break when they change something.
AT5 — Centralisation/Distribution: Every component that exists as a single instance is a SPOF (FM1). Identify which centralised components are acceptable risks and which require distribution before the design is complete.
FM1 — Single Point of Failure: Any component with no redundancy. The mitigation is active/passive or active/active redundancy. Identifying SPOFs requires listing every component that, if it fails, takes the system down.
FM11 — Observability Blindness: A system that cannot be observed cannot be debugged in production. Metrics, logs, and traces must be designed in from the start, not bolted on after the first outage. Signal: the team cannot answer “how many requests failed in the last five minutes” without a code deployment.
FM2 — Cascading Failures: A failure in one component propagates to others because dependencies are not isolated. Circuit breakers and bulkheads are the mitigations. Any synchronous call between services without a timeout or circuit breaker is a cascade risk.
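The circuit breaker named under FM2 can be sketched in a few lines. This is a simplified, single-threaded illustration under stated assumptions: the class names, the default threshold, and the cool-down period are hypothetical, and the wrapped function is expected to enforce its own timeout. A production implementation would also need thread safety and metrics.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_s=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold  # illustrative default
        self.reset_after_s = reset_after_s          # illustrative default
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                # Fail fast instead of waiting on an unhealthy dependency.
                raise CircuitOpenError("dependency marked unhealthy")
            # Cool-down elapsed: half-open, let one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            half_open = self.opened_at is not None
            if half_open or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open)
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def flaky():
    raise ConnectionError("downstream unavailable")


breaker = CircuitBreaker(failure_threshold=2, reset_after_s=60.0)
for _ in range(3):
    try:
        breaker.call(flaky)
    except CircuitOpenError:
        print("rejected without calling the dependency")
    except ConnectionError:
        print("call attempted and failed")
```

After the second failure the breaker opens, so the third call is rejected locally. That local rejection is what stops a slow dependency from tying up the caller's threads and spreading the failure upstream.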
At 10× the initial load, the bottleneck is usually the database, and the first evolution is read replicas. At 100× load, adding replicas stops helping: every replica still has to hold the full dataset and apply every write, so the data must be sharded. Each evolution step changes the data access patterns and may invalidate caching strategies.
The design process does not change at scale — the same five phases apply. The difference is that scale estimates change which solutions are on the table. What is a single-server solution at 1,000 QPS becomes a distributed system problem at 100,000 QPS.
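A rough capacity calculation makes that jump concrete. The sketch below is illustrative only: the function `nodes_needed`, the per-node throughput of 5,000 QPS, and the 50% utilisation headroom are assumptions chosen for the example, not measured figures.

```python
import math


def nodes_needed(peak_qps, per_node_qps, headroom=0.5):
    """Nodes required so each runs at no more than `headroom` utilisation."""
    if per_node_qps <= 0 or not 0 < headroom <= 1:
        raise ValueError("per_node_qps must be > 0 and headroom in (0, 1]")
    return max(1, math.ceil(peak_qps / (per_node_qps * headroom)))


PER_NODE_QPS = 5_000  # hypothetical capacity of one server

for qps in (1_000, 100_000):
    print(f"{qps:,} QPS -> {nodes_needed(qps, PER_NODE_QPS)} node(s)")
```

Under these assumptions the first case fits on one node, while the second needs dozens, which brings in load balancing, replication, and failure handling that the single-server design never had to consider.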
Google-style system design interviews popularised a phased approach like this one. Google's published papers on Bigtable, Spanner, and Chubby document the tradeoffs behind each system, and each reads as a worked example of this process.
Amazon’s Working Backwards starts from the press release (requirements) before any technical design. This is Phase 1 extended into product thinking.
Stripe’s Architecture Reviews require explicit documentation of the tradeoff space before any major technical decision. The output is an RFC (Request for Comments) that names the alternatives considered and why each was rejected — structured tradeoff documentation.
Netflix’s Chaos Engineering extends Phase 5 (evolution) into production: if you cannot reason about failure modes in advance, inject failures until you understand the system’s actual behaviour.
Concept: System Design Process
Thread: T12 (Tradeoffs) ← Book 3, Ch 1 → Ch 2 (Requirements to Architecture)
Core Idea: System design is a five-phase reasoning process — clarify, estimate, design, evaluate tradeoffs, plan evolution — not a pattern-matching exercise. Each phase produces explicit, documentable outputs.
Tradeoff: AT3 — Simplicity vs Flexibility: start with the simplest architecture that satisfies current requirements; add flexibility only at the boundaries most likely to change.
Failure Mode: FM11 — Observability Blindness: systems designed without observability cannot be debugged in production.
Signal: When you are asked to design any system from scratch and must produce a defensible architecture under time pressure.
Maps to: Book 0, Framework 6 (System Archetypes)
A product manager gives you this requirement: “Users should be able to share photos and see photos from people they follow, with a home feed updating in near-real-time.” Write out the functional requirements and non-functional requirements as separate lists. Then estimate: if there are 10M DAU and each user views 20 photos per day, what is the read QPS? If each user posts one photo per week, what is the daily storage requirement assuming 2 MB average photo size?
For a simple URL shortener at 1M QPS reads and 1,000 QPS writes, identify: (a) the primary system archetype from F6, (b) which component will be the first bottleneck, (c) what AT code describes the decision to use eventual consistency for click-count statistics.
A complete answer will:
(1) correctly walk through all five design phases (clarify requirements, estimate scale, design components, evaluate tradeoffs, plan evolution) with the leaderboard problem as the thread, producing a concrete write rate estimate (10M players / 30s ≈ 333,000 score updates/sec);
(2) name at least three AT codes and three FM codes that arise in the design, at minimum AT5 (centralised sorted set vs distributed shards), FM6 (hotspot on the global top-100 rank), and FM3 (resource exhaustion on the score update queue at 333K/sec);
(3) identify the single most important architectural decision (pre-computed top-100 list refreshed periodically vs real-time sorted set) with its latency/accuracy cost stated;
(4) address the friend-graph leaderboard separately, explaining how it differs from the global leaderboard in data access patterns, and propose a concrete mechanism (e.g., social graph traversal at query time vs pre-computed per-user friend leaderboard) with its AT tradeoff code.