State the seven questions and their correct order from memory. For each question, name the failure mode it is designed to catch before production.
Apply all seven questions to a simple key-value store API with two endpoints: GET /values/{key} and PUT /values/{key}. What is your answer to each question?
A team skips Q7 because “we can add monitoring later.” Using the Q4 → Q5 → Q6 → Q7 dependency chain, explain concretely why this decision makes the three preceding questions (Q4, Q5, and Q6) harder to answer after launch.
Q2 and Q3 overlap: some failures trace directly to unprotected state. Give an example where the answer to Q3 reveals a failure mode that Q2 missed, and explain why the ordering matters.
You inherit a three-year-old system. You have four hours to understand it well enough to handle on-call. Would you apply the seven questions in the standard sequence or reorder them? Give the order you would use and explain your reasoning.
Concept: F5 — The 7 Architecture Review Questions
Thread: T12 (Tradeoffs) ← Systematic debugging (Book 1, Ch 1) → Architecture review process (Book 6, Ch 5)
Core Idea: Seven questions — Scale, Failure, State, Latency, Evolution, Security, Observability — applied in dependency order form a complete review of any system’s failure surface. They were derived from what architecture reviews consistently missed and production incidents consistently found.
Tradeoff: Correctness vs Performance (F4 #9) — answering the questions thoroughly takes time; skipping questions saves time now and costs it in incidents later
Failure Mode: FM11 (Observability Blindness) — Q7 is the most commonly skipped; it is also the failure mode that makes every other failure mode worse
Signal: Any system design proposal; any pre-production review; any architecture you are inheriting; any incident post-mortem
Maps to: Reference Book Ch 8 (F4 Tradeoffs — each question surfaces tradeoffs); Book 4 Ch 1 (system design methodology); Book 6 Ch 5 (architecture review process)
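The dependency order named in the Core Idea can be sketched as a small ordered checklist. This is only an illustration, not a tool from the book: the names ReviewQuestion and unanswered, and the sample notes, are assumptions made up for this example. The only thing taken from the text is the Q1 to Q7 ordering.

```python
from enum import IntEnum
from typing import List, Mapping


class ReviewQuestion(IntEnum):
    """The seven questions, numbered in the dependency order from the Core Idea."""
    SCALE = 1
    FAILURE = 2
    STATE = 3
    LATENCY = 4
    EVOLUTION = 5
    SECURITY = 6
    OBSERVABILITY = 7


def unanswered(answers: Mapping[ReviewQuestion, str]) -> List[ReviewQuestion]:
    """Return the questions with a missing or blank answer, in Q1..Q7 order."""
    return [q for q in ReviewQuestion if not answers.get(q, "").strip()]


# Hypothetical review notes for an imaginary proposal; Q7 was skipped.
notes = {
    ReviewQuestion.SCALE: "placeholder answer",
    ReviewQuestion.FAILURE: "placeholder answer",
    ReviewQuestion.STATE: "placeholder answer",
    ReviewQuestion.LATENCY: "placeholder answer",
    ReviewQuestion.EVOLUTION: "placeholder answer",
    ReviewQuestion.SECURITY: "placeholder answer",
}

gaps = unanswered(notes)
print([q.name for q in gaps])  # ['OBSERVABILITY']
```

Using an IntEnum keeps the question numbers and the iteration order in one place, so a review that skips Q7 shows up as an explicit gap rather than a silent omission.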