The Computing Series

Applying the Map: A Worked Example

To see the nine frameworks operate as a unified system rather than nine separate checklists, consider one concrete problem: designing a notification delivery service. The service accepts notification requests (email, SMS, push) from other services and delivers them reliably, with rate limiting, retry logic, and delivery tracking.

F1 — Mental Models: What kind of thing is this?

Two models dominate. First, Flow — notifications move through the system from request to delivery, and the service must handle backpressure when downstream providers (email gateways, SMS APIs) are slow. Second, Feedback — delivery receipts and bounce notifications feed back into the system, affecting retry logic and sender reputation. The flow model tells us to think about throughput and queue depth. The feedback model tells us to think about closed loops and adaptation.

F2 — Engineering Principles: What cannot change?

Idempotency is non-negotiable. A retry must not produce a duplicate notification — users who receive the same SMS three times will disable notifications entirely. Fault tolerance is load-bearing: the service must continue accepting requests even when one delivery provider is down. These are not aspirational properties. They are constraints that the architecture must guarantee structurally, not hope for operationally.
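What "guaranteed structurally" means in practice is a deduplication key checked before any send. A minimal sketch, assuming callers supply a stable `idempotency_key` per logical notification; the in-memory set stands in for a durable store such as a database unique index:

```python
class NotificationAccepter:
    """Accepts notification requests exactly once per idempotency key."""

    def __init__(self):
        self._seen = set()  # placeholder for a durable unique-key store

    def accept(self, idempotency_key: str, payload: dict) -> bool:
        """Return True if enqueued, False if this is a duplicate retry."""
        if idempotency_key in self._seen:
            return False  # retry of an already-accepted request: no duplicate send
        self._seen.add(idempotency_key)
        # ... enqueue payload for delivery ...
        return True


accepter = NotificationAccepter()
assert accepter.accept("req-42", {"channel": "sms"}) is True
assert accepter.accept("req-42", {"channel": "sms"}) is False  # retried safely
```

The point is architectural: the retry path cannot produce a second send, regardless of how many times the caller retries.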

F3 — Failure Modes: What can go wrong?

FM2 (Cascading Failure) is the primary exposure. If the email provider slows down, the queue grows, memory pressure increases, and the service stops processing SMS and push notifications too — a failure in one channel cascades to all channels. FM7 (Thundering Herd) appears after an outage recovery: if the email provider comes back after a 30-minute outage and the service immediately flushes 500,000 queued emails, the provider rate-limits the service and the outage effectively continues. FM9 (Silent Data Corruption) lurks in delivery tracking — a notification marked “delivered” because the provider accepted it, even though it never reached the user.
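The FM2 defence is channel isolation with bounded queues: a slow channel sheds its own load instead of consuming memory shared with the others. A minimal sketch, with illustrative channel names and depth limits:

```python
from collections import deque


class BoundedChannelQueues:
    """Per-channel queues with a hard depth limit, so one slow channel
    cannot grow unboundedly and starve the others (FM2)."""

    def __init__(self, channels, max_depth: int):
        self.queues = {c: deque() for c in channels}
        self.max_depth = max_depth

    def enqueue(self, channel: str, notification: dict) -> bool:
        q = self.queues[channel]
        if len(q) >= self.max_depth:
            return False  # shed load on this channel only; others unaffected
        q.append(notification)
        return True


qs = BoundedChannelQueues(["email", "sms", "push"], max_depth=2)
assert qs.enqueue("email", {"id": 1}) is True
assert qs.enqueue("email", {"id": 2}) is True
assert qs.enqueue("email", {"id": 3}) is False  # email saturated, rejected
assert qs.enqueue("sms", {"id": 4}) is True     # SMS still accepts work
```

In a real deployment the rejection would surface as backpressure to callers (and the requests would still be durable upstream), but the isolation property is the same: saturation stays inside the failing channel.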

F4 — Tradeoffs: What are we choosing?

AT1 (Consistency vs Availability) — we choose availability. The service accepts notification requests even if it cannot confirm delivery status immediately. Delivery tracking is eventually consistent. AT2 (Latency vs Throughput) — we choose throughput. Individual notification latency is not critical (a 5-second delay is acceptable); what matters is sustained delivery rate under load. These tradeoffs are explicit. If someone later asks “why don’t we guarantee delivery within 1 second?” the answer is documented: we traded latency for throughput, and the reversal condition is stated.

F5 — The 7 Review Questions (abbreviated):

  1. What is the SLO? 99.9% of notifications delivered within 5 minutes; 99.99% within 1 hour.
  2. What is the blast radius of a single component failure? One channel (email/SMS/push) — never all three.
  3. Where is the single point of failure? The request ingestion endpoint. Mitigated by running multiple instances behind a load balancer.
  4. What is the recovery procedure? Queue drains automatically on provider recovery; rate-limited flush prevents thundering herd.
  5. What data can be lost? Delivery status may lag by up to 60 seconds. No notification requests are lost (persistent queue).
  6. What is the scaling bottleneck? Queue consumer throughput per channel.
  7. What monitoring tells you the system is degrading before it fails? Queue depth per channel, delivery latency P99, provider error rate.

F6 — Archetypes: Which pattern is this?

A2 (Communication) — the service exists to move messages between producers and external delivery endpoints. The A2 archetype carries inherited expectations: message ordering may not be guaranteed, at-least-once delivery is the natural default, and the system must handle poison messages (malformed notifications that fail repeatedly). Recognising the archetype means inheriting its known failure surface rather than rediscovering it.
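The poison-message expectation inherited from A2 can be sketched as a retry budget with a dead-letter route. Assuming at-least-once delivery and an illustrative attempt limit:

```python
MAX_ATTEMPTS = 3  # illustrative retry budget per message


def process(message: dict, deliver, dead_letter) -> str:
    """Try delivery up to MAX_ATTEMPTS; quarantine persistent failures
    in a dead-letter queue instead of retrying forever."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            deliver(message)
            return "delivered"
        except Exception:
            if attempt == MAX_ATTEMPTS:
                dead_letter(message)  # poison message: stop blocking the queue
                return "dead-lettered"
    return "unreachable"


dlq = []

def always_fails(msg):
    raise ValueError("malformed notification")

assert process({"id": 7}, always_fails, dlq.append) == "dead-lettered"
assert dlq == [{"id": 7}]
```

Without the dead-letter route, a single malformed notification would be redelivered indefinitely and block everything behind it — the known failure surface the archetype warns about.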

F7 — Architecture Diagrams: What would you draw?

D1 (Request Flow) — trace a notification from API request through the queue, to the channel-specific consumer, to the external provider, and back via delivery receipt. Annotate latency at each hop. This diagram surfaces the FM2 cascading risk: the sequential chain from queue to provider is where slowdowns propagate.

D3 (Event-Driven / Async) — show the fan-out from request ingestion to three channel-specific queues, each with its own consumer group, retry policy, and dead-letter queue. This diagram surfaces the channel isolation question: are the queues truly independent, or do they share infrastructure that could create cross-channel coupling?

F8 — Infrastructure Components: What are the building blocks?

IC13 (Message Queue) — the core of the system. Kafka or SQS for durable, partitioned queuing with per-channel topics. IC8 (Background Worker) — channel-specific consumers that pull from the queue and call external providers. IC6 (Rate Limiter) — per-provider rate limiting to prevent thundering herd on recovery and to respect provider API limits. These are not implementation choices yet — they are the vocabulary for describing what the system needs before selecting specific technologies.
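IC6 is the component that directly prevents the FM7 recovery flood. A minimal token-bucket sketch, with illustrative rates; real per-provider limits would come from configuration:

```python
import time


class TokenBucket:
    """Token bucket: sustained rate with a bounded burst, so a post-outage
    queue flush is throttled instead of hammering the provider (FM7)."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # burst ceiling
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=100, capacity=10)  # 100/s sustained, burst of 10
allowed = sum(bucket.allow() for _ in range(50))
assert 10 <= allowed <= 12  # the burst drains, then the flood is throttled
```

Denied sends stay in the queue; the consumer simply waits for tokens, turning a 500,000-email flush into a paced drain at the provider's sustainable rate.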

F9 — Empirical Laws: What constraints hold regardless of intent?

L4 (Little’s Law) governs queue sizing. If the average delivery rate is 1,000 notifications per second and the average time in the system is 3 seconds, the steady-state queue depth is 3,000. If provider latency doubles to 6 seconds, queue depth doubles to 6,000 — this is not a design choice, it is arithmetic. L1 (Amdahl’s Law) constrains scaling: if 20% of the delivery pipeline is serialised (e.g., deduplication lookup), then no amount of parallelism in the remaining 80% can yield more than a 5x throughput improvement.
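The arithmetic is mechanical enough to write down. A sketch using the numbers from the text (L = λW for queue depth; Amdahl's bound 1 / (s + (1 − s)/n) for speedup):

```python
def littles_law_depth(arrival_rate: float, time_in_system: float) -> float:
    """Steady-state queue depth: L = lambda * W."""
    return arrival_rate * time_in_system


def amdahl_speedup(serial_fraction: float, n: float = float("inf")) -> float:
    """Maximum speedup with serial fraction s and n parallel workers."""
    return 1 / (serial_fraction + (1 - serial_fraction) / n)


assert littles_law_depth(1000, 3) == 3000  # baseline: 1,000/s at 3s in system
assert littles_law_depth(1000, 6) == 6000  # provider latency doubles the depth
assert abs(amdahl_speedup(0.20) - 5.0) < 1e-9  # 20% serialised caps speedup at 5x
```

This is why the monitoring in F5 watches queue depth and latency together: given the arrival rate, either one predicts the other.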

What the unified traversal reveals:

No single framework produced the complete picture. F3 identified the cascading failure risk, but F4 named the tradeoff that created it (availability over consistency). F6 identified the archetype, but F5 asked the specific questions that exposed the blast radius. F9 provided the arithmetic that turns “the queue might grow” into “the queue will reach 6,000 at 2x latency.” The value of the map is not any individual framework. It is the path through all nine.

Concept: The Complete Mental Map

Thread: T12 (Tradeoffs) ← naming costs implicitly → making costs explicit across all frameworks

Core Idea: The nine frameworks form a single reasoning system; using them as isolated tools produces local analysis, not system understanding.

Tradeoff: AT6 — structured completeness vs rapid triage speed

Failure Mode: FM11 — organisational observability blindness; rich opinions without structured verification

Signal: When the same system produces conflicting analyses from different engineers — traverse all nine frameworks in order; the disagreement lives at the framework boundary being skipped

Maps to: Book 0, Frameworks 1–9
