The question it answers: For a user-facing operation, what is the path of a request from the client to the data and back? What is the latency at each hop? What happens if any component fails?
What it contains:
- Every component that a request touches, in sequence
- The latency annotation on each hop (typical and P99)
- The happy path (solid arrows)
- At least one failure path (dashed arrows) — what happens when the downstream component is slow or unavailable?
- The data transformation at each step (what is being passed between components)
When to draw it: Any time you are designing a user-facing feature and need to validate the latency budget. Any time you are debugging a latency problem and need to identify which hop is the bottleneck.
What it reveals: FM5 (Latency Amplification) — chains of sequential hops are visible on this diagram before they become production problems. The critical path is the longest sequential chain of hops.
The rule: Always show the failure path. A request flow diagram that only shows the happy path is incomplete. The failure path reveals whether the system fails gracefully (returns a degraded response) or fails catastrophically (returns an error or times out).
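The latency-budget check the diagram supports can be sketched in a few lines. This is a minimal illustration, not a real system: the hop names, numbers, and budget are invented for the example.

```python
# Minimal sketch: validating a latency budget over sequential hops.
# Hop names and latencies below are illustrative, not from any real system.
HOPS = [
    # (name, typical_ms, p99_ms)
    ("client -> api-gateway", 5, 20),
    ("api-gateway -> auth", 3, 15),
    ("auth -> user-db", 2, 30),
    ("api-gateway -> backend", 4, 25),
    ("backend -> cache", 1, 10),
]

BUDGET_P99_MS = 200

def critical_path_p99(hops):
    """Sum per-hop P99 latency over the sequential chain.

    Summing per-hop P99s overestimates the true end-to-end P99
    (tail events rarely align), so this is a conservative upper bound:
    if even this sum fits the budget, the design has headroom.
    """
    return sum(p99 for _, _, p99 in hops)

total = critical_path_p99(HOPS)
print(f"worst-case P99 across {len(HOPS)} hops: {total} ms "
      f"({'within' if total <= BUDGET_P99_MS else 'over'} the "
      f"{BUDGET_P99_MS} ms budget)")
```

Each additional sequential hop adds its P99 to this bound, which is why FM5 (Latency Amplification) shows up on the diagram before it shows up in production.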
The question it answers: Where is all the data? Who owns each store? What is the consistency model for each? What is the replication topology?
What it contains:
- Every persistent store (databases, caches, object stores, message queues with retention)
- The owner of each store (which service writes to it)
- The consistency model for each store (strong, eventual, none)
- The replication topology for each store (single node, leader-follower, multi-region)
- The read/write path between services and stores
When to draw it: Any time you are designing a feature that introduces a new store or modifies data access patterns. Any time you are trying to understand a consistency or data loss incident.
What it reveals: FM4 (Data Consistency Failure) — stores that are written by multiple services without a consistency protocol are visible on this diagram. FM12 (Split-Brain) risk — stores without clear leader election are visible. FM1 (SPOF) — single-node stores on the critical path are visible.
The rule: Every store must have one named owner. If two services write to the same store, that store is a coordination point that requires an explicit consistency model. Name it.
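The ownership rule is mechanically checkable once the diagram's write edges are written down. A minimal sketch, with invented service and store names:

```python
# Minimal sketch: enforcing "every store has one named owner".
# Store and service names are illustrative.
WRITERS = {
    "orders-db":    ["order-service"],
    "sessions-kv":  ["auth-service"],
    "inventory-db": ["order-service", "warehouse-service"],  # two writers
}

def coordination_points(writers):
    """Stores written by more than one service are coordination points
    that require an explicit, named consistency model."""
    return {store: owners for store, owners in writers.items()
            if len(owners) > 1}

for store, owners in coordination_points(WRITERS).items():
    print(f"{store}: written by {owners} -- name the consistency model")
```

Running this over a real diagram's edge list turns the rule into a review-time lint rather than a post-incident discovery.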
The question it answers: What are the asynchronous dependencies? What produces events, what consumes them, and what is the ordering and delivery guarantee?
What it contains:
- Every message queue, event stream, or pub/sub channel
- Producers (who writes events, at what rate)
- Consumers (who reads events, with what consumer group)
- The delivery guarantee for each channel (at-most-once, at-least-once, effectively-once)
- The ordering guarantee (total order, per-partition order, no order)
- The retry and dead-letter queue configuration
When to draw it: Any time the system has asynchronous components — message queues, event sourcing, notification pipelines, data fan-out. Any time you are debugging a data consistency issue that only appears after a delay.
What it reveals: Async coupling that is not visible on the request flow diagram. A system that looks simple from the synchronous request path may have complex async dependencies that produce data consistency issues, ordering failures, or unbounded queue depth.
The rule: Always annotate the delivery guarantee. “Events are consumed” is incomplete. “Events are consumed at-least-once, with consumer offsets committed after processing, and a dead-letter queue for unprocessable messages” is complete.
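The complete annotation above corresponds to a concrete consumer shape. A minimal sketch, with plain lists standing in for a real broker and DLQ, and an invented handler:

```python
# Minimal sketch of an at-least-once consumer: process first, commit the
# offset only after success, and route unprocessable messages to a DLQ.
def consume(messages, process, dead_letter_queue, max_attempts=3):
    committed_offset = -1
    for offset, msg in enumerate(messages):
        for attempt in range(1, max_attempts + 1):
            try:
                process(msg)
                break
            except Exception:
                if attempt == max_attempts:
                    dead_letter_queue.append(msg)  # park it, don't block the stream
        # Commit AFTER processing: a crash before this line means the
        # message is redelivered on restart -- hence at-least-once,
        # and handlers must therefore be idempotent.
        committed_offset = offset
    return committed_offset

dlq = []
handled = []

def handler(msg):
    if msg == "bad":
        raise ValueError("unprocessable")
    handled.append(msg)

last = consume(["a", "bad", "b"], handler, dlq)
print(handled, dlq, last)
```

Committing before processing instead would flip the guarantee to at-most-once: a crash mid-processing would drop the message silently.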
The question it answers: How does data flow from raw events to queryable results? What is the latency from event to availability? What is the consistency between pipeline stages?
What it contains:
- The ingestion layer (how raw events enter the system)
- The processing stages (transformation, aggregation, enrichment)
- The storage layers (raw event store, processed store, serving store)
- The latency annotation between each stage (seconds, minutes, hours)
- The schema and format at each stage
- The backfill and reprocessing capability
When to draw it: Any time the system processes events to produce derived data — analytics, ML features, search indexes, materialised views. Any time you need to understand the freshness of derived data.
What it reveals: FM8 (Schema / Contract Violation) between pipeline stages — a mismatch that is invisible in the high-level system design becomes visible when the output of stage N feeds stage N+1 under different field names. FM9 (Silent Data Corruption) from late-arriving events or incorrect aggregations.
The rule: Always annotate the latency at each stage and the schema at each stage boundary. A pipeline diagram without these annotations cannot be used to reason about freshness or correctness.
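The schema-at-each-boundary annotation enables a direct check. A minimal sketch, with invented field names, comparing the fields one stage emits against the fields the next stage expects:

```python
# Minimal sketch: checking schemas at a stage boundary so a field rename
# in stage N's output doesn't silently break stage N+1 (the FM8 case).
# Field names are illustrative.
STAGE_N_OUTPUT = {"event_id", "user_id", "ts", "amount_cents"}
STAGE_N1_INPUT = {"event_id", "user_id", "timestamp", "amount_cents"}

def boundary_mismatch(produced, consumed):
    """Fields the next stage expects but the previous stage no longer
    emits -- invisible in the design, fatal at runtime."""
    return consumed - produced

missing = boundary_mismatch(STAGE_N_OUTPUT, STAGE_N1_INPUT)
if missing:
    print(f"schema violation at stage boundary: missing {sorted(missing)}")
```

In practice this check belongs in CI against the declared schema at each boundary, so a rename fails the build rather than the pipeline.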
The question it answers: Where does the system require multiple nodes to agree on something? What happens when they cannot agree? What is the quorum configuration?
What it contains:
- Every component that participates in leader election or consensus
- The quorum configuration (N nodes, R reads, W writes such that R + W > N)
- The fencing or epoch mechanism that prevents split-brain
- The behaviour under partition (which side remains available, which side halts)
- The recovery sequence when a partitioned node rejoins
When to draw it: Any system with replicated state, leader election, distributed locks, or distributed transactions. Any time you are designing for high availability and need to understand the consistency guarantees.
What it reveals: FM12 (Split-Brain) — consensus mechanisms that can produce two leaders simultaneously are visible on this diagram. FM1 (SPOF) — the quorum configuration determines how many node failures can be tolerated; a quorum that cannot survive the expected number of node failures is itself a SPOF.
The rule: Always show the partition case. What does the system do when the network between two groups of nodes is severed? If the diagram only shows the healthy case, the consistency and availability properties of the system under failure are undefined.
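The quorum annotations on the diagram reduce to simple arithmetic. A minimal sketch of the checks, with illustrative configurations:

```python
# Minimal sketch: quorum arithmetic from the diagram's annotations.
# R + W > N guarantees every read quorum overlaps every write quorum;
# a majority write quorum (W > N // 2) means at most one side of a
# partition can accept writes, ruling out split-brain writes.
def check_quorum(n, r, w):
    overlap = r + w > n          # reads see the latest committed write
    single_writer = w > n // 2   # at most one partition side can write
    tolerated_failures = n - w   # node losses writes can survive
    return overlap, single_writer, tolerated_failures

for n, r, w in [(3, 2, 2), (5, 1, 5), (4, 2, 2)]:
    overlap, single_writer, f = check_quorum(n, r, w)
    print(f"N={n} R={r} W={w}: overlap={overlap}, "
          f"one-writer-side={single_writer}, tolerates {f} failures")
```

Note the trade-off the third line surfaces: N=4, R=2, W=2 tolerates two node failures for writes but loses both the overlap and single-writer properties, which is exactly the partition case the rule demands the diagram show.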