The 5 Architecture Diagram Types Every Engineer Should Know

Show an engineer a box-and-arrow diagram labelled "System Architecture" and they will nod and say it looks reasonable. Show them five specific diagrams — each asking a different question about the same system — and they will find at least two problems the first diagram hid.

The five architecture diagrams are not different ways to draw the same picture. They ask five different questions, and each answer is only visible from a specific angle.

Why One Diagram Is Never Enough

A single architecture diagram tries to show the request path, the data storage model, the async dependencies, the pipeline stages, and the consensus topology simultaneously. If it includes everything, it becomes too complex to read. If it stays readable, it hides the details that matter.

The solution is not a better single diagram. It is five focused diagrams, each answering exactly one question.

#  | Diagram                       | The question it answers
D1 | Request Flow                  | What is the path of a request? What is the latency at each hop?
D2 | Data Storage                  | Where is all the data? What is the consistency model for each store?
D3 | Event-Driven / Async Coupling | What are the asynchronous dependencies and their delivery guarantees?
D4 | Data Pipeline                 | How does data flow from raw events to queryable results? What is the freshness?
D5 | Distributed Coordination      | Where does the system require multiple nodes to agree? What happens during a partition?

Each diagram reveals one failure surface that the others cannot show.

D1 — Request Flow

The question: For a user-facing operation, what is the path of a request from the client to the data and back? What is the latency at each hop? What happens if any component fails?

A request flow diagram shows every component a request touches, in sequence, with latency annotated on each hop. It shows the happy path and at least one failure path — what happens when a downstream component is slow or unavailable.

This diagram surfaces latency amplification: chains of sequential hops that look acceptable individually but add up to a P99 that breaks the latency budget. The critical path is the longest sequential chain.

The rule: Always show the failure path. A request flow diagram that only shows the happy path is incomplete. The failure path reveals whether the system fails gracefully (returns a degraded response) or fails catastrophically.

Example — API request for a user profile:

Client ──20ms──▶ CDN ──5ms──▶ Load Balancer ──2ms──▶ App Server
                                                           │
                                              3ms ──▶ Cache (hit)
                                             15ms ──▶ PostgreSQL (miss)
                                                           │
                                         [failure path]   │
                                         DB slow ──▶ return stale cached profile
                                                     (not an error)

Happy path: ~45ms   P99: ~120ms (cache miss)

The sequential chain from client to database is the critical path. The failure path shows the system degrades to stale data rather than returning an error — a deliberate design choice that must be visible on the diagram.
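The hop arithmetic behind this example can be checked mechanically. The sketch below is an illustration in Python, not part of the original diagram. It sums the hop latencies shown above, assumes the cache lookup and the PostgreSQL lookup happen sequentially on a miss, and compares the result with a hypothetical 100 ms budget. The function names are made up for the example.

```python
# Hop latencies in milliseconds, copied from the diagram above.
EDGE_HOPS = [
    ("Client -> CDN", 20),
    ("CDN -> Load Balancer", 5),
    ("Load Balancer -> App Server", 2),
]
CACHE_LOOKUP_MS = 3
DB_LOOKUP_MS = 15

# Hypothetical latency budget, chosen only for illustration.
BUDGET_MS = 100


def path_latency_ms(cache_hit: bool) -> int:
    """Sum the sequential hops on the critical path of one request."""
    total = sum(ms for _, ms in EDGE_HOPS) + CACHE_LOOKUP_MS
    if not cache_hit:
        # Assumption: on a miss the app server queries PostgreSQL
        # only after the cache lookup has returned.
        total += DB_LOOKUP_MS
    return total


def within_budget(latency_ms: int, budget_ms: int = BUDGET_MS) -> bool:
    """Return True when the path fits inside the latency budget."""
    return latency_ms <= budget_ms


if __name__ == "__main__":
    for hit in (True, False):
        label = "cache hit" if hit else "cache miss"
        latency = path_latency_ms(hit)
        print(f"{label}: {latency} ms, within budget: {within_budget(latency)}")
```

The sum covers only the annotated per-hop figures. A P99 such as the ~120ms above has to be measured, since it cannot be derived from these numbers alone.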

D2 — Data Storage

The question: Where is all the data? Who owns each store? What is the consistency model for each? What is the replication topology?

A data storage diagram shows every persistent store — databases, caches, object stores, queues with retention — with the service that owns each one and the consistency model it offers. It shows the read/write paths between services and stores.

This diagram surfaces data consistency failures (stores written by multiple services without a consistency protocol), split-brain risk (stores without clear leader election), and single points of failure (single-node stores on the critical path).

The rule: Every store must have one named owner. If two services write to the same store, that store is a coordination point that requires an explicit consistency model. Name it on the diagram.

Example — E-commerce data ownership:

Users Service ──────▶ PostgreSQL (strong, leader-follower)
Products Service ───▶ MongoDB (eventual, replica set)
Sessions Service ───▶ Redis (volatile, single node) ← single point of failure (SPOF)

Redis is single-node and volatile, which makes it a single point of failure for session data. MongoDB is annotated as eventually consistent, which applies when reads are served from secondaries in the replica set: product reads may lag writes. These risks are invisible on a request flow diagram.
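The ownership rule can be turned into a small check once the diagram is written down as data. The sketch below is one possible encoding, not a standard tool. The `Store` type and `audit` function are hypothetical, and the node counts for PostgreSQL and MongoDB are assumptions, because the diagram only gives the topology type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Store:
    name: str
    writers: tuple      # services that write to this store
    nodes: int          # number of nodes holding the data
    consistency: str    # consistency model as annotated on the diagram


def audit(stores):
    """Flag stores without exactly one owner and stores on a single node."""
    findings = []
    for store in stores:
        if len(store.writers) != 1:
            findings.append(
                f"{store.name}: {len(store.writers)} writers, needs one named owner"
            )
        if store.nodes < 2:
            findings.append(f"{store.name}: single node (SPOF)")
    return findings


STORES = [
    # Node counts of 2 and 3 are assumed for illustration.
    Store("PostgreSQL", ("Users Service",), 2, "strong, leader-follower"),
    Store("MongoDB", ("Products Service",), 3, "eventual, replica set"),
    Store("Redis", ("Sessions Service",), 1, "volatile, single node"),
]

if __name__ == "__main__":
    for finding in audit(STORES):
        print(finding)
```

Adding a second writer to any store makes the first check fire, which is the hidden coordination point the rule asks you to name.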

D3 — Event-Driven / Async Coupling

The question: What are the asynchronous dependencies? What produces events, what consumes them, and what is the ordering and delivery guarantee?

An event-driven diagram shows every message queue, event stream, or pub/sub channel, with producers, consumers, delivery guarantees (at-most-once, at-least-once, effectively-once), ordering guarantees, and dead-letter queue configuration.

Async coupling that is invisible on the request flow diagram is visible here. A system that looks simple from the synchronous request path may have complex async dependencies that produce data consistency issues, ordering failures, or unbounded queue depth.

The rule: Always annotate the delivery guarantee. "Events are consumed" is incomplete. "Events are consumed at-least-once, with consumer offsets committed after processing, and a dead-letter queue for unprocessable messages" is complete.
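The reason the commit point belongs on the diagram is that it decides which guarantee the consumer actually gets. The simulation below is a minimal in-memory sketch with no real broker. The function `consume` and its parameters are hypothetical. It crashes the consumer once, between the two steps of handling a message, and then restarts it from the last committed offset.

```python
def consume(messages, commit_first, crash_at):
    """Process messages with one simulated crash at offset `crash_at`.

    commit_first=True  commits the offset before processing (at-most-once).
    commit_first=False commits the offset after processing (at-least-once).
    Returns the list of side effects that were actually applied.
    """
    committed = 0      # offset the consumer resumes from after a restart
    effects = []       # messages whose processing took effect
    crashed = False
    while committed < len(messages):
        offset = committed
        steps = ("commit", "process") if commit_first else ("process", "commit")
        for index, step in enumerate(steps):
            if step == "commit":
                committed = offset + 1
            else:
                effects.append(messages[offset])
            # Crash once, after the first step and before the second.
            if index == 0 and offset == crash_at and not crashed:
                crashed = True
                break  # restart: the loop resumes from `committed`
    return effects


if __name__ == "__main__":
    orders = ["a", "b", "c"]
    print("commit before processing:", consume(orders, True, 1))
    print("commit after processing: ", consume(orders, False, 1))
```

Committing first loses the message that was in flight during the crash. Committing last processes it twice. Neither outcome is visible unless the diagram states where the commit happens.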

Example — Order processing fan-out:

Order Service ──▶ Kafka (Order Topic)
  (500 msg/s)         │
                      ├──▶ Inventory Service  (at-least-once)
                      ├──▶ Payment Service    (at-least-once)
                      └──▶ Notification Svc   (at-most-once) ──▶ DLQ

Inventory and Payment use at-least-once delivery because they must not lose orders, which in turn means their handlers must tolerate duplicate deliveries (be idempotent). Notification uses at-most-once because duplicate emails are worse than a missed one. The DLQ catches messages that still fail after the maximum number of retries. None of this is visible on the request flow or data storage diagrams.
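An at-least-once consumer in the style of the Inventory Service has to cope with two things: redelivered messages and messages that never succeed. The sketch below shows one way to handle both, under simplifying assumptions. Processed ids are kept in memory, where a real service would persist them. The retry limit, `deliver`, and `reserve_stock` are hypothetical names for the example.

```python
MAX_ATTEMPTS = 3  # hypothetical retry limit


def deliver(messages, handler, max_attempts=MAX_ATTEMPTS):
    """Apply (message_id, payload) pairs delivered at-least-once.

    Redelivered ids are skipped, and messages that fail on every
    attempt are moved to a dead-letter list instead of blocking the queue.
    """
    seen = set()
    applied = []
    dead_letters = []
    for msg_id, payload in messages:
        if msg_id in seen:
            continue  # duplicate delivery of a message already applied
        for attempt in range(1, max_attempts + 1):
            try:
                handler(payload)
            except ValueError:
                if attempt == max_attempts:
                    dead_letters.append((msg_id, payload))
                continue  # retry, or give up after the last attempt
            seen.add(msg_id)
            applied.append(payload)
            break
    return applied, dead_letters


def reserve_stock(payload):
    """Toy handler: reject orders with a non-positive quantity."""
    if payload["qty"] <= 0:
        raise ValueError("invalid quantity")


if __name__ == "__main__":
    inbox = [(1, {"qty": 2}), (1, {"qty": 2}), (2, {"qty": 0})]
    print(deliver(inbox, reserve_stock))
```

The duplicate of message 1 is applied once, and message 2 ends up in the dead-letter list after three failed attempts.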

D4 — Data Pipeline

The question: How does data flow from raw events to queryable results? What is the latency from event to availability? What is the consistency between pipeline stages?

A data pipeline diagram shows the ingestion layer, processing stages, storage layers, and the latency annotation between each stage. It also shows the schema at each stage boundary and the backfill and reprocessing capability.

This diagram surfaces schema contract violations between pipeline stages and silent data corruption from late-arriving events or incorrect aggregations.

The rule: Always annotate latency at each stage and schema at each stage boundary. A pipeline diagram without these annotations cannot be used to reason about freshness or correctness.

Example — Clickstream to analytics dashboard:

App DB ──1s──▶ CDC (Debezium) ──2s──▶ Kafka (raw) ──30s──▶ Flink (aggregates)
                                                                    │
                                                             ClickHouse ──5s──▶ Grafana

Total event-to-dashboard: ~35–40s

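The annotated hops sum to 38 seconds (1 + 2 + 30 + 5), so a user action takes roughly 35–40 seconds to appear on the dashboard, assuming the unannotated write from Flink to ClickHouse is negligible. If the pipeline is backfilling, this can grow to hours. This staleness is invisible on D1 or D2.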
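The freshness figure is a sum over hops, so it can be recomputed whenever the diagram changes. The sketch below encodes the hops from the example as data. The names `HOPS` and `freshness` are hypothetical. It records the Flink to ClickHouse hop as `None`, because the diagram gives no latency for it, and reports that hop rather than guessing a value.

```python
# (source, destination, lag in seconds). None marks a hop whose
# latency is not annotated on the diagram.
HOPS = [
    ("App DB", "CDC (Debezium)", 1),
    ("CDC (Debezium)", "Kafka (raw)", 2),
    ("Kafka (raw)", "Flink (aggregates)", 30),
    ("Flink (aggregates)", "ClickHouse", None),
    ("ClickHouse", "Grafana", 5),
]


def freshness(hops):
    """Return (sum of annotated lags, list of hops missing an annotation)."""
    known = sum(lag for _, _, lag in hops if lag is not None)
    missing = [f"{src} -> {dst}" for src, dst, lag in hops if lag is None]
    return known, missing


if __name__ == "__main__":
    total, unannotated = freshness(HOPS)
    print(f"annotated event-to-dashboard lag: {total}s")
    for hop in unannotated:
        print(f"missing latency annotation: {hop}")
```

A check like this enforces the rule directly: any hop without a latency shows up as a finding instead of silently counting as zero.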

D5 — Distributed Coordination

The question: Where does the system require multiple nodes to agree on something? What happens when they cannot agree? What is the quorum configuration?

A distributed coordination diagram shows every component that participates in leader election or consensus, the quorum configuration, the fencing or epoch mechanism that prevents split-brain, and the behaviour under partition — which side remains available, which side halts.

This diagram surfaces split-brain (systems that can produce two leaders simultaneously) and SPOF from insufficient quorum (a cluster configured with fewer replicas than needed to tolerate a failure).
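The link between replica count and tolerated failures is simple arithmetic for majority-based quorums. The sketch below assumes a plain majority quorum, as used by Raft. The function names are illustrative.

```python
def quorum_size(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """Nodes that can be lost while a majority is still reachable."""
    return nodes - quorum_size(nodes)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(
            f"{n} nodes: quorum {quorum_size(n)}, "
            f"tolerates {tolerated_failures(n)} failure(s)"
        )
```

A two-node cluster tolerates no failures at all, and a four-node cluster tolerates only one, the same as three nodes. This is why odd cluster sizes are the usual choice.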

The rule: Always show the partition case. A diagram that only shows the healthy case leaves the consistency and availability properties of the system under failure undefined.

Example — 3-node database cluster (Raft):

Normal:
  Node A (Leader, epoch 42)
    ├──heartbeat──▶ Node B (Follower, ep:42)
    └──heartbeat──▶ Node C (Follower, ep:42)

Partition:
  [Node A isolated] ←─ partition ─▶ Node B (new Leader, ep:43) ──▶ Node C (Follower, ep:43)
  Node A steps down.
  B+C form majority (2 of 3). A's writes with epoch 42 are fenced by epoch 43.

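The epoch, which Raft calls a term, acts as a fencing token: once a node has seen epoch 43, it rejects writes stamped with epoch 42, so Node A cannot get writes accepted after the partition heals. Without it, A and B could both accept writes simultaneously, which is split-brain.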
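The fencing check itself is a single comparison. The sketch below is a deliberately simplified illustration, not Raft's actual replication protocol, which also verifies log consistency on every append. The class `FencedLog` is a hypothetical name for a replica that remembers the highest epoch it has seen.

```python
class FencedLog:
    """A replica that rejects writes from a leader with a stale epoch."""

    def __init__(self):
        self.highest_epoch = 0
        self.entries = []

    def write(self, epoch: int, value: str) -> bool:
        """Accept the write only if its epoch is not older than the newest seen."""
        if epoch < self.highest_epoch:
            return False  # stale leader: the write is fenced
        self.highest_epoch = epoch
        self.entries.append((epoch, value))
        return True


if __name__ == "__main__":
    node_c = FencedLog()
    print(node_c.write(42, "from A, before the partition"))
    print(node_c.write(43, "from B, the new leader"))
    print(node_c.write(42, "from A, after the partition heals"))
```

The last write is refused because Node C has already seen epoch 43, which is the behaviour described for the partition case above.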

Which Diagram for Which Problem

If you need to know...                        | Draw this                   | Not this
Where latency comes from                      | D1 Request Flow             | D2, D3, D4 (they do not show the synchronous request path)
What happens if a disk fails                  | D2 Data Storage             | D1 (it shows traffic, not data durability)
What happens mid-message if a service crashes | D3 Event-Driven             | D1 (it shows the synchronous path, not async coupling)
How stale the analytics dashboard can be      | D4 Data Pipeline            | D1, D2 (they do not show pipeline freshness)
What happens during a network partition       | D5 Distributed Coordination | All others (none show the partition failure path)

The D5 diagram is the most commonly omitted and the most commonly needed in post-mortems.

In Practice

An efficient way to audit a system is to draw all five diagrams from whatever documentation exists. A gap in a diagram points to a gap in the documentation or in the design itself.

For a typical SaaS application:

Diagram          | What it reveals
D1 Request Flow  | The API → service → database call chain; whether any external calls are in the critical path
D2 Data Storage  | Which tables are shared between services (hidden coupling); whether the cache has a consistency model
D3 Event-Driven  | The background job queue; whether it has a dead-letter queue and retry policy
D4 Data Pipeline | The analytics pipeline from events to dashboard; the lag between user action and visible metric
D5 Coordination  | Whether the database has a leader-follower setup; what happens on leader failover

None of these five diagrams can answer the other four's questions. All five together reveal the complete failure surface.