The Computing Series

The Twelve Recurring Threads

You are debugging a production incident. The message queue is backing up. Consumers cannot keep up with producers. Six months ago you saw the same pattern: a database connection pool exhausted, queries queuing behind it. A year before that, a thread pool that could not drain fast enough. Three different systems, three different on-call alerts, one structure: work arriving faster than capacity to process it.

If you had named the pattern the first time, you would have recognised all three as the same problem. The solution space is also the same: bound the queue depth, add consumers, or slow the producer. You spent three separate incident nights reinventing the same answer.

The twelve threads are names for the patterns that recur across every layer of the computing stack. Name them, and you stop reinventing. Name them, and you navigate between layers without getting lost.

This is the first chapter where all twelve threads appear explicitly. Each thread is introduced here by name, root form, and cross-stack manifestation. Every subsequent chapter in this book, and every chapter in all seven series books, will reference one or more of these threads by number.

What a Thread Is

A thread is a pattern that appears at multiple levels of abstraction with the same essential shape. The pattern is introduced at its simplest form — usually a mathematical or algorithmic concept — and then re-appears at each layer of the stack with increasing complexity, different failure modes, and higher stakes.

Thread T5, for example, is Caching. Its root form is memoization: store the result of an expensive function call so you do not recompute it. At the algorithm layer it is dynamic programming — systematic memoization to solve overlapping subproblems. At the infrastructure layer it is a Redis cluster with TTL-based invalidation. At the system design layer it is the CDN in front of a video platform, or the precomputed news feed stored per user. The shape is identical at every layer. The complexity and the consequences of getting it wrong scale upward.

The Twelve Threads

# Thread Root Form Infrastructure Form System/Product Form
T1 Hashing Modular arithmetic Consistent hashing; sharding ID generation; deduplication
T2 Trees Hierarchical sorted structure B-tree index; trie File systems; search prefix
T3 Graphs Edges and vertices; traversal Service dependency graph Social graph; routing; crawler
T4 Queues FIFO ordered processing Message queue; event stream Fan-out; async task processing
T5 Caching Store result, avoid recompute Redis; CDN; read-through cache Precomputed feeds; suggestion indexes
T6 Redundancy Copies protect against loss Database replication; RAID Multi-region; active-active
T7 State Machines States with transitions Event sourcing; CQRS Order lifecycle; trip state
T8 Divide & Conquer Split, solve, combine MapReduce; sharding Microservices decomposition
T9 Consensus Agreement under failure Raft; Paxos; quorum reads Distributed transactions; leader election
T10 Encoding Represent information Serialisation; compression API versioning; data schema
T11 Feedback Loops Output feeds back to input Rate limiting; circuit breaker Surge pricing; A/B testing
T12 Tradeoffs Optimise under constraints CAP theorem; SLO design Build vs buy; roadmap decisions

Each Thread in Detail

T1 — Hashing

Root form: A hash function maps a large domain to a small codomain — many keys to fewer buckets. It is a surjection, not a bijection: collisions are inevitable.

Why it recurs: Any problem that requires distributing, deduplicating, or addressing data at scale involves mapping a large key space onto a bounded structure. The hash function is the mechanism.

Cross-stack: Hash table (data structure) → content-addressed storage in Git → consistent hashing for distributed cache node assignment → partition key in Kafka topics → globally unique ID generation.

Consistent hashing uses the same hash function on both keys and nodes, placing them on the same ring. When a node leaves, only its clockwise neighbours absorb its keys. This is not a clever trick — it is a direct consequence of applying the surjection property to both keys and servers simultaneously.


T2 — Trees

Root form: A tree is a graph with no cycles. A binary search tree maintains sorted order through the left-child ≤ parent ≤ right-child invariant, enabling O(log n) search.

Why it recurs: Hierarchical structure and sorted order appear everywhere. File systems, DOM structures, decision logic, and database indexes all have a natural tree shape.

Cross-stack: BST → B-tree (wider nodes to minimise disk reads) → B+ tree (linked leaves for range scans) → every PostgreSQL and MySQL index → trie for prefix matching → vector search using HNSW (hierarchical navigable small world graph).

A B-tree is a BST given permission to have many children per node. The reason is physical: disk reads are expensive. A BST with height 30 needs 30 disk reads to find a value. A B-tree with branching factor 100 needs 2–3. The algorithm did not change; the hardware constraint changed it.


T3 — Graphs

Root form: A graph is a set of vertices and edges. Directed graphs have one-way edges; weighted graphs have edge costs. BFS and DFS are the two fundamental traversal strategies.

Why it recurs: Relationships between things are inherently graph-shaped. Dependencies, routes, social connections, network topology — all are graphs. Any problem involving connectivity, reachability, or ordering is a graph problem.

Cross-stack: BFS → web crawler frontier → dependency resolution → shortest path as network routing (OSPF, BGP) → bipartite matching as marketplace driver assignment → topological sort as build pipeline ordering → social graph at billion-node scale.

Conway’s Law is a graph theory claim: the communication graph of an organisation determines the architecture graph of the system it builds. Teams own modules; module dependencies follow team dependencies. To change the architecture, change the graph.


T4 — Queues

Root form: A queue is FIFO: first in, first out. It decouples the producer of work from the consumer. Work can be produced faster than it can be consumed — the queue absorbs the difference temporarily.

Why it recurs: Decoupling is one of the fundamental architectural goals. Any place where one process produces work and another consumes it — and they do not need to be synchronised — a queue is the right structure.

Cross-stack: FIFO queue (data structure) → BFS traversal (uses a queue for the frontier) → OS process scheduler → message queue (Kafka, SQS) → event streaming → async task processing → fan-out in a notification pipeline.

Kafka is a queue that never discards. Consumers track their own offset. This means multiple consumer groups can read the same message at different speeds — the queue becomes a replayable log. This transforms the queue from a transfer mechanism into an event source of truth.


T5 — Caching / Memoization

Root form: Store the result of an expensive computation indexed by its inputs. On subsequent calls with the same inputs, return the stored result. This is memoization — the mechanism behind dynamic programming.

Why it recurs: Computing the same thing twice is never free. Any system with repeated requests for the same data is a candidate for caching. The challenge is not storage — it is invalidation: deciding when the stored result is no longer valid.

Cross-stack: Memoization in code → DP table in algorithms → Redis cache in infrastructure → CDN edge cache → precomputed recommendation lists → materialized views in databases → CPU L1/L2/L3 cache in hardware.

The reason cache invalidation is hard is not technical — it is logical. A cache stores a value that was correct at a point in time. Validity depends on whether the underlying data has changed, which requires either knowing when it changed (write-invalidation) or accepting that it might be stale (TTL). Both are tradeoffs with no universally correct answer.


T6 — Redundancy

Root form: Keep multiple copies of the same data or capability. If one copy fails, another serves the request. Redundancy converts single points of failure into tolerated failures.

Why it recurs: Hardware fails. Networks partition. Processes crash. Any system that must remain available while individual components fail must have redundancy. The question is never whether to have redundancy but where to place it and how many copies to keep.

Cross-stack: RAID on a disk → database replication (leader-follower, multi-leader, leaderless) → stateless service replicas behind a load balancer → multi-region deployment → redundant network paths in BGP.

Redundancy does not eliminate failure; it changes the failure mode. With one database server, failure is data unavailability. With three servers, failure becomes a consistency question: which copy is authoritative? The Thread T9 (Consensus) exists entirely to answer this question.


T7 — State Machines

Root form: A finite state machine has a defined set of states, a set of inputs (transitions), and rules that determine which state to move to on each input. Only valid transitions are allowed.

Why it recurs: Any system that remembers something between requests is a state machine. Order processing, authentication sessions, database transactions, network connections — all have a finite set of states and defined transitions between them.

Cross-stack: Finite automata (theory) → regex engine (DFA) → TCP connection lifecycle (SYN_SENT, ESTABLISHED, CLOSE_WAIT) → event sourcing (state as accumulated event log) → CQRS (separate write and read state representations) → order lifecycle (PENDING → CONFIRMED → SHIPPED → DELIVERED).

Event sourcing makes the state machine explicit by storing the transitions rather than the current state. The current state is a function of the event log — it can always be reconstructed. This gives you audit, time travel, and the ability to replay events through a corrected state machine. The cost is that projections must be rebuilt when the state machine changes.


T8 — Divide and Conquer

Root form: Split the problem into sub-problems of the same type. Solve each independently. Combine the solutions. Merge sort is the canonical example: split the array, sort each half, merge.

Why it recurs: Problems that cannot fit in one machine must be split across many. Systems that are too large for one team must be decomposed. The same recursive splitting strategy that works for sorting an array works for distributing a dataset across a cluster.

Cross-stack: Merge sort → external merge sort → MapReduce (split → map → shuffle → reduce) → distributed binary search within a shard → microservices as organisational divide and conquer → parallel CI/CD pipelines.

MapReduce is merge sort. The map phase processes each partition independently (like sorting each half of the array). The shuffle phase routes intermediate results (like the split step). The reduce phase combines (like the merge step). At 10,000 machines, it is still the same algorithm.


T9 — Consensus

Root form: Multiple agents must agree on a value despite the possibility that some agents are unavailable or send conflicting messages. This is the distributed agreement problem.

Why it recurs: Any distributed system with replicated state must answer: which copy is correct? Any distributed system making a decision (leader election, transaction commit) must answer: have enough participants agreed? Consensus is the formal mechanism for answering both.

Cross-stack: Proof verification (logic) → quorum reads/writes (R + W > N) → Paxos and Raft consensus protocols → ZooKeeper leader election → two-phase commit for distributed transactions → idempotency keys for payment deduplication.

Raft was designed to be understandable, not just correct. The key simplification: one leader owns all writes. This eliminates the multi-leader conflict problem at the cost of a new SPOF. Every consensus algorithm is trading off something — Raft trades availability (the leader is a bottleneck) for understandability.


T10 — Encoding

Root form: Information can be represented in multiple ways. The choice of representation determines what operations are efficient, what invariants are maintained, and what can be communicated across system boundaries.

Why it recurs: Every serialisation, every API contract, every data schema, every compression algorithm is an encoding decision. The choice of encoding determines how data can be processed, validated, and evolved.

Cross-stack: Binary representation → character encoding (ASCII, UTF-8) → data serialisation (JSON, Protobuf, Avro) → API schema design → database schema → vector embeddings as encoding of semantic meaning.

Hyrum’s Law (F9 #16) is an encoding claim: once a system is in use, all observable behaviours — including the encoding of the response — become part of the implicit contract. You can change the content; you cannot change the structure without breaking callers who depended on a detail you thought was an implementation choice.


T11 — Feedback Loops

Root form: A system whose output feeds back as input to itself. Negative feedback (output reduces input) creates stability. Positive feedback (output increases input) creates amplification — and instability.

Why it recurs: Every control system, every monitoring system, every pricing algorithm, every recommendation engine is a feedback loop. The behaviour of the loop — whether it stabilises, oscillates, or diverges — determines whether the system is reliable or dangerous.

Cross-stack: Recursive algorithms → rate limiting (request rate feeds back to throttle decision) → circuit breaker (error rate feeds back to trip the breaker) → load balancer health checks → auto-scaling policies → surge pricing (demand feeds back to price, which feeds back to demand) → recommendation engine (engagement feeds back to training data).

Goodhart’s Law (F9 #15) is a feedback loop pathology: when a measure becomes a target, it ceases to be a good measure. The system optimises for the metric rather than the underlying objective. The feedback loop is still functioning — it is just optimising the wrong thing.


T12 — Tradeoffs

Root form: Every algorithm choice, every architecture decision, every engineering policy is an optimisation under constraints. There is no free lunch — improving one dimension costs another.

Why it recurs: Because physics is real. You cannot have low latency, high throughput, low cost, and perfect consistency simultaneously. Every system is a specific set of tradeoff choices, made explicitly or by default. The engineer who names the tradeoffs owns the decisions. The engineer who does not name them is surprised by the consequences.

Cross-stack: Algorithm complexity analysis (time vs space) → CAP theorem (consistency vs availability vs partition tolerance) → caching (consistency vs performance) → microservices (coupling vs deployability) → build vs buy (differentiation vs total cost of ownership) → CTO roadmap decisions (speed vs risk).

The CAP theorem is often quoted and rarely understood precisely. It does not say you must choose two of three at all times. It says: during a network partition, you must choose between consistency and availability. When there is no partition — which is most of the time — you can have both. Name the condition, not the conclusion.

Thread Interactions

Threads do not operate in isolation. The most common interactions:

T5 × T7 — Caching and State Machines: caches are state. Caching the result of a stateful computation means the cache must be invalidated whenever the state machine transitions. Forgetting this is how stale data produces wrong answers.

T1 × T6 — Hashing and Redundancy: consistent hashing (T1) is the mechanism that makes distributed replication (T6) tractable. Without it, adding a replica requires remapping all keys. With it, only the adjacent segment migrates.

T9 × T7 — Consensus and State: distributed systems replicate state (T7) across machines. Consensus (T9) is what determines which replica’s state is authoritative when they diverge.

T11 × T12 — Feedback and Tradeoffs: every feedback loop encodes a tradeoff. A circuit breaker trades availability (some requests are refused) for stability (the system does not collapse). A rate limiter trades throughput for fairness. Name the tradeoff or the feedback loop’s behaviour will surprise you.

T1 × T6 — Hashing and Redundancy: consistent hashing with virtual nodes places replicas at deterministic positions on the hash ring. Each key is stored on the next N distinct physical nodes clockwise. Adding or removing a node redistributes only 1/N of the keys — redundancy becomes cheap to rebalance.

T7 × T9 — State Machines and Consensus: distributed state machine replication is the core idea behind Raft. Every replica runs the same state machine; the consensus protocol ensures they apply the same commands in the same order. Agreement on the log is agreement on the state.

T3 × T5 — Graphs and Caching: graph traversals are expensive at scale — computing a social feed requires walking edges, ranking, and filtering. Precomputing and caching the result (the materialised feed) converts a graph query into a key-value lookup. Facebook’s TAO and Twitter’s fanout-on-write are both cached graph traversals.

T4 × T8 — Queues and Divide & Conquer: partitioned message processing is MapReduce applied to streams. Kafka topic partitions split the input (divide); each consumer processes its partition independently (conquer); downstream aggregation combines the results. The queue provides the partitioning boundary.

T2 × T1 — Trees and Hashing: hash-partitioned B-tree indexes distribute a single logical index across shards. The hash function determines which shard holds a given key range; the B-tree within each shard provides sorted access. This is how distributed databases (CockroachDB, Spanner) scale indexing.

T5 × T11 — Caching and Feedback Loops: adaptive cache TTL uses hit-rate feedback to tune expiry. High hit rate → extend TTL (data is popular and stable). Low hit rate → shorten TTL (data is volatile or unpopular). Without this feedback loop, TTLs are static guesses that waste memory or serve stale data.

T7 × T11 — State Machines and Feedback Loops: circuit breakers are state machines driven by feedback. States: CLOSED → OPEN → HALF-OPEN. The transition from CLOSED to OPEN is triggered by error-rate feedback exceeding a threshold. The transition from OPEN to HALF-OPEN is triggered by a timeout. Retry backoff is the same pattern — the retry delay is a function of failure count feedback.

T6 × T9 — Redundancy and Consensus: quorum-based replication ties redundancy to consensus. With N replicas, requiring W writes and R reads where W + R > N guarantees overlap — at least one read will hit an up-to-date replica. The quorum formula is the mathematical bridge between how many copies you keep and how many you must consult.

T3 × T9 — Graphs and Consensus: leader election is a graph problem. Nodes form a communication graph; the consensus protocol selects exactly one leader from the connected component. Network partitions split the graph — each partition may elect its own leader, creating split-brain. The consensus protocol’s job is to prevent this or detect and resolve it.

T10 × T12 — Encoding and Tradeoffs: schema evolution is a tradeoff between forward compatibility and simplicity. Protobuf’s field numbering allows adding fields without breaking old readers (forward compatible) at the cost of never reusing field numbers. Avro’s schema registry allows full evolution at the cost of a coordination service. Every encoding format encodes a bet about how the data will change.

Read in the book →