Load Balancing

Introduction

In 2016, a single misconfigured DNS record took down a major cloud provider’s load balancer for four hours. During that time, every application behind it was unreachable. The load balancer was the indirection layer between the internet and thousands of servers: the one component whose failure made every other component irrelevant.

Load balancing solves a straightforward problem: you have more requests than one server can handle, and you have multiple servers. Distribute the requests. The solution is straightforward; the failure modes are not. A load balancer that distributes requests evenly to servers that are already failing makes things worse, not better. A load balancer that is itself a single point of failure defeats the purpose of having multiple servers.
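To make the first failure mode concrete, here is a minimal sketch of naive round-robin distribution. The server names and the `failing` set are hypothetical, chosen only for illustration. The point is that the rotation is perfectly even and completely unaware of which servers can actually answer.

```python
from collections import Counter
from itertools import cycle


def round_robin(servers, num_requests):
    """Assign each request to the next server in a fixed rotation."""
    rotation = cycle(servers)
    return [next(rotation) for _ in range(num_requests)]


# Hypothetical pool; "app-3" is assumed to be down, but the balancer does not know.
servers = ["app-1", "app-2", "app-3"]
failing = {"app-3"}

assignments = round_robin(servers, 9)
per_server = Counter(assignments)                     # requests sent to each server
lost = sum(1 for s in assignments if s in failing)    # requests sent to a dead server

print(per_server)
print(f"{lost} of {len(assignments)} requests went to a failing server")
```

Under these assumptions the distribution is even, and a third of the traffic is sent to a server that cannot respond. Even distribution alone is not the goal.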

Thread Activation

This chapter activates T11 (Feedback) at the infrastructure level. In Book 1, Chapter 35, feedback loops controlled iterative algorithms. In Book 2, Chapter 19, measurement and adjustment were the basis of benchmarking. A load balancer with health checks is a feedback loop: measure server health → adjust routing → measure again. The chapter also continues T8 (Divide and Conquer): distributing load across N servers is the infrastructure form of dividing a problem into N independent subproblems. In Book 4, the load balancer appears as a standard component in every distributed system design.
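The measure, adjust, measure again loop can be sketched in a few lines. This is an illustrative toy, not a production design: the class name `HealthCheckedBalancer`, the `probe` callable, and the in-memory `status` table are all assumptions made for the example. A real balancer would probe over the network on a timer and usually require several consecutive failures before ejecting a server.

```python
class HealthCheckedBalancer:
    """Round-robin balancer that routes only to servers that passed the last check."""

    def __init__(self, servers, probe):
        self.servers = list(servers)
        self.probe = probe              # hypothetical callable: server -> bool
        self.healthy = list(servers)    # assume everything is healthy at start
        self._next = 0

    def run_health_checks(self):
        # Measure: probe every server, including ones previously marked down,
        # so that a recovered server can rejoin the pool.
        self.healthy = [s for s in self.servers if self.probe(s)]

    def route(self):
        # Adjust: pick the next server from the currently healthy set only.
        if not self.healthy:
            raise RuntimeError("no healthy servers available")
        server = self.healthy[self._next % len(self.healthy)]
        self._next += 1
        return server


# Simulated server state standing in for real network probes.
status = {"app-1": True, "app-2": True, "app-3": True}
lb = HealthCheckedBalancer(status, probe=lambda s: status[s])

before = [lb.route() for _ in range(3)]

status["app-3"] = False      # the server fails
lb.run_health_checks()       # measure
after = [lb.route() for _ in range(4)]   # routing has adjusted

status["app-3"] = True       # the server recovers
lb.run_health_checks()       # measure again
```

Probing every server on each pass, rather than only the healthy ones, is what closes the loop: a server that was removed is measured again and returns to rotation once it passes.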