The Computing Series

Introduction

In 1998, Sergey Brin and Larry Page described the algorithm they used to map the web. It was not sophisticated. It maintained a frontier of URLs to visit, fetched each page, extracted links, and added unvisited links to the frontier. The frontier was a queue. The traversal was breadth-first.

That same year, npm was a decade away from existing. When npm arrived, it needed to determine which packages to install when you added a dependency. It traversed the dependency graph starting from your package, visited each dependency recursively, and detected cycles to prevent infinite loops. The traversal used a stack (implicitly, through recursion). It was depth-first.

The same two graph traversal algorithms — BFS with a queue, DFS with a stack — appear in different forms across every layer of the computing stack. The structure of the problem determines which traversal you need. The engineering around that traversal (politeness delays, cycle detection, visited sets) is what distinguishes a working system from a homework assignment.


Read in the book →