The Computing Series

Group 4 — Statistical Heuristics

These are rough empirical rules — useful as starting points for estimation and design, but requiring validation with actual data.


L15 — Pareto Principle (80/20 Rule)

The Heuristic: Roughly 80% of the effects come from 20% of the causes. For example, 80% of traffic hits 20% of content, 80% of bugs come from 20% of the code, and 80% of the benefit comes from 20% of the features.

The engineering application: Optimise the common case. Cache the 20% of content that receives 80% of requests. Fix the 20% of code that produces 80% of bugs. The returns on optimising the tail (the remaining 80% of items, which account for only about 20% of the effect) are dramatically smaller.
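
As a rough illustration of "cache the hot set", the sketch below finds the smallest group of most-requested keys that covers a target share of requests. The function name `hot_set`, the `target_share` parameter, and the sample request log are assumptions made for this example, not part of any particular caching library.

```python
from collections import Counter


def hot_set(requests, target_share=0.8):
    """Return (keys, share): the most-requested keys needed to cover
    at least `target_share` of all requests, and the share they cover.

    `requests` is any iterable of hashable keys (hypothetical request log).
    """
    counts = Counter(requests)
    total = sum(counts.values())
    if total == 0:
        return [], 0.0

    hot = []
    covered = 0
    # Walk keys from most to least requested until the target is reached.
    for key, n in counts.most_common():
        if covered >= target_share * total:
            break
        hot.append(key)
        covered += n
    return hot, covered / total


# Made-up log: one key receives 8 of 10 requests.
log = ["a"] * 8 + ["b", "c"]
keys, share = hot_set(log)
print(keys, share)
```

If the returned set is a small fraction of all keys, caching it is likely to pay off; if it is most of the keys, the traffic is closer to uniform and the heuristic offers less.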

The calibration: The 80/20 split is a heuristic, not a law. Measure the actual distribution for your system. For some systems the hot/cold split can be as extreme as 99/1 (for example, 99% of video views going to the top 1% of videos). For others it is closer to 60/40. The principle holds; the numbers vary.
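
One way to measure the actual split is to compute what share of total volume comes from the top fraction of items. This is a minimal sketch: `top_share`, its parameters, and the sample counts are hypothetical and should be replaced with real per-item counts (requests per URL, bugs per module, and so on).

```python
import math


def top_share(counts, fraction=0.2):
    """Share of total volume contributed by the top `fraction` of items.

    `counts` is a list of non-negative per-item totals (assumed input).
    """
    ordered = sorted(counts, reverse=True)
    if not ordered:
        return 0.0
    total = sum(ordered)
    if total == 0:
        return 0.0
    # Always consider at least one item, rounding the cutoff up.
    k = max(1, math.ceil(fraction * len(ordered)))
    return sum(ordered[:k]) / total


# Made-up per-item request counts for ten items.
sample = [50, 30, 5, 5, 4, 2, 2, 1, 1, 0]
print(top_share(sample, 0.2))  # share from the top 20% of items
print(top_share(sample, 0.1))  # share from the top 10% of items
```

Running this at several fractions shows whether the system is closer to 99/1, 80/20, or 60/40 before committing to a design built around the hot set.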


L16 — Linus’s Law

The Observation: Given enough eyeballs, all bugs are shallow. (The phrasing is Eric S. Raymond's, named in honour of Linus Torvalds.)

The engineering application: Open review processes tend to find bugs that closed processes miss. Code review with multiple reviewers, public bug reports on open-source software, and architecture reviews with diverse perspectives all apply this principle.

The calibration: This is a tendency, not a guarantee. Some bugs are found only through specialised security research, regardless of how many engineers review the code. The law is most useful as an argument for review and visibility, not as proof of correctness.

