The Computing Series

Real Systems

Google’s testing at scale: Google runs billions of test cases per day across its codebase, and the vast majority are unit tests. The testing infrastructure, TAP (Test Automation Platform), prioritises running only the tests affected by a change, which keeps per-change feedback time below five minutes even in a monorepo with billions of lines of code. The pyramid is enforced structurally: end-to-end tests require explicit justification.
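Selecting only the tests affected by a change can be sketched as a walk over the reverse dependency graph. This is a minimal illustration of the idea, not TAP's implementation; the target names and the shape of the `DEPS` mapping are hypothetical.

```python
from collections import defaultdict, deque


def affected_tests(deps, tests, changed):
    """Return the tests that transitively depend on any changed target.

    deps maps each target to the targets it depends on directly.
    """
    # Invert the graph: for each target, record who depends on it.
    reverse = defaultdict(set)
    for target, direct_deps in deps.items():
        for dep in direct_deps:
            reverse[dep].add(target)

    # Breadth-first walk from the changed targets along reverse edges.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependant in reverse[current]:
            if dependant not in seen:
                seen.add(dependant)
                queue.append(dependant)

    return sorted(seen & set(tests))


# Hypothetical build graph: target -> direct dependencies.
DEPS = {
    "//billing:lib": ["//common:money"],
    "//billing:lib_test": ["//billing:lib"],
    "//search:lib": ["//common:text"],
    "//search:lib_test": ["//search:lib"],
    "//common:money_test": ["//common:money"],
}
TESTS = {"//billing:lib_test", "//search:lib_test", "//common:money_test"}

# A change to //common:money selects the billing and money tests only.
print(affected_tests(DEPS, TESTS, ["//common:money"]))
```

The search test is skipped because nothing on its dependency path changed, which is what keeps feedback time roughly proportional to the size of the change rather than the size of the codebase.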

Pact (consumer-driven contract testing): Pact is an open-source framework for consumer-driven contract testing. Each consumer's tests record its expectations of a producer as a Pact file. The producer (the "provider" in Pact's terminology) verifies its behaviour against those files. A Pact Broker stores the contracts: consumers publish to it and producers retrieve from it. This makes contract verification part of each service’s CI rather than a manual cross-team coordination step.
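The consumer-driven flow can be illustrated without the Pact library itself. The sketch below is plain Python under stated assumptions: the `CONTRACT` structure, the service names, and the `verify` helper are hypothetical and do not reproduce the Pact file format or Pact's API.

```python
# A contract as a consumer might record it (hypothetical shape, not the Pact file format).
CONTRACT = {
    "consumer": "checkout-ui",
    "provider": "orders-api",
    "interactions": [
        {
            "request": {"method": "GET", "path": "/orders/42"},
            "response": {"status": 200, "fields": {"id": int, "total_cents": int}},
        }
    ],
}


def orders_api(method, path):
    """Stand-in for the real provider; returns (status, body)."""
    if method == "GET" and path == "/orders/42":
        return 200, {"id": 42, "total_cents": 1999, "currency": "EUR"}
    return 404, {}


def verify(contract, provider):
    """Replay each recorded request against the provider and list mismatches."""
    failures = []
    for interaction in contract["interactions"]:
        req, expected = interaction["request"], interaction["response"]
        status, body = provider(req["method"], req["path"])
        if status != expected["status"]:
            failures.append(f"{req['path']}: status {status}")
            continue
        for field, kind in expected["fields"].items():
            # Extra fields in the body are fine; missing or retyped ones are not.
            if not isinstance(body.get(field), kind):
                failures.append(f"{req['path']}: field {field!r}")
    return failures


# An empty list means the provider still honours what the consumer relies on.
print(verify(CONTRACT, orders_api))
```

Running `verify` in the provider's CI is the step that turns a breaking rename (for example `total_cents` to `total`) into a failed build instead of a production incident.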

Netflix’s chaos engineering: Netflix tests at the top of the pyramid, in production, with Chaos Monkey, which deliberately terminates random instances to verify that the system tolerates failures. This is an extreme form of end-to-end testing: it exercises the failure modes, not just the success paths, in the actual production environment.
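The principle can be shown with a small in-process simulation. This is only a sketch: Chaos Monkey terminates real instances in production, whereas the `Instance` class, the failover helper, and the instance names here are hypothetical stand-ins.

```python
import random


class Instance:
    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


def call_with_failover(pool, request):
    """Try each instance in turn until one answers."""
    for instance in pool:
        try:
            return instance.handle(request)
        except ConnectionError:
            continue
    raise RuntimeError("no healthy instance left")


def terminate_random_instance(pool, rng):
    """The chaos step: kill one live instance chosen at random."""
    victim = rng.choice([i for i in pool if i.alive])
    victim.alive = False
    return victim


pool = [Instance(f"web-{n}") for n in range(3)]
rng = random.Random(7)  # seeded only so the sketch is repeatable
victim = terminate_random_instance(pool, rng)

# The experiment passes if a request still succeeds after the termination.
reply = call_with_failover(pool, "GET /health")
print(f"terminated {victim.name}; {reply}")
```

The assertion of interest is about the system, not the victim: whichever instance is killed, the request must still be served by a survivor.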

Stripe’s API testing: Stripe provides a test mode that closely mirrors production. API calls in test mode are designed to behave like production calls, except that no real money moves. This is infrastructure-level support for integration testing: the integration test environment is officially maintained and behaves deterministically.
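One practical consequence is that an integration suite should refuse to run against live credentials. Stripe's test-mode secret keys begin with `sk_test_` and live ones with `sk_live_`, so a guard can be sketched as below; the environment variable name `STRIPE_API_KEY` and both helper functions are assumptions for illustration, not part of Stripe's SDK.

```python
import os


def require_test_mode(api_key):
    """Refuse to proceed with anything but a test-mode secret key."""
    if not api_key.startswith("sk_test_"):
        raise RuntimeError("integration tests must use a Stripe test-mode key")
    return api_key


def load_stripe_key(env=None):
    """Read the key from the environment (STRIPE_API_KEY is an assumed name)."""
    env = os.environ if env is None else env
    return require_test_mode(env.get("STRIPE_API_KEY", ""))


# Demonstrated with an explicit mapping and a dummy placeholder key.
print(load_stripe_key({"STRIPE_API_KEY": "sk_test_placeholder"}))
```

Calling `load_stripe_key()` once in the suite's setup makes a misconfigured live key fail fast, before any request is sent.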

Concept: Testing Strategy
Thread: T9 (Consensus) ← Book 4, Ch 9 (Agreement protocols) → Ch 16 (Refactoring Techniques)
Core Idea: The test pyramid (many unit tests, fewer integration tests, fewer contract tests, very few end-to-end tests) optimises for fast feedback and high reliability; the inverted pyramid (more end-to-end than unit tests) produces slow, flaky suites that developers avoid.
Tradeoff: AT9 (Correctness/Performance): lower-layer tests give the same correctness signal as upper-layer tests in a fraction of the time; the total feedback loop speed determines how often tests are run and how much they actually prevent.
Failure Mode: FM8 (Schema/Contract Violation): service interface changes that are not caught by contract tests reach production as compatibility failures; without contract tests, schema changes require manual cross-team coordination.
Signal: When the test suite takes more than 10 minutes to run and developers routinely skip running tests locally, the test pyramid is inverted and end-to-end tests dominate.
Maps to: Reference Book, Framework 9 (Review Questions)
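The pyramid shape and the 10-minute signal can be checked mechanically. The following is a minimal sketch, assuming per-layer test counts and a measured suite duration are available; the layer names, the `pyramid_report` function, and the sample numbers are hypothetical.

```python
LAYERS = ["unit", "integration", "contract", "end_to_end"]  # base of the pyramid first


def pyramid_report(counts, suite_minutes, max_minutes=10):
    """Return warnings for a suite described by per-layer test counts."""
    warnings = []
    # Each layer should hold no more tests than the layer beneath it.
    for lower, upper in zip(LAYERS, LAYERS[1:]):
        if counts.get(upper, 0) > counts.get(lower, 0):
            warnings.append(f"more {upper} tests than {lower} tests")
    if suite_minutes > max_minutes:
        warnings.append(f"suite takes {suite_minutes} min (limit {max_minutes})")
    return warnings


healthy = {"unit": 900, "integration": 80, "contract": 15, "end_to_end": 5}
inverted = {"unit": 40, "integration": 60, "contract": 0, "end_to_end": 300}

print(pyramid_report(healthy, suite_minutes=4))
print(pyramid_report(inverted, suite_minutes=45))
```

The healthy suite produces no warnings, while the inverted one is flagged both for its shape and for its running time.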
