The Computing Series

The 12 Failure Modes — Layer 1 (Recall Triggers)

| #    | Failure Mode                   | Recall Trigger                                                              |
|------|--------------------------------|-----------------------------------------------------------------------------|
| FM1  | Single Point of Failure        | One component’s failure brings down the system                              |
| FM2  | Cascading Failures             | One failure triggers the next, producing full outage                        |
| FM3  | Unbounded Resource Consumption | Memory, connections, threads consumed without limit                         |
| FM4  | Data Consistency Failure       | Different components disagree on the state of the world                     |
| FM5  | Latency Amplification          | Many small latencies multiply into an unacceptable total                    |
| FM6  | Hotspotting                    | One node receives disproportionate traffic; degrades or fails               |
| FM7  | Thundering Herd                | Many clients simultaneously retry, overwhelming the recovering system       |
| FM8  | Schema / Contract Violation    | One side of a boundary changes; the other side breaks                       |
| FM9  | Silent Data Corruption         | Incorrect data propagates without triggering alerts                         |
| FM10 | Security Breach                | Unauthorised access to data or compute resources                            |
| FM11 | Observability Blindness        | System is failing but the team cannot see where or why                      |
| FM12 | Split-Brain                    | Two nodes each believe they are the primary, producing conflicting writes   |
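
As a worked illustration of the FM5 trigger, the arithmetic behind latency amplification can be sketched in a few lines of Python. This is a minimal sketch and not taken from the book: the function names, the 20 ms hop latency, the 100-backend fan-out, and the 1% slow-response rate are all hypothetical values chosen for the example, and the fan-out formula assumes backends are slow independently of one another.

```python
def chain_latency_ms(hop_latencies_ms):
    """Total latency when each hop is called one after another (latencies add)."""
    return sum(hop_latencies_ms)


def p_any_slow(n_backends, p_slow_each):
    """Probability that at least one of n independent backends responds slowly."""
    return 1.0 - (1.0 - p_slow_each) ** n_backends


# Hypothetical request path: ten sequential calls at 20 ms each.
hops = [20] * 10
total = chain_latency_ms(hops)

# Hypothetical fan-out: 100 backends, each slow 1% of the time.
fanout = p_any_slow(100, 0.01)

print(total, round(fanout, 3))  # prints "200 0.634"
```

Under these assumptions, ten individually fast hops already cost 200 ms, and a request that waits on 100 backends hits at least one slow response about 63% of the time, even though each backend is slow only 1% of the time.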
