WhatsApp processes roughly 100 billion messages per day across 2 billion users. Each message should reach an online recipient within seconds of being sent, and must still arrive once the recipient reconnects when their device is offline, in a country with intermittent connectivity, or running on a battery-constrained phone. The system must maintain persistent connections for billions of devices simultaneously, route each message to the correct connection, and guarantee delivery without duplicates.
Real-time chat sits at the intersection of several distributed systems challenges: persistent connection management, message ordering, delivery guarantees, and presence detection. Each problem looks tractable in isolation. The difficulty is that they interact. Enforcing strict ordering can hold back messages that have already arrived while the system waits for an earlier one. Guaranteeing delivery means retrying, and retries create duplicates. The production architecture is a careful balance between these tensions.
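To make the interaction concrete, here is a minimal sketch of how retries, duplicates, and ordering collide on the receiving side. It is an illustration under stated assumptions, not a description of WhatsApp's actual protocol: the `Message` fields, the `Receiver` class, and the `send_with_retry` helper are all hypothetical names, and the lost acknowledgement is simulated rather than caused by a real network.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    msg_id: str  # hypothetical: unique ID generated by the sender
    seq: int     # hypothetical: per-conversation sequence number
    body: str


@dataclass
class Receiver:
    """Deduplicates by msg_id and releases messages in seq order."""
    next_seq: int = 1
    seen: set = field(default_factory=set)
    pending: dict = field(default_factory=dict)
    delivered: list = field(default_factory=list)

    def receive(self, msg: Message) -> bool:
        """Always return True (an ack) so the sender can stop retrying."""
        if msg.msg_id in self.seen:
            return True  # duplicate from a retry: ack again, do not redeliver
        self.seen.add(msg.msg_id)
        self.pending[msg.seq] = msg
        # Release the longest contiguous run starting at next_seq.
        # Anything after a gap stays buffered until the gap is filled.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq).body)
            self.next_seq += 1
        return True


def send_with_retry(receiver, msg, ack_lost_on_attempts=(), max_attempts=3):
    """At-least-once send: retransmit until an ack is observed.

    ack_lost_on_attempts simulates the network dropping the ack on the
    given attempt numbers; the message itself still reaches the receiver.
    """
    for attempt in range(1, max_attempts + 1):
        acked = receiver.receive(msg)
        if acked and attempt not in ack_lost_on_attempts:
            return attempt
    raise TimeoutError(f"no ack for {msg.msg_id}")


rx = Receiver()
# Message 2 arrives first and is held back until message 1 shows up.
send_with_retry(rx, Message("b", 2, "world"))
held_back = list(rx.delivered)  # still empty at this point
# The ack for message 1 is lost once, so the sender transmits it twice.
attempts = send_with_retry(rx, Message("a", 1, "hello"), ack_lost_on_attempts={1})
print(attempts, rx.delivered)  # 2 ['hello', 'world']
```

The retry loop is what provides the delivery guarantee, and it is also the only source of duplicates here; the `seen` set absorbs them. The ordering buffer, in turn, delays a message that has already arrived. A real system would also need to bound the size of `seen` and `pending`, which this sketch leaves out.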