The Computing Series

Exercises

Level 2 — Apply

  1. A batch processing system uses a Redis Redlock to ensure only one worker processes each job. Workers sometimes pause for up to 8 seconds during garbage collection. The current TTL is 5 seconds. Describe two separate failure scenarios caused by this TTL, and explain how fencing tokens would change the outcome of each scenario.
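For the fencing-token part of this exercise, the core mechanism can be sketched in a few lines of Python (the class and names here are illustrative, not from any particular lock library): the storage layer remembers the highest fencing token it has seen and rejects any write carrying an older one, so a holder that was paused past its TTL cannot clobber a newer holder's work.

```python
class FencedStore:
    """Storage that rejects writes carrying a stale fencing token.

    Hypothetical sketch: the lock service hands out a monotonically
    increasing token with each grant; the store tracks the highest
    token seen and rejects anything older.
    """
    def __init__(self):
        self.highest_token = 0
        self.data = {}

    def write(self, token, key, value):
        if token < self.highest_token:
            return False  # stale holder: its lock expired during the pause
        self.highest_token = token
        self.data[key] = value
        return True

store = FencedStore()
# Worker A holds the lock with token 33, then pauses 8 s for GC.
# The 5 s TTL expires; worker B acquires with token 34 and writes.
assert store.write(34, "job-7", "done-by-B")
# Worker A wakes up and tries to write with its stale token.
assert not store.write(33, "job-7", "done-by-A")
```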

  2. A system uses ZooKeeper ephemeral nodes for distributed locking. The ZooKeeper session timeout is set to 2 seconds. A network partition isolates a lock holder for 3 seconds before recovering. Walk through what happens to the lock during and after the partition, and identify which process holds the lock after recovery.

  3. A database has two types of writes: high-frequency low-conflict user profile updates (thousands of users, each user’s record updated independently) and low-frequency high-conflict inventory reservation (hundreds of workers competing for the same stock records). Choose between CAS and distributed locks for each workload and justify the choice in terms of contention and retry cost.
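The retry cost that exercise 3 asks about can be made concrete with a small Python sketch (an in-process dict and a `threading.Lock` stand in for the database's atomic compare-and-swap; all names are illustrative). Under low contention the loop almost always succeeds on the first pass; under high contention most passes are wasted work.

```python
import threading

def cas_update(record, expected_version, new_value):
    """Hypothetical compare-and-swap: succeeds only if the record's
    version is unchanged since the caller read it."""
    with record["_guard"]:  # stands in for the database's atomic CAS
        if record["version"] != expected_version:
            return False
        record["value"] = new_value
        record["version"] += 1
        return True

def update_with_retry(record, transform, max_retries=10):
    """Optimistic retry loop: cheap for low-conflict profile updates,
    wasteful for high-conflict inventory reservations."""
    for _ in range(max_retries):
        snapshot_version = record["version"]
        new_value = transform(record["value"])
        if cas_update(record, snapshot_version, new_value):
            return True
    return False  # contention too high: every retry lost the race

profile = {"value": {"name": "old"}, "version": 0,
           "_guard": threading.Lock()}
assert update_with_retry(profile, lambda v: {"name": "new"})
assert profile["version"] == 1
```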

Level 3 — Design

  1. Design a distributed lock service for a payment processing system that cannot tolerate duplicate charges. The system runs on five geographically distributed data centres. Define the lock acquisition protocol, the fencing token mechanism, the TTL strategy, and what the application does when lock acquisition fails. Identify the residual failure mode that your design does not eliminate.

A complete answer will: (1) specify a lock acquisition protocol that requires a quorum of 3 of the 5 data centres (Redlock or equivalent) and issues a fencing token that increments monotonically on each acquisition, (2) identify FM12 (a network partition causes the lock holder to lose quorum while a second client acquires the lock) as the residual failure mode the design cannot fully eliminate, (3) address the AT1 tradeoff between a short TTL (faster recovery when the holder fails, but higher risk of the lock expiring during a slow operation) and a long TTL (the holder can finish slow operations, but recovery from a failed holder is slow), and (4) specify what the application does when lock acquisition fails (return an error to the caller rather than waiting or retrying indefinitely) and justify this as the correct behaviour for a payment system.
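The quorum-plus-fencing shape of such a design can be sketched in Python (illustrative only: a real implementation would contact five lock nodes over the network and persist the token counter durably, rather than counting replies and incrementing an in-memory integer):

```python
import time

class QuorumLock:
    """Sketch of quorum-based acquisition over 5 lock nodes.
    Names and structure are illustrative, not a real Redlock client."""
    def __init__(self, nodes=5, quorum=3, ttl=5.0):
        self.nodes, self.quorum, self.ttl = nodes, quorum, ttl
        self.token = 0  # monotonically increasing fencing token

    def acquire(self, reachable_nodes):
        """Grant the lock only if a majority of nodes answered;
        on success, hand out the next fencing token."""
        if reachable_nodes < self.quorum:
            return None  # fail fast: the payment caller gets an error
        self.token += 1
        return {"token": self.token,
                "expires": time.monotonic() + self.ttl}

svc = QuorumLock()
grant = svc.acquire(reachable_nodes=4)
assert grant is not None and grant["token"] == 1
# During a partition only 2 of 5 data centres answer: acquisition
# fails, and the application returns an error rather than retrying
# indefinitely against a possibly split cluster.
assert svc.acquire(reachable_nodes=2) is None
```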

  2. A job queue has one million pending jobs distributed across one hundred workers. Workers acquire a per-job lock before processing. At peak, 80% of workers are waiting for locks simultaneously. Design a partitioning strategy that eliminates or reduces lock contention. Explain what invariant your design preserves in place of the lock, and what new failure mode it introduces.

A complete answer will: (1) identify the correct partitioning strategy — assign jobs to workers by hashing job_id to worker_id, so each worker owns a disjoint partition and no lock is needed, (2) identify FM1 (single point of failure — if a worker dies, its partition stalls until reassigned) as the new failure mode the partitioning strategy introduces in place of the lock, (3) address the AT5 tradeoff between centralised locking (all workers can process any job, higher contention) and partitioned ownership (no contention, worker failure blocks a partition), and (4) specify the invariant the design preserves — exactly-once processing per job — and describe the rebalancing protocol when a worker joins or leaves.
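The hash-based assignment can be sketched in Python (illustrative; a production system would use consistent hashing so that a membership change reshuffles only a fraction of the partitions rather than nearly all of them):

```python
import hashlib

def owner_of(job_id, workers):
    """Assign each job to exactly one worker by hashing job_id, so
    workers own disjoint partitions and need no per-job lock."""
    h = int(hashlib.sha256(job_id.encode()).hexdigest(), 16)
    return workers[h % len(workers)]

workers = [f"worker-{i}" for i in range(100)]
jobs = [f"job-{i}" for i in range(1000)]
assignment = {j: owner_of(j, workers) for j in jobs}
# Deterministic, disjoint ownership: a job always maps to one worker.
assert all(owner_of(j, workers) == assignment[j] for j in jobs)
# The new failure mode: if worker-0 dies, every job in its partition
# stalls until the partition is reassigned by the rebalancing protocol.
stalled = [j for j, w in assignment.items() if w == "worker-0"]
```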

  3. You are building a distributed rate limiter that enforces a per-user limit of 100 requests per minute across a cluster of 50 API servers. The naive approach uses a distributed lock per user per minute window. Analyse why this design has prohibitive latency under load, then design a token bucket implementation using Redis atomic operations (without a distributed lock) that provides approximate rate limiting with bounded error. Define what “approximate” means in terms of the percentage of requests that could exceed the limit in the worst case.

A complete answer will: (1) correctly explain why the per-user distributed lock creates a serialisation bottleneck (each request must acquire and release a lock on a shared resource, limiting throughput to one request per lock round-trip per user) and quantify the latency impact at high concurrency, (2) design the Redis token bucket as a Lua script invoked via EVALSHA, which atomically reads the bucket count and decrements it in a single round-trip, with the approximation error bounded by the number of API servers that can race before the next Redis sync, (3) identify FM6 (a single Redis shard serving all user buckets becomes a hotspot for high-traffic users) as the failure mode, and (4) define “approximate” precisely: up to N×limit requests may be accepted in a window, where N is the number of API servers in the worst-case race, before the bucket is synchronised.
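The atomic bucket logic can be sketched in Python (in production this function's body would be a Redis Lua script invoked via EVALSHA, so the read-refill-decrement sequence runs as one atomic step on the Redis server; the rate and capacity below match the exercise's 100 requests per minute):

```python
def try_consume(bucket, now, rate=100 / 60.0, capacity=100):
    """Token-bucket check done as a single step: refill based on
    elapsed time, then consume one token if any remain. Atomicity is
    assumed; Redis provides it by running the whole body in Lua."""
    elapsed = now - bucket["last_refill"]
    bucket["tokens"] = min(capacity, bucket["tokens"] + elapsed * rate)
    bucket["last_refill"] = now
    if bucket["tokens"] >= 1:
        bucket["tokens"] -= 1
        return True
    return False

bucket = {"tokens": 100.0, "last_refill": 0.0}
# A burst of 150 requests at t=0: exactly 100 are accepted.
accepted = sum(try_consume(bucket, now=0.0) for _ in range(150))
assert accepted == 100
# 30 seconds later roughly 50 tokens have refilled.
assert try_consume(bucket, now=30.0)
```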
