The Computing Series

Exercises

Level 1 — Understand

  1. State the four steps of the method introduced in this chapter. For each step, explain what it produces.

  2. A colleague says “the database is slow.” Using the vocabulary of this chapter, what is missing from that statement? What four questions would you ask to make it precise?

  3. What is the difference between an algorithm’s complexity class and its constant factor? Give one example where the constant factor matters more than the complexity class.

Level 2 — Apply

A rate limiter tracks how many requests a given API key has made in the last 60 seconds. It must check the count and increment it on every request. The system currently uses a Redis sorted set, storing each request as a member with its Unix timestamp as the score. To check the rate limit, it queries the count of members in the score range [now - 60, now].
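Before applying the method, it can help to see the sorted-set scheme concretely. The sketch below is a minimal in-memory analogue of it in Python, not actual Redis code: a sorted list of timestamps per key stands in for the sorted set, and the comments note the corresponding Redis commands. All class and method names here are illustrative.

```python
import time
from bisect import bisect_left, insort

class SlidingWindowLimiter:
    """In-memory analogue of the Redis sorted-set scheme: one sorted
    list of request timestamps per API key."""

    def __init__(self, limit, window=60.0):
        self.limit = limit
        self.window = window
        self.requests = {}  # api_key -> sorted list of timestamps

    def allow(self, api_key, now=None):
        now = time.time() if now is None else now
        stamps = self.requests.setdefault(api_key, [])
        # Drop timestamps older than the window (ZREMRANGEBYSCORE in Redis).
        cutoff = bisect_left(stamps, now - self.window)
        del stamps[:cutoff]
        # Count members in the score range [now - 60, now] (ZCOUNT in Redis).
        if len(stamps) >= self.limit:
            return False
        insort(stamps, now)  # record this request (ZADD in Redis)
        return True
```

Passing an explicit `now` makes the behavior easy to reason about in exercises; a real deployment would rely on the server clock.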

  1. Apply the four-step method to this system. Name the data structure, algorithm, and complexity class.

  2. Redis sorted sets use a skip list. What is the time complexity of the range query? How does it compare to the same query on a simpler data structure, such as a plain unsorted list?

  3. An engineer proposes replacing the sorted set with a fixed-size circular buffer of 60 counters — one per second — and incrementing the appropriate bucket on each request. Apply the four-step method to this alternative. What is the complexity? What has changed?
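To make the alternative in exercise 3 concrete, here is one possible sketch of the circular buffer of 60 per-second counters. The names are illustrative, and each bucket records which second it currently holds so stale buckets can be reused lazily rather than cleared by a timer.

```python
import time

class BucketLimiter:
    """Fixed ring of 60 one-second counters: the proposed
    replacement for the sorted set."""

    def __init__(self, limit):
        self.limit = limit
        self.counts = [0] * 60     # one bucket per second of the window
        self.stamps = [None] * 60  # which second each bucket currently holds

    def allow(self, now=None):
        now = int(time.time() if now is None else now)
        i = now % 60
        if self.stamps[i] != now:  # bucket holds an old second: reuse it
            self.stamps[i] = now
            self.counts[i] = 0
        # Sum only buckets belonging to the last 60 seconds.
        # Always 60 buckets, so this is O(1) regardless of request rate.
        total = sum(c for c, s in zip(self.counts, self.stamps)
                    if s is not None and now - s < 60)
        if total >= self.limit:
            return False
        self.counts[i] += 1
        return True
```

Note what the bucket design gives up: the window now advances in one-second steps rather than continuously, which is the kind of change the four-step method should surface.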

Level 3 — Design

A search autocomplete system must, given a prefix string typed by the user, return the top 10 matching completions from a corpus of 50 million product names. The system must respond in under 20 milliseconds at the 99th percentile.

  1. Apply the four-step method. What is the bottleneck operation? What data structure would you choose? What algorithm? What is the complexity?

  2. A trie (prefix tree) is O(k) for prefix lookup, where k is the length of the prefix. A hash map of all possible prefixes is O(1) for lookup but O(total_prefixes) in space. Which would you choose? Name the tradeoff explicitly using AT framework notation.

  3. The corpus is updated 1,000 times per minute (new products added, old products removed). How does this affect your data structure choice? What is the additional complexity introduced by updates?

A complete answer will:

  1. Identify the correct data structure for prefix lookup (a trie) and justify it over a hash map using AT2 (Space/Time tradeoff), with specific memory estimates for 50 million product names.

  2. Name at least one failure mode under a high update rate (FM9 — incorrect outputs from stale trie state during concurrent writes).

  3. Address the AT3 tradeoff between snapshot-based trie rebuilds (simpler, but higher latency for freshness) and incremental updates (lower latency, but higher concurrency complexity).

  4. Propose a concrete update pipeline (e.g., a shadow trie with an atomic swap) that maintains the 20 ms P99 guarantee during updates.
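One common way to meet the latency budget is to spend memory on precomputed results: each trie node caches its own top completions, so a query is an O(k) descent followed by a constant-time read. The sketch below illustrates that idea; the class names, the per-node cache, and the score-based ranking are assumptions for illustration, not the chapter's prescribed implementation, and it omits the update pipeline entirely.

```python
class TrieNode:
    __slots__ = ("children", "top")
    def __init__(self):
        self.children = {}
        self.top = []  # best (score, name) completions under this node

class AutocompleteTrie:
    """Prefix lookup in O(k) for a prefix of length k, with top-k
    completions precomputed per node: the AT2 space-for-time trade."""

    def __init__(self, k=10):
        self.root = TrieNode()
        self.k = k

    def insert(self, name, score):
        node = self.root
        for ch in name:
            node = node.children.setdefault(ch, TrieNode())
            node.top.append((score, name))
            node.top.sort(key=lambda t: -t[0])  # keep only the k best
            del node.top[self.k:]

    def complete(self, prefix):
        node = self.root
        for ch in prefix:  # O(k) walk, k = len(prefix)
            node = node.children.get(ch)
            if node is None:
                return []
        return [name for _, name in node.top]  # O(1): already ranked
```

The per-node caches are exactly what makes concurrent updates hard: an insert touches every node along the word's path, which is why the rubric points toward a shadow trie with an atomic swap rather than in-place mutation.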
