URL Shortener System Design: The Gotchas

The interviewer says: "Design a URL shortener." It sounds trivial. Store a long URL, hand back a short code, redirect when someone visits it. Most engineers reach for the marker and start drawing boxes — a server, a database, a load balancer. Twenty minutes later the interviewer asks what happens at a billion redirects a day, and the boxes stop making sense.

A URL shortener is the first system every engineer gets wrong because it looks like a CRUD app and behaves like a distributed system. The failure is not drawing the wrong boxes. It is drawing boxes before understanding what the system must do, at what scale, under which failures, and how it will change. Here is the disciplined approach — five phases, in order.

Phase 1: Clarify Requirements

Separate functional requirements (what it does) from non-functional (how well).

Functional: Given a long URL, return a short code. Given a short code, redirect to the long URL. Optionally, count clicks.

Non-functional: Redirect latency under 50ms at P99. 99.99% availability — a shortener that is down breaks every link that was ever created with it. Links never expire (or expire on a stated policy). Short codes are not guessable if privacy matters.

Write both lists down before drawing anything. The non-functional list is what turns this from a CRUD app into a system design problem.

Phase 2: Estimate Scale

Back-of-envelope estimation converts the product into engineering constraints.

New URLs:        100M / month  → ~40 writes/sec
Read:write ratio:  100:1       (links are read far more than created)
Redirects:       ~4,000 reads/sec, peaking ~10,000/sec
Storage/row:     long URL + code + metadata ≈ 500 bytes
5-year storage:  100M × 12 × 5 × 500 B ≈ 3 TB

These numbers force decisions. 10,000 reads/sec cannot be served politely by a single database under 50ms P99. 3 TB fits on one large disk today — but a hot read path at that volume must not touch disk on every request. The estimate eliminates the naive single-database design before a box is drawn.

Phase 3: The Decision That Defines the System — Generating the Short Code

Everything interesting about a URL shortener is in one decision: how do you produce the short code? There are three candidates, and picking wrong is how engineers get this system wrong.

Option A — Hash the URL (e.g. MD5, take first 7 chars). Deterministic and stateless. But hashing distinct URLs into a 7-character space will collide, and a collision means one URL silently redirects to another's destination — FM9, silent data corruption. Handling collisions requires a read-before-write to check, which destroys the "stateless" benefit.

Option B — Random code, check for collision. Generate 7 random characters, check the database, retry on collision. Correct, but every write now does a read first, and as the keyspace fills, retries climb.

Option C — A global counter, base-62 encoded. Maintain a monotonic counter. Each new URL takes the next integer; encode it in base-62 (a–z A–Z 0–9) to get a short string. No collisions, ever — each integer is unique by construction. Counter value 1,000,000,000 encodes to just 6 characters.

counter = 125          → base62 → "cb"
counter = 1,000,000    → base62 → "4c92"
counter = 1,000,000,000 → base62 → "15ftgG"

Option C is the right default — but it relocates the hard problem. A single global counter is now a single point of failure (FM1) and a write bottleneck (FM6, hotspotting): every URL creation in the world contends on one integer.

The fix is AT5 — Centralisation vs Distribution. Do not hand out one integer at a time from a central counter. Hand out ranges. Each application server requests a block of 1,000 IDs, serves them locally, and returns for another block when it runs low. The central counter is touched once per thousand writes, not once per write. The hotspot dissolves.

Central allocator
   │  "you get 1,000,000–1,000,999"
   ▼
App server A  → serves 1,000 codes with zero coordination
App server B  → holds range 1,001,000–1,001,999

Phase 4: The Read Path — Where the Traffic Actually Is

The read:write ratio is 100:1. The redirect path, not the creation path, is the system. A redirect is a single lookup: short code → long URL. That maps cleanly to the Read-Heavy Store archetype, and the answer is a cache.

This is AT4 — Precomputation vs On-Demand. Keep a cache (Redis, or in-process) mapping codes to URLs. The mapping is immutable once created — a short code's destination never changes — which means the cache never goes stale and needs no invalidation logic. Cold-start is the only cost: a freshly started cache misses until it warms. With a 99% hit rate, the database sees 100 reads/sec instead of 10,000.

The accompanying tradeoff is AT1 — Consistency vs Availability — but only for click counts. The redirect itself must be correct. The click counter can be eventually consistent: increment it asynchronously, tolerate a few seconds of lag, and the redirect never blocks on a write. Nobody's link breaks because the click count was 5 seconds behind.

Phase 5: How It Fails, and How It Evolves

Name the failure modes before declaring the design done:

FM1 — Single Point of Failure. The ID allocator and the database both need redundancy. Range-based allocation helps: if the allocator is briefly down, every app server keeps serving from its local range.
FM6 — Hotspotting. Solved in Phase 3 by handing out ranges instead of single IDs.
FM11 — Observability Blindness. You must be able to answer "what is the redirect P99 right now" and "what is the cache hit rate" without a deploy. Build the metrics in from the start.

At 10× scale, the bottleneck moves to the database read replicas; at 100×, the URL table gets sharded by short code. The design does not change — the five phases still apply. Scale only changes which solutions are on the table.

The One Sentence

A URL shortener is easy to draw and hard to design, because its entire difficulty hides in two decisions — how you generate a collision-free code without a central bottleneck, and how you serve a 100:1 read load without touching disk. Name those two tradeoffs (AT5 and AT4), and the boxes finally make sense.