Step 1: Separate Functional from Non-Functional Requirements
Functional requirements describe what the system does. Non-functional requirements describe how well.
Functional (URL shortener):
- User submits a long URL, receives a short code
- User submits a short code, is redirected to the long URL
- User can view click statistics for their URLs
Non-functional:
- Redirect latency: p99 < 20ms
- Availability: 99.99% (about 52.6 minutes/year of downtime)
- Read QPS: 10,000 redirects/sec at peak
- Write QPS: 100 new URLs/sec
- Data retention: 5 years
- Consistency: eventual acceptable for click counts
Non-functional requirements are where architecture decisions live. A 20ms p99 latency target at 10,000 QPS strongly implies caching: a redirect path that queries the database on every request is unlikely to hold that tail latency at peak load. The 99.99% availability target implies redundancy and no single points of failure.
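The downtime figure attached to an availability target can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a 365-day year; the function name downtime_minutes_per_year is illustrative and not part of any library.

```python
# Yearly downtime budget implied by an availability target.
# Assumes a 365-day year (leap years ignored).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the allowed downtime in minutes per year."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction


for target in (99.0, 99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(target)
    print(f"{target}% availability allows {minutes:.1f} min/year of downtime")
```

Each additional nine cuts the budget by a factor of ten, which is why the step from 99.9% to 99.99% usually changes the architecture rather than just the operations process.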
Step 2: Back-of-Envelope Estimation
Estimation relies on four standard conversions and reference figures:
Time: 1 day = 86,400 seconds ≈ 10^5
Storage: 1 char ≈ 1 byte; 1 KB = 10^3 bytes; 1 MB = 10^6; 1 GB = 10^9; 1 TB = 10^12
Network: typical server NIC = 1–10 Gbps = 125 MB/s – 1.25 GB/s
Memory: typical server RAM = 64–256 GB
QPS formula (daily average): events_per_day / 86,400
Storage formula: records_per_day × record_size × retention_days
Bandwidth formula: QPS × average_response_size
Worked example for a messaging system with 100M DAU:
Write QPS:
100M users × 10 messages/day / 86,400 ≈ 11,600 writes/sec ≈ 12,000
Read QPS (read:write ratio = 10:1):
12,000 × 10 = 120,000 reads/sec
Storage per message: 1 KB (text + metadata)
Daily storage: 12,000 × 86,400 × 1 KB = 1,036,800,000 KB ≈ 1 TB/day
5-year storage: 1 TB × 365 × 5 = 1,825 TB ≈ 1.8 PB
Bandwidth (reads): 120,000 reads/sec × 1 KB = 120 MB/s
These numbers immediately constrain the architecture. 1.8 PB rules out a single server. 120,000 reads/sec rules out an uncached database. 12,000 writes/sec rules out synchronous write acknowledgement on anything slower than an SSD-backed system.
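The three formulas and the messaging example can be reproduced in a short script. This is a sketch under the same assumptions as the worked example (100M DAU, 10 messages per user per day, 10:1 read:write ratio, 1 KB per message, 5-year retention); the function and variable names are illustrative. It uses the unrounded write QPS, so the bandwidth comes out slightly below the rounded 120 MB/s.

```python
# Back-of-envelope estimation for the messaging example.
SECONDS_PER_DAY = 86_400
KB, MB, TB, PB = 10**3, 10**6, 10**12, 10**15


def average_qps(events_per_day: float) -> float:
    """Average queries per second over a full day."""
    return events_per_day / SECONDS_PER_DAY


def storage_bytes(records_per_day: float, record_size: float, retention_days: float) -> float:
    """Total bytes stored over the retention window."""
    return records_per_day * record_size * retention_days


def bandwidth_bytes_per_sec(qps: float, average_response_size: float) -> float:
    """Outbound bytes per second at the given QPS."""
    return qps * average_response_size


# Inputs taken from the worked example.
dau = 100_000_000
messages_per_user_per_day = 10
read_write_ratio = 10
message_size = 1 * KB
retention_days = 365 * 5

messages_per_day = dau * messages_per_user_per_day
write_qps = average_qps(messages_per_day)
read_qps = write_qps * read_write_ratio
daily_storage = storage_bytes(messages_per_day, message_size, 1)
total_storage = storage_bytes(messages_per_day, message_size, retention_days)
read_bandwidth = bandwidth_bytes_per_sec(read_qps, message_size)

print(f"Write QPS:       {write_qps:,.0f}")
print(f"Read QPS:        {read_qps:,.0f}")
print(f"Daily storage:   {daily_storage / TB:.2f} TB")
print(f"5-year storage:  {total_storage / PB:.3f} PB")
print(f"Read bandwidth:  {read_bandwidth / MB:.1f} MB/s")
```

Rounding early, as the worked example does, is deliberate: the goal is the order of magnitude, and 12,000 is easier to carry through mental arithmetic than 11,574.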
Step 3: Characterise the Read/Write Profile
The read/write ratio determines the primary architectural pattern.
Read-heavy (ratio > 10:1): caching is essential. Read replicas reduce load on the primary. CDN handles static or slowly-changing data. The design optimises for fast reads and tolerates slightly higher write latency.
Write-heavy (ratio < 2:1): buffered write paths matter. Async queues absorb write spikes. Append-only logs are cheaper than random-write databases. The design optimises for write throughput and may batch or delay reads.
Balanced (ratio 2:1 to 10:1): no single optimisation dominates. OLTP databases handle this well up to a point. Horizontal sharding is the first evolution step.
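The three profiles can be expressed as a small classifier. This is a sketch using the thresholds stated above; the function name classify_profile is hypothetical, and a ratio of exactly 10:1 or 2:1 is treated as balanced, as the ranges imply.

```python
def classify_profile(read_qps: float, write_qps: float) -> str:
    """Classify a workload by its read:write ratio."""
    if write_qps <= 0:
        raise ValueError("write_qps must be positive")
    ratio = read_qps / write_qps
    if ratio > 10:
        return "read-heavy"
    if ratio < 2:
        return "write-heavy"
    return "balanced"


# URL shortener from Step 1: 10,000 reads/sec, 100 writes/sec (100:1).
print(classify_profile(10_000, 100))
# Messaging example from Step 2: 120,000 reads/sec, 12,000 writes/sec (10:1).
print(classify_profile(120_000, 12_000))
```

The messaging example sits exactly on the 10:1 boundary, which is a useful reminder that these thresholds are rules of thumb rather than hard cut-offs.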
Step 4: Determine Consistency Requirements
Consistency requirements determine which storage and replication strategies are admissible.
Strong consistency: every read sees the most recent write. Required for financial transactions, inventory counts, authentication state. Forces synchronous replication. Rules out multi-region active-active without coordination.
Eventual consistency: reads may briefly see stale data, but all replicas converge to the latest value once replication catches up. Acceptable for social feeds, analytics, recommendation scores, notification badges. Enables async replication, higher availability, lower write latency.
Strong consistency → synchronous replication → higher write latency
Eventual consistency → async replication → potential stale reads
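The difference between the two modes can be illustrated with a toy in-memory model. This is a deliberately simplified sketch, not a real replication protocol: the ReplicatedStore class and its methods are hypothetical, there is one primary and one replica, and network delay is modelled by a pending list that is applied only when replicate() is called.

```python
class ReplicatedStore:
    """Toy primary/replica pair illustrating sync vs. async replication."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.primary = {}
        self.replica = {}
        self.pending = []  # writes acknowledged but not yet on the replica

    def write(self, key, value):
        self.primary[key] = value
        if self.synchronous:
            # Acknowledge only after the replica has the write (slower ack).
            self.replica[key] = value
        else:
            # Acknowledge immediately; the replica catches up later.
            self.pending.append((key, value))

    def replicate(self):
        """Apply all pending writes to the replica."""
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()

    def read_from_replica(self, key):
        return self.replica.get(key)


sync_store = ReplicatedStore(synchronous=True)
sync_store.write("balance", 100)
print(sync_store.read_from_replica("balance"))   # replica already has the write

async_store = ReplicatedStore(synchronous=False)
async_store.write("likes", 7)
print(async_store.read_from_replica("likes"))    # stale: replica has not caught up
async_store.replicate()
print(async_store.read_from_replica("likes"))    # converged after replication
```

In a real system the synchronous path pays a network round trip on every write, which is the higher write latency in the first arrow; the asynchronous path defers that cost and exposes the stale-read window in the second.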
Step 5: Map to Architecture Constraints
From the characterised requirements, derive constraints:
If p99 read latency < 50ms AND read QPS ≥ 10,000 → caching required
If write QPS > 5,000 → consider write buffering or async ingest
If storage > 10 TB → sharding or distributed storage required
If availability ≥ 99.99% → no SPOF, redundancy at every layer
If strong consistency required → cannot use async replication
These constraints eliminate architecture classes and leave a narrow space of viable designs.
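The rules above can be encoded as a simple function that turns a requirements record into a list of constraints. This is a sketch only: the Requirements dataclass and derive_constraints function are hypothetical names, and the read-QPS and availability thresholds are treated as inclusive here, on the assumption that the URL shortener's targets from Step 1 (exactly 10,000 QPS and 99.99%) are meant to trigger them. The storage figure in the usage example is an assumed value, since Step 1 does not state one.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    p99_read_latency_ms: float   # target p99 read latency
    read_qps: float              # peak reads per second
    write_qps: float             # peak writes per second
    storage_tb: float            # total storage over the retention window
    availability_percent: float  # e.g. 99.99
    strong_consistency: bool


def derive_constraints(req: Requirements) -> list[str]:
    """Apply the Step 5 rules and return the constraints that fire."""
    constraints = []
    if req.p99_read_latency_ms < 50 and req.read_qps >= 10_000:
        constraints.append("caching required")
    if req.write_qps > 5_000:
        constraints.append("consider write buffering or async ingest")
    if req.storage_tb > 10:
        constraints.append("sharding or distributed storage required")
    if req.availability_percent >= 99.99:
        constraints.append("no SPOF, redundancy at every layer")
    if req.strong_consistency:
        constraints.append("cannot use async replication")
    return constraints


# URL shortener from Step 1; storage_tb is an assumed illustrative value.
url_shortener = Requirements(
    p99_read_latency_ms=20,
    read_qps=10_000,
    write_qps=100,
    storage_tb=8.0,
    availability_percent=99.99,
    strong_consistency=False,
)
for constraint in derive_constraints(url_shortener):
    print(constraint)
```

Keeping the rules in one place like this makes it easy to see which requirement drives which constraint, and to revisit the thresholds when the estimates change.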