The naive approach is to skip estimation and design for “scale” in the abstract — adding a load balancer, a caching layer, and read replicas because “that’s what scalable systems look like.” This produces an architecture that looks sophisticated but may be wildly over- or under-engineered for the actual load.
Over-engineering at low scale adds operational complexity that slows the team down. Under-engineering at high scale produces systems that fall over under real load. A URL shortener does not need a distributed Cassandra cluster at ten thousand daily active users. A social network with one hundred million DAU cannot survive on a single relational database.
The second failure: ignoring the read/write ratio. A system that is 95% reads requires a completely different architecture from one that is 95% writes. Read-heavy systems benefit from caching and read replicas. Write-heavy systems benefit from async ingest, write buffers, and append-only logs. Treating both the same produces a design that serves neither well.