A naive approach runs the full ranking model on every item for each request. If the catalogue has 10M items and the model takes 1 ms per item, a single recommendation request requires 10,000 seconds of sequential compute, roughly 2.8 hours. That is orders of magnitude beyond any interactive latency budget, so this design fails immediately.
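
The arithmetic behind that figure can be checked with a short sketch. The function name `full_scan_seconds` is hypothetical and chosen only for illustration; the inputs are the 10M items and 1 ms per item quoted above, and the calculation assumes items are scored one after another with no batching or parallelism.

```python
def full_scan_seconds(num_items: int, ms_per_item: float) -> float:
    """Sequential compute time, in seconds, to score every item once."""
    return num_items * ms_per_item / 1000.0


# Figures from the text: 10M items, 1 ms of model time per item.
seconds = full_scan_seconds(10_000_000, 1.0)
hours = seconds / 3600

print(f"{seconds:,.0f} s (about {hours:.1f} h) per request")
# prints "10,000 s (about 2.8 h) per request"
```

Batching or parallel hardware would shrink the wall-clock time, but the total compute per request stays the same, which is why the fix has to be scoring fewer items rather than scoring them faster.
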
The second naive approach is pure collaborative filtering at query time: compute the cosine similarity between the requesting user and every other user, find the most similar users, and return the items those users liked. With 100M users, every request triggers 100M similarity computations, a full scan of the user base that cannot finish within a millisecond-scale latency budget.
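
A minimal sketch of this query-time approach on toy data makes the cost visible. Everything here is illustrative: the `ratings` dictionary, the function names, and the choice to weight each neighbour's ratings by similarity are assumptions, not details from the text.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse rating vectors (dicts)."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_user_cf(target, ratings, k_neighbors=2, top_n=3):
    me = ratings[target]
    # One similarity computation per other user, on every request.
    # This loop is the part that does not scale to 100M users.
    sims = [(cosine(me, other), uid)
            for uid, other in ratings.items() if uid != target]
    neighbors = sorted(sims, reverse=True)[:k_neighbors]

    # Score unseen items by similarity-weighted neighbour ratings
    # (an assumed weighting scheme, one of several reasonable choices).
    scores = defaultdict(float)
    for sim, uid in neighbors:
        for item, rating in ratings[uid].items():
            if item not in me:
                scores[item] += sim * rating
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical toy data: user -> {item: rating}.
ratings = {
    "alice": {"a": 5, "b": 3},
    "bob": {"a": 5, "b": 3, "c": 4},
    "carol": {"a": 1, "d": 5},
    "dave": {"e": 2},
}
print(recommend_user_cf("alice", ratings))  # prints ['c', 'd']
```

The work per request grows linearly with the number of users, so the same loop that is instant on four users becomes the bottleneck at production scale.
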
The production solution is a multi-stage pipeline: candidate generation narrows 10M items to 1,000; scoring ranks those 1,000 items; re-ranking applies business rules. Each stage is more expensive per item and more accurate than the one before it, which is affordable only because each stage sees far fewer items.
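
The three stages can be sketched as follows. This is a toy illustration under stated assumptions, not a production design: the `Item` class, both scoring functions, and the "drop out-of-stock items" business rule are hypothetical. In particular, stage 1 here is a brute-force dot product standing in for whatever cheap retrieval method a real system uses (for example, an approximate nearest-neighbour index).

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: int
    embedding: tuple
    in_stock: bool = True


def cheap_score(user_vec, item):
    """Stage 1 signal: a dot product, cheap enough to apply very widely."""
    return sum(u * x for u, x in zip(user_vec, item.embedding))


def expensive_score(user_vec, item):
    """Stand-in for the full ranking model (negative squared distance)."""
    return -sum((u - x) ** 2 for u, x in zip(user_vec, item.embedding))


def recommend(user_vec, catalogue, num_candidates, top_n):
    # Stage 1: candidate generation keeps only the best cheap matches.
    candidates = heapq.nlargest(
        num_candidates, catalogue, key=lambda it: cheap_score(user_vec, it)
    )
    # Stage 2: the expensive model runs on the short candidate list only.
    ranked = sorted(
        candidates, key=lambda it: expensive_score(user_vec, it), reverse=True
    )
    # Stage 3: re-ranking applies business rules (here: hide out-of-stock).
    allowed = [it for it in ranked if it.in_stock]
    return [it.item_id for it in allowed[:top_n]]


# Hypothetical five-item catalogue and user vector.
catalogue = [
    Item(1, (1.0, 0.0)),
    Item(2, (0.9, 0.1), in_stock=False),
    Item(3, (0.0, 1.0)),
    Item(4, (0.8, 0.3)),
    Item(5, (2.0, 0.0)),
]
user = (1.0, 0.0)
print(recommend(user, catalogue, num_candidates=3, top_n=2))  # prints [1, 5]
```

In this toy run the cheap stage keeps items 5, 1, and 2; the expensive stage reorders them to 1, 2, 5; and the business rule removes item 2. With the figures from the text, the expensive model would run 1,000 times per request instead of 10M times.
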