Start With Scope and Success Metrics
Open by clarifying the problem and naming measurable targets. Ask for expected QPS, latency percentiles to optimize (p95 vs p99), data size, read/write mix, and availability needs. Convert fuzzy goals into a minimal set of service level objectives you can design to.
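For instance, an availability SLO converts directly into an error budget you can design against. A minimal sketch, with illustrative numbers rather than targets from a real prompt:

```python
# Back-of-the-envelope SLO math: convert an availability target into an
# error budget (allowed downtime per period). Inputs are example values.

def error_budget(availability_slo: float, period_minutes: int = 30 * 24 * 60) -> float:
    """Minutes of allowed downtime per 30-day period for a given SLO."""
    return period_minutes * (1.0 - availability_slo)

# A 99.9% monthly availability SLO allows roughly 43 minutes of downtime;
# 99.99% shrinks that to about 4 minutes, which changes the design.
print(round(error_budget(0.999), 1))    # → 43.2
print(round(error_budget(0.9999), 2))   # → 4.32
```

Stating this math out loud shows the interviewer why "four nines" demands multi-region failover while "three nines" may not.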
Assumption‑First Communication
- State assumptions explicitly and confirm them with the interviewer.
- Call out constraints that drive design: multi‑region, strict consistency, or hard latency limits.
- Offer two viable options and pick a default to keep momentum.
Back‑of‑the‑Envelope Estimation
Size the system before picking components. Estimate request rates, data footprints, cache hit ratios, and storage growth. Use round numbers and keep math visible so trade‑offs are anchored to scale.
Quick Capacity Anchors
- Throughput: peak QPS, burst factor, producer/consumer rates.
- Storage: object sizes, daily writes, and retention horizons.
- Caching: skewed read distribution, hot keys, and invalidation triggers.
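The anchors above can be kept visible as a few lines of arithmetic. A sketch with assumed example inputs (users, rates, sizes, and hit ratio are all placeholders, not requirements from a real prompt):

```python
# Capacity math made explicit; every input below is an assumed round number.
daily_active_users = 10_000_000
reads_per_user_per_day = 50
writes_per_user_per_day = 5
seconds_per_day = 86_400
burst_factor = 3                   # peak QPS as a multiple of the daily average

avg_read_qps = daily_active_users * reads_per_user_per_day / seconds_per_day
peak_read_qps = avg_read_qps * burst_factor

record_size_bytes = 2_000          # assumed average metadata record
retention_days = 365
storage_bytes = (daily_active_users * writes_per_user_per_day
                 * record_size_bytes * retention_days)

cache_hit_ratio = 0.9              # assumed skew toward hot keys
db_read_qps = peak_read_qps * (1 - cache_hit_ratio)

print(f"avg read QPS ~{avg_read_qps:,.0f}, peak ~{peak_read_qps:,.0f}")
print(f"storage ~{storage_bytes / 1e12:.1f} TB/yr, DB reads ~{db_read_qps:,.0f} QPS")
```

Keeping the numbers round (~6K average read QPS, ~36 TB/year here) makes it easy for the interviewer to follow and challenge individual assumptions.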
High‑Level Architecture First
Sketch client, API gateway, services, durable queues, and storage tiers. Identify where to add caches, which data needs strong consistency, and which paths can be asynchronous to improve tail latency.
Component Checklist
- API contracts with pagination, idempotency, and clear error modes.
- Data modeling with primary indexes and query patterns in mind.
- Sharding and replication strategy, plus failover expectations.
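Idempotency from the checklist above is worth being able to sketch on demand. A minimal illustration using a client-supplied idempotency key; the in-memory dict stands in for a database table with a unique constraint on the key, and `PaymentsAPI` is a hypothetical service:

```python
# Idempotent write handling: a retried request with the same key returns
# the stored response instead of performing the side effect twice.
class PaymentsAPI:
    def __init__(self):
        self._results = {}          # idempotency_key -> original response

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        if idempotency_key in self._results:
            return self._results[idempotency_key]   # replay, not re-execute
        response = {"status": "charged", "amount_cents": amount_cents}
        self._results[idempotency_key] = response
        return response

api = PaymentsAPI()
first = api.charge("req-123", 500)
retry = api.charge("req-123", 500)   # network retry reuses the same key
assert first is retry                 # no double charge
```

In a real system the key-to-response mapping needs a TTL and must be written atomically with the side effect, which is a good trade-off to mention unprompted.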
Trade‑Offs the Interviewer Expects
Use precise language: availability, consistency, throughput, replication lag, and quorum. Explain how your choices affect user impact and cost. If you pick availability under partition, define conflict resolution; if you prioritize consistency, describe leader election and quorum sizing.
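Quorum sizing in particular reduces to one inequality you should be able to state and defend. A sketch, assuming a leaderless N-replica store with tunable read and write quorums:

```python
# With N replicas, read quorum R, and write quorum W, every read sees the
# latest acknowledged write when the quorums must overlap: R + W > N.
def is_strongly_consistent(n: int, r: int, w: int) -> bool:
    """True if every read quorum intersects every write quorum."""
    return r + w > n

# Common configurations for N = 3:
print(is_strongly_consistent(3, 2, 2))  # majority reads and writes → True
print(is_strongly_consistent(3, 1, 1))  # fast, but only eventually consistent → False
print(is_strongly_consistent(3, 3, 1))  # cheap writes, expensive reads → True
```

Tying each configuration to user impact (lower write latency vs. the chance of stale reads) is exactly the trade-off language interviewers listen for.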
Operational Readiness
- Health checks, golden signals (latency, traffic, errors, saturation), and alert thresholds.
- Backpressure, retries with jitter, and circuit breakers for downstream safety.
- Blue‑green or canary rollouts and a rollback plan.
Deep‑Dive Paths
Be ready to zoom in. For storage, discuss write‑ahead logging, indexes, and hot partitions. For queues, cover delivery semantics and consumer scaling. For rate limiting, outline token bucket configs and multi‑tenant fairness.
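For the rate-limiting path, a token bucket is small enough to sketch live. A minimal single-process version with assumed parameters; per-tenant buckets give a crude form of multi-tenant fairness, since a noisy neighbor exhausts only its own budget:

```python
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate = rate            # tokens refilled per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        # Lazily refill based on elapsed time, then spend if possible.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

# One bucket per tenant; tenant-a may burst to 5, then refills at 10/s.
buckets = {"tenant-a": TokenBucket(rate=10, capacity=5)}
allowed = [buckets["tenant-a"].allow() for _ in range(7)]
print(allowed.count(True))   # the first 5 requests pass, then the burst is spent
```

In the interview, extend this to the distributed case: a shared store (or local buckets plus a global coordinator) is needed once limits span multiple gateway instances.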
Example Prompts to Practice
- Design a photo service: upload pipeline, metadata store, CDN, and thumbnails.
- Design a social feed: write fan‑out vs read fan‑in, backfill, and cache warming.
- Design a chat system: presence, ordering, and mobile sync under flaky networks.
Close With a Roadmap
End with an MVP and iterations: baseline read path, async writes, basic monitoring; then add sharding, background processing, and multi‑region. This shows ownership and an understanding of operational stages.
Practice under a timer. Use consistent templates for diagrams and trade‑off language so your thinking stays clear when the stakes are high.
