Scaling
Techniques to handle more traffic and data while keeping reliability, latency, and cost under control.
Vertical scaling (scale up)
Bigger machine: more CPU/RAM/IO. Simple, but it runs into hard hardware limits and leaves the single node as a single point of failure.
Horizontal scaling (scale out)
More machines behind a load balancer. Adds complexity: coordination, network partitions, and consistency.
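As a rough illustration of "more machines behind a load balancer", here is a minimal round-robin sketch. The class name `RoundRobinBalancer`, its methods, and the backend labels are hypothetical and chosen only for this example; a production balancer would also handle health checks, timeouts, and connection draining.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backends in rotation, skipping any marked unhealthy."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        # Stop routing to a backend that failed a health check.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        # Resume routing once a known backend recovers.
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Look at each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")
```

Even this toy version has to track which nodes are alive, which hints at the coordination cost that scaling out introduces.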
Caching
Reduce repeated work. Fast reads, but you must manage invalidation and staleness.
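The trade-off between fast reads and staleness can be sketched with a small time-to-live (TTL) cache. This is an illustrative sketch, not a recommended implementation: `TTLCache`, `loader`, and the injectable `clock` parameter are names assumed for this example.

```python
import time


class TTLCache:
    """Caches loader results for ttl_seconds; entries may be stale within that window."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (value, expires_at)

    def get(self, key, loader):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit: skip the expensive work
        value = loader(key)  # cache miss or expired: redo the work
        self._store[key] = (value, now + self.ttl)
        return value

    def invalidate(self, key):
        # Explicit invalidation, e.g. right after the underlying data is written.
        self._store.pop(key, None)
```

A longer TTL means fewer loads but a longer window in which readers can see outdated values; calling `invalidate` on writes narrows that window at the cost of extra bookkeeping.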
Partitioning/sharding
Split data by key to distribute load. Requires careful key design and a plan for rebalancing and cross-shard queries.
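To make the rebalancing concern concrete, the sketch below routes keys with a simple hash-modulo scheme and measures how many keys change shards when the shard count changes. The function names `shard_for` and `moved_fraction` are hypothetical, and hash-modulo is only one possible routing scheme, used here because it is the simplest to show.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (not Python's salted hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def moved_fraction(keys, old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys
        if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)
```

With a uniform hash, going from 4 to 5 shards under plain modulo relocates about 80% of keys in expectation, which is why schemes such as consistent hashing or fixed virtual partitions are often considered when shards need to be added.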
Rules of Thumb
Start simple
Scale up first when you can; introduce distributed complexity only when needed.
Measure, then optimize
Use SLOs and profiling to find the real bottleneck: CPU, DB, network, locks, or tail latency.
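As a small example of measuring before optimizing, the sketch below computes a nearest-rank percentile over latency samples and checks it against an SLO threshold. The helper names `percentile` and `meets_slo`, and the sample numbers in the usage note, are assumptions made up for illustration.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def meets_slo(samples, p, threshold_ms):
    """True if the p-th percentile latency is within the threshold."""
    return percentile(samples, p) <= threshold_ms
```

For example, with 98 requests at 10 ms plus one at 500 ms and one at 900 ms, the mean is 23.8 ms and the median is 10 ms, but the 99th percentile is 500 ms. An average alone would hide the tail that a p99 target exposes.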