Load Balancing
Distributing network traffic across multiple servers to ensure no single server becomes overwhelmed, improving availability and performance.
Definition
Load balancers distribute incoming requests across multiple servers (a server pool). They monitor server health, route traffic to available instances, and remove failed servers from rotation—ensuring applications remain responsive under varying loads.
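To make the health-check and rotation idea concrete, here is a minimal sketch in Python. It is not the implementation of any particular load balancer; the names (Server, ServerPool, check_health) and the simple TCP connect probe are assumptions for illustration only.

```python
# Minimal sketch: a server pool that drops unhealthy backends from rotation.
# All names here are illustrative, not tied to any specific product.
import socket
from dataclasses import dataclass

@dataclass
class Server:
    host: str
    port: int
    healthy: bool = True

class ServerPool:
    def __init__(self, servers):
        self.servers = list(servers)

    def check_health(self, timeout=1.0):
        # Mark each server up or down based on a simple TCP connect probe.
        for server in self.servers:
            try:
                with socket.create_connection((server.host, server.port), timeout=timeout):
                    server.healthy = True
            except OSError:
                server.healthy = False  # failed probe: remove from rotation

    def in_rotation(self):
        # Only healthy servers are eligible to receive traffic.
        return [s for s in self.servers if s.healthy]
```

Real load balancers typically probe an application-level endpoint (for example an HTTP health route) rather than a bare TCP connect, but the rotation logic is the same: failed servers stop receiving requests until they pass checks again.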
Algorithms vary: round-robin cycles through servers in sequence, least connections routes each request to the server with the fewest active requests, and weighted distribution sends proportionally more traffic to more capable servers.
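As a rough illustration of these three strategies, the selection step might look like the sketch below. The active-connection counts and weights are bookkeeping a balancer would maintain per backend; the field and function names are hypothetical.

```python
# Illustrative selection strategies over the healthy servers in a pool.
import itertools
import random

class RoundRobin:
    """Cycle through servers in order."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

def least_connections(servers, active_counts):
    """Route to the server currently handling the fewest requests."""
    return min(servers, key=lambda s: active_counts[s])

def weighted_choice(servers, weights):
    """Send proportionally more traffic to more capable servers."""
    return random.choices(servers, weights=[weights[s] for s in servers], k=1)[0]

if __name__ == "__main__":
    servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
    rr = RoundRobin(servers)
    print([rr.pick() for _ in range(4)])  # cycles: .1, .2, .3, .1
    print(least_connections(servers, {"10.0.0.1": 7, "10.0.0.2": 2, "10.0.0.3": 5}))
    print(weighted_choice(servers, {"10.0.0.1": 5, "10.0.0.2": 1, "10.0.0.3": 1}))
```

Which strategy fits depends on the workload: round-robin assumes roughly uniform request cost, least connections adapts to uneven request durations, and weighting accounts for heterogeneous server capacity.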
Why It Matters
Load balancing is fundamental to scalable architecture. Without it, applications are limited to single-server capacity and have no redundancy: any failure means a complete outage.
Modern applications are designed with load balancing in mind. It enables horizontal scaling (adding servers) rather than vertical scaling (moving to bigger servers), which is typically more cost-effective and more resilient.
Examples in Practice
Netflix handles billions of daily requests by load balancing across thousands of servers globally. Their custom Zuul gateway routes traffic while providing filtering and monitoring.
A startup adds a load balancer when its single server hits capacity limits. Traffic now distributes across three servers, tripling capacity and providing redundancy if one fails.