Load Balancing Interview Questions
February 26, 2026 • By Surya Singh • System Design • Load Balancing • Scalability • Interview
Load balancing interview questions — round-robin, least connections, consistent hashing.
Key Takeaways
- Load balancers distribute traffic across multiple servers to improve throughput and availability.
- Layer 4 (TCP) balances by IP/port; Layer 7 (HTTP) can route by path, headers, or cookies.
- Algorithms: Round Robin, Least Connections, Weighted, IP Hash, Consistent Hashing.
- Health checks remove unhealthy servers; session affinity (sticky sessions) may be needed for stateful apps.
The questions below are commonly asked in technical interviews. Each answer is written to help you understand the concept clearly and explain it confidently. Focus on understanding the "why" behind each answer—that is what interviewers care about.
Interview Questions & Answers
What is load balancing and why is it needed?
A load balancer sits in front of multiple servers and distributes incoming requests among them. It increases capacity (multiple servers handle more traffic than one), improves availability (if one server fails, the others take over), and can reduce latency by routing to the closest or least busy server. Without a load balancer, a single server is both a bottleneck and a single point of failure. Load balancers come as hardware appliances (F5, Citrix), software (Nginx, HAProxy, Envoy), or managed cloud services (AWS ALB).
What is the difference between Layer 4 and Layer 7 load balancing?
Layer 4 (L4) operates at the transport layer (TCP/UDP). The load balancer sees only IP addresses and ports, and it forwards packets without inspecting the payload. It is fast and simple but cannot route based on URL, headers, or cookies. Layer 7 (L7) operates at the application layer (HTTP). It can inspect the request and route based on path (/api vs /static), host header, or cookies. This enables more sophisticated routing (e.g., send /api to API servers, /images to CDN) and TLS (SSL) termination at the load balancer. L7 has higher overhead than L4 because the balancer must parse each request before it can choose a backend.
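To make the L4 vs L7 distinction concrete, here is a minimal Python sketch. It is an illustration only, not how any particular product is implemented: the pool names, the placeholder addresses, and the functions `choose_pool_l7` and `choose_backend_l4` are all hypothetical.

```python
import zlib
from dataclasses import dataclass

# Hypothetical backend pools; the addresses are placeholders.
API_POOL = ["10.0.1.10:8080", "10.0.1.11:8080"]
STATIC_POOL = ["10.0.2.10:8080"]
DEFAULT_POOL = ["10.0.3.10:8080"]


@dataclass
class HttpRequest:
    path: str
    host: str = ""


def choose_pool_l7(request: HttpRequest) -> list:
    """Layer 7: the HTTP request has been parsed, so routing can use the path."""
    if request.path.startswith("/api/"):
        return API_POOL
    if request.path.startswith("/static/"):
        return STATIC_POOL
    return DEFAULT_POOL


def choose_backend_l4(client_ip: str, client_port: int, pool: list) -> str:
    """Layer 4: only addresses and ports are visible, so hash them to pick a backend."""
    flow = f"{client_ip}:{client_port}".encode()
    return pool[zlib.crc32(flow) % len(pool)]


if __name__ == "__main__":
    print(choose_pool_l7(HttpRequest(path="/api/users")))
    print(choose_backend_l4("203.0.113.7", 51234, API_POOL))
```

The point to bring out in an interview: the L4 function never sees a path or a header, so it cannot tell an API call from an image request. The L7 function can, but only after paying the cost of parsing the request.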
When would I use Round Robin vs Least Connections?
Round Robin sends each request to the next server in sequence. It is simple and works well when all servers have similar capacity and request processing time. Least Connections sends each request to the server with the fewest active connections. Use it when requests have varying duration—for example, some API calls finish in 10 ms and others take 5 seconds. Round Robin could overload a server that is still processing long requests; Least Connections naturally balances by current load. For very short, uniform requests, both are similar.
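The difference is easy to see in code. The sketch below is a simplified, single-threaded illustration; the class names and the `acquire`/`release` methods are hypothetical, and a real balancer would also need locking and health awareness.

```python
import itertools


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation, ignoring how busy each one is."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Tracks in-flight requests and picks the server with the fewest."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Ties go to the server listed first (dicts keep insertion order).
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when the request finishes.
        self.active[server] -= 1


if __name__ == "__main__":
    lc = LeastConnectionsBalancer(["a", "b"])
    slow = lc.acquire()   # long request lands on "a" and stays open
    fast = lc.acquire()   # short request lands on "b"
    lc.release(fast)      # the short request finishes
    print(lc.acquire())   # "b" again, because "a" is still busy
```

In the usage example, Round Robin would have sent the third request to "a" even though it is still working on the slow one. Least Connections avoids that because it reacts to current load rather than position in a sequence.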
What is consistent hashing and why is it used in load balancers?
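Consistent hashing maps both servers and requests onto a ring of hash values. Each server gets one or more points on the ring (virtual nodes). A request is routed to the first server point clockwise from its own hash. When a server is added or removed, only the keys that map to that server's points are remapped, roughly 1/N of all keys with N servers, and every other key stays where it was. This minimizes disruption when the set of servers changes, which is important for caching (you do not want all cache keys to shuffle when one node goes down). Load balancers use it to keep sending the same client or cache key to the same backend. It is also used by Memcached client libraries and CDNs. Redis Cluster solves the same problem with a related but different scheme: a fixed set of 16,384 hash slots assigned to nodes.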
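A small ring is a common whiteboard exercise. The following Python sketch is one way to write it, under stated assumptions: the `HashRing` class, the `vnodes` default of 100, and the choice of MD5 as the hash are illustrative and not taken from any specific library.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # MD5 is used only for its even spread of values, not for security.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, servers=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []   # sorted hash positions on the ring
        self._owner = {}    # position -> server that owns it
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.vnodes):
            point = _hash(f"{server}#{i}")
            if point in self._owner:
                continue  # extremely unlikely collision; keep the existing owner
            self._owner[point] = server
            bisect.insort(self._points, point)

    def remove(self, server):
        for i in range(self.vnodes):
            point = _hash(f"{server}#{i}")
            if self._owner.get(point) == server:
                del self._owner[point]
                self._points.remove(point)

    def lookup(self, key: str) -> str:
        if not self._points:
            raise LookupError("ring is empty")
        # First point clockwise from the key's hash, wrapping past the end.
        index = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[index]]


if __name__ == "__main__":
    ring = HashRing(["a", "b", "c"])
    keys = [f"user:{i}" for i in range(1000)]
    before = {k: ring.lookup(k) for k in keys}
    ring.remove("c")
    moved = sum(1 for k in keys if ring.lookup(k) != before[k])
    print(f"{moved} of {len(keys)} keys moved after removing one server")
```

Only the keys that lived on the removed server move. Compare that with `hash(key) % number_of_servers`, where changing the server count remaps most keys.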
How do health checks work in load balancers?
The load balancer periodically sends a request (HTTP, TCP, or custom) to each backend server. If the server responds successfully within a timeout, it is considered healthy. If it fails or times out, it is marked unhealthy and removed from the pool. Health checks can be active (load balancer probes the server) or passive (observe real request success/failure). Typical settings: check every 5–10 seconds, mark unhealthy after 2–3 consecutive failures, mark healthy after 2–3 consecutive successes. This prevents traffic from going to crashed or overloaded servers.
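The consecutive-failure and consecutive-success thresholds can be modeled as a small state machine. This is a minimal sketch, assuming a TCP connect probe; `BackendHealth`, `tcp_probe`, and the default thresholds (3 failures, 2 successes) are hypothetical names and values chosen for illustration.

```python
import socket


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Active check: the backend passes if a TCP connection opens in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class BackendHealth:
    """Flips state only after enough consecutive probe results disagree with it."""

    def __init__(self, fail_threshold=3, rise_threshold=2):
        self.fail_threshold = fail_threshold
        self.rise_threshold = rise_threshold
        self.healthy = True
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, probe_ok: bool) -> bool:
        if probe_ok == self.healthy:
            self._streak = 0  # result agrees with the current state; reset
            return self.healthy
        self._streak += 1
        threshold = self.rise_threshold if probe_ok else self.fail_threshold
        if self._streak >= threshold:
            self.healthy = probe_ok
            self._streak = 0
        return self.healthy


if __name__ == "__main__":
    backend = BackendHealth()
    for result in [False, False, False, True, True]:
        print(result, "->", backend.record(result))
```

Requiring several results in a row before changing state keeps one dropped probe from ejecting a good server, and keeps a server that is still flapping from rejoining the pool too early.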
Surya Singh
Azure Solutions Architect & AI Engineer
Microsoft-certified Azure Solutions Architect with 8+ years in enterprise software, cloud architecture, and AI/ML deployment. I build production AI systems and write about what actually works—based on shipping code, not theory.
- Microsoft Certified: Azure Solutions Architect Expert
- Built 20+ production AI/ML pipelines on Azure
- 8+ years in .NET, C#, and cloud-native architecture