Elastic Load Balancing and Auto Scaling groups
Elastic Load Balancing
Load balancers are servers that forward traffic to multiple servers downstream.
Why use a load balancer
- Spread load across multiple downstream instances
- Expose a single point of access to your application
- Seamlessly handle failures of downstream instances
- Perform regular health checks on your instances
- Provide SSL termination (HTTPS) for your website
- Enforce stickiness with cookies
- High availability across zones
- Separate public traffic from private traffic
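To make the first few points concrete, here is a minimal sketch of a round-robin balancer that skips instances marked unhealthy. The class name `RoundRobinBalancer` and the instance IDs are hypothetical, and a real ELB uses more sophisticated routing and runs its own health checks; this only illustrates the idea.

```python
class RoundRobinBalancer:
    """Toy load balancer: rotates over backends and skips unhealthy ones."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0  # index of the next backend to try

    def mark_unhealthy(self, backend):
        # In a real ELB this would be the result of a failed health check.
        self.healthy.discard(backend)

    def mark_healthy(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def route(self):
        # Try each backend at most once, starting from the rotation pointer.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backends")


lb = RoundRobinBalancer(["i-a", "i-b", "i-c"])  # hypothetical instance IDs
first_pass = [lb.route() for _ in range(3)]     # all three instances take turns
lb.mark_unhealthy("i-b")
second_pass = [lb.route() for _ in range(4)]    # i-b no longer receives traffic
```

Clients only ever talk to `lb`, so the failure of `i-b` is invisible to them.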
Types of load balancer on AWS
AWS provides four kinds of managed load balancers:
- Classic load balancer (v1 - old generation)
  - HTTP, HTTPS, TCP, SSL
- Application load balancer (v2 - new generation) - Layer 7
  - HTTP, HTTPS, gRPC, WebSocket
- Network load balancer (v2 - new generation) - Layer 4
  - TCP (HTTP, HTTPS), TLS, UDP
- Gateway load balancer - Layer 3
Note: when using an ALB, the application servers don't see the client's IP address directly. The client's true IP address is inserted in the X-Forwarded-For header.
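As a rough illustration, an application behind the load balancer could recover the client address as sketched below. The helper name `client_ip` and the plain-dict representation of headers are assumptions for this example. By convention the left-most entry is the original client, but entries the client supplied itself can be spoofed, so the header should only be trusted when all traffic arrives through the load balancer.

```python
def client_ip(headers, remote_addr):
    """Return the left-most X-Forwarded-For entry, else the socket peer address.

    `headers` is assumed to be a plain dict of header name -> value.
    """
    # Header names are case-insensitive in HTTP, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    forwarded = normalized.get("x-forwarded-for", "")
    parts = [part.strip() for part in forwarded.split(",") if part.strip()]
    return parts[0] if parts else remote_addr


# The socket peer (10.0.0.5) is the load balancer, not the real client.
via_lb = client_ip({"X-Forwarded-For": "203.0.113.7, 10.0.1.25"}, "10.0.0.5")
direct = client_ip({}, "198.51.100.4")
```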
Sticky session
- It's possible to implement stickiness so that the same client is always routed to the same instance behind a load balancer.
- This works for CLB and ALB
- The cookie used for stickiness has an expiration date you control
- Enabling stickiness may unbalance the load across the backend EC2 instances
Use case:
- Make sure users do not lose their session data
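The cookie mechanism can be sketched as follows. This is a simplified model, not how AWS implements it: the cookie name `LB_STICKY`, the class `StickyBalancer`, and storing the cookie as a `(backend, expiry)` tuple are all assumptions made for illustration.

```python
import itertools


class StickyBalancer:
    COOKIE = "LB_STICKY"  # hypothetical cookie name, not the one AWS uses

    def __init__(self, backends, ttl_seconds):
        self.backends = list(backends)
        self.ttl = ttl_seconds
        self._round_robin = itertools.cycle(self.backends)

    def route(self, cookies, now):
        """Return (backend, cookies the client should keep)."""
        sticky = cookies.get(self.COOKIE)
        if sticky is not None:
            backend, expires_at = sticky
            # Honor the cookie only while it is unexpired and the backend exists.
            if backend in self.backends and now < expires_at:
                return backend, cookies
        # No usable cookie: pick the next backend and issue a fresh cookie.
        backend = next(self._round_robin)
        return backend, {**cookies, self.COOKIE: (backend, now + self.ttl)}


lb = StickyBalancer(["i-a", "i-b", "i-c"], ttl_seconds=60)
first, jar = lb.route({}, now=0)         # new client, gets a cookie
second, jar = lb.route(jar, now=30)      # same client, cookie still valid
other, _ = lb.route({}, now=30)          # different client without a cookie
after_expiry, jar = lb.route(jar, now=61)  # cookie expired, client is re-assigned
```

A client that keeps its cookie keeps its instance, which is also why a few long-lived clients can skew the load toward particular instances.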
Cross zone load balancing
Application Load Balancer
- Always on at the load balancer level (can only be turned off per target group)
- No charges for inter-AZ data transfer
Network Load Balancer
- Disabled by default
- You pay for inter-AZ data transfer if enabled
Classic Load Balancer
- Disabled by default
- No charges for inter-AZ data transfer
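The effect of cross-zone load balancing is easiest to see with a small calculation. The sketch below assumes incoming traffic is split evenly across the load balancer nodes (one per AZ) and that every AZ has at least one instance; the function name and AZ labels are hypothetical.

```python
def per_instance_share(instances_per_az, cross_zone):
    """Return {az: fraction of total traffic received by EACH instance in that AZ}."""
    total_instances = sum(instances_per_az.values())
    az_count = len(instances_per_az)
    shares = {}
    for az, count in instances_per_az.items():
        if cross_zone:
            # Every instance is reachable from every node: an even global split.
            shares[az] = 1 / total_instances
        else:
            # Each AZ gets an equal slice, divided among its own instances only.
            shares[az] = (1 / az_count) / count
    return shares


fleet = {"az-a": 2, "az-b": 8}
with_cross_zone = per_instance_share(fleet, cross_zone=True)
without_cross_zone = per_instance_share(fleet, cross_zone=False)
```

With cross-zone enabled each of the 10 instances handles 10% of the traffic. Without it, each of the 2 instances in `az-a` handles 25% while each of the 8 instances in `az-b` handles 6.25%.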
SSL/TLS
Classic Load Balancer
- Supports only one SSL certificate
Application Load Balancer
- Supports multiple listeners with multiple SSL certificates
- Uses Server Name Indication (SNI) to make it work
Network Load Balancer
- Supports multiple listeners with multiple SSL certificates
- Uses Server Name Indication (SNI) to make it work
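With SNI, the client states the hostname it wants during the TLS handshake, and the server picks a matching certificate. The lookup can be sketched as below; the function name, the certificate labels, and the single-label wildcard rule are illustrative assumptions, not the exact selection logic used by ALB or NLB.

```python
def select_certificate(server_name, certificates, default):
    """Pick a certificate for the hostname sent via SNI.

    `certificates` maps hostnames (or "*.domain" wildcards) to certificate
    labels. `server_name` is None when the client sent no SNI extension.
    """
    if server_name is None:
        return default
    name = server_name.lower()
    if name in certificates:
        return certificates[name]
    # Try a wildcard that covers exactly one leading label, e.g. *.example.org.
    _, _, parent = name.partition(".")
    if parent:
        return certificates.get("*." + parent, default)
    return default


certs = {
    "www.example.com": "cert-www",          # hypothetical certificate labels
    "*.example.org": "cert-org-wildcard",
}
exact = select_certificate("www.example.com", certs, "cert-default")
wildcard = select_certificate("api.example.org", certs, "cert-default")
fallback = select_certificate("unknown.test", certs, "cert-default")
no_sni = select_certificate(None, certs, "cert-default")
```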
Connection draining
Connection draining is the time allowed to complete in-flight requests while an instance is de-registering or unhealthy.
The ELB stops sending new requests to an EC2 instance that is de-registering.
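The behavior can be modeled with a small state machine. This is a sketch under assumptions: the class `DrainingTarget`, its method names, and the explicit `now` timestamps are hypothetical and only mirror the description above.

```python
class DrainingTarget:
    """Toy model of one instance going through connection draining."""

    def __init__(self, name, draining_timeout):
        self.name = name
        self.draining_timeout = draining_timeout  # seconds allowed for draining
        self.in_flight = 0
        self.draining_since = None  # set when de-registration starts

    def accepts_new_requests(self):
        return self.draining_since is None

    def start_request(self):
        if not self.accepts_new_requests():
            raise RuntimeError(f"{self.name} is draining")
        self.in_flight += 1

    def finish_request(self):
        self.in_flight -= 1

    def deregister(self, now):
        self.draining_since = now

    def can_be_removed(self, now):
        if self.draining_since is None:
            return False
        timed_out = now - self.draining_since >= self.draining_timeout
        # Remove once in-flight work is done, or when the timeout expires.
        return self.in_flight == 0 or timed_out


target = DrainingTarget("i-a", draining_timeout=300)
target.start_request()
target.start_request()
target.deregister(now=0)
still_busy = target.can_be_removed(now=10)   # two requests still in flight
target.finish_request()
target.finish_request()
drained = target.can_be_removed(now=20)      # all in-flight requests completed
```

A short timeout suits workloads with quick requests; long-running requests need a longer one, at the cost of slower instance removal.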
Auto Scaling Groups
The goal of the ASG is to:
- Scale out (add EC2 instances) to match an increased load
- Scale in (remove EC2 instances) to match a decreased load
- Ensure we have a minimum and a maximum number of EC2 instances running
- Automatically register new instances to a load balancer
- Re-create an EC2 instance if a previous one is terminated
Auto Scaling Groups attributes
- A launch template: contains the configuration for creating new EC2 instances
- Min Size / Max Size / Initial Capacity
- Scaling policy
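The size attributes act as bounds on whatever a scaling policy requests. A minimal model is sketched below; the dataclass and its `scale_to` method are hypothetical, and clamping out-of-range requests is a modeling choice for this example rather than a description of the AWS API.

```python
from dataclasses import dataclass


@dataclass
class AutoScalingGroup:
    min_size: int
    max_size: int
    desired_capacity: int  # the initial capacity

    def __post_init__(self):
        if not self.min_size <= self.desired_capacity <= self.max_size:
            raise ValueError("desired capacity must lie between min and max size")

    def scale_to(self, requested):
        # Keep the result inside [min_size, max_size] whatever the policy asks for.
        self.desired_capacity = max(self.min_size, min(self.max_size, requested))
        return self.desired_capacity


asg = AutoScalingGroup(min_size=2, max_size=6, desired_capacity=3)
scaled_out = asg.scale_to(10)  # capped at max_size
scaled_in = asg.scale_to(0)    # floored at min_size
```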
It's possible to scale an ASG based on CloudWatch alarms.
Good metrics to scale on
- CPUUtilization: Average CPU utilization across your instances
- RequestCountPerTarget: to make sure the number of requests per EC2 instance is stable
- Average Network In / Out: if your application is network bound
- Custom metric
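One way to reason about scaling on such a metric is a proportional rule: if the average metric is above the target, grow the fleet by the same ratio. The sketch below is an illustrative approximation, not the algorithm AWS uses internally; the function name and parameters are assumptions.

```python
import math


def proportional_capacity(current_capacity, metric_value, target_value,
                          min_size, max_size):
    """Estimate the capacity needed to bring the average metric back to target."""
    # If 4 instances average 90% CPU against a 60% target, the same total work
    # spread over 6 instances would average roughly 60%.
    raw = math.ceil(current_capacity * metric_value / target_value)
    return max(min_size, min(max_size, raw))


scale_out = proportional_capacity(4, metric_value=90, target_value=60,
                                  min_size=2, max_size=10)
scale_in = proportional_capacity(4, metric_value=30, target_value=60,
                                 min_size=2, max_size=10)
bounded = proportional_capacity(4, metric_value=30, target_value=60,
                                min_size=3, max_size=10)
```

The rule assumes the load spreads evenly over instances, which is why a per-instance metric such as average CPU or requests per target works well here.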
Scaling cooldown
After a scaling activity, the ASG enters a cooldown period (default 300 seconds). During the cooldown period, the ASG will not launch or terminate additional instances, which allows metrics to stabilize.
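The cooldown check itself is simple, as this sketch shows. The class `CooldownGate` and the explicit `now` timestamps (in seconds) are hypothetical; only the 300-second default comes from the text above.

```python
class CooldownGate:
    """Allows a scaling activity only when no cooldown is in progress."""

    def __init__(self, cooldown_seconds=300):
        self.cooldown_seconds = cooldown_seconds
        self.last_activity = None  # timestamp of the last allowed activity

    def try_scale(self, now):
        in_cooldown = (
            self.last_activity is not None
            and now - self.last_activity < self.cooldown_seconds
        )
        if in_cooldown:
            return False  # ignore the request and let metrics settle
        self.last_activity = now
        return True


gate = CooldownGate()
first = gate.try_scale(now=0)      # allowed, starts the cooldown
blocked = gate.try_scale(now=120)  # still inside the 300-second window
later = gate.try_scale(now=300)    # cooldown has elapsed
```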