Make your app survive traffic spikes and server failures: bake a launch template, put servers behind an Application Load Balancer, and let an Auto Scaling group add and remove instances automatically with scaling policies.
Two problems one server cannot solve: too much traffic, and the server dying. The fix: run several identical servers behind a load balancer, and let an Auto Scaling group keep the right number alive. Why: your app stays up when a server fails and grows when demand rises — automatically.
This lesson builds, in order:
1. a Launch Template (the blueprint for each server)
2. a Load Balancer (spreads traffic across servers)
3. an Auto Scaling Group (keeps N healthy servers running)
4. a Scaling Policy (changes N based on load)
echo "Reuse the VPC, two subnets, AMI_ID and SG_ID from earlier lessons."

A launch template captures everything needed to start an instance (AMI, instance type, security group, user data) so every server in your fleet is identical. Why: Auto Scaling needs one recipe to stamp out copies. It replaces the older "launch configuration," which AWS has deprecated.
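The user-data part of that recipe can be tried offline before touching AWS. The sketch below is only an illustration: the lesson does not show bootstrap.sh, so its contents here (a web server install for an Amazon Linux style image) are an assumption, as is the temporary directory. It encodes the script on one line and checks that decoding gives back the same bytes.

```shell
#!/bin/sh
# Hypothetical stand-in for bootstrap.sh; the lesson's real script is not shown.
workdir=$(mktemp -d)
cat > "$workdir/bootstrap.sh" <<'EOF'
#!/bin/bash
# Assumed contents: install and start a web server on first boot.
yum install -y httpd
systemctl enable --now httpd
echo "hello from $(hostname)" > /var/www/html/index.html
EOF

# Encode on a single line. `base64 -w0` is GNU-specific, so this sketch
# strips newlines with tr instead, which gives the same string.
USERDATA=$(base64 < "$workdir/bootstrap.sh" | tr -d '\n')

# Round-trip check: decoding must reproduce the script byte for byte.
printf '%s' "$USERDATA" | base64 -d > "$workdir/decoded.sh"
if cmp -s "$workdir/bootstrap.sh" "$workdir/decoded.sh"; then
  ROUNDTRIP=ok
else
  ROUNDTRIP=mismatch
fi
echo "user data round trip: $ROUNDTRIP"

rm -rf "$workdir"
```

If the round trip reports a mismatch, the encoded string is not safe to paste into the template JSON.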
# User data must be base64-encoded inside a launch template.
# (-w0 disables line wrapping; it is a GNU base64 option.)
USERDATA=$(base64 -w0 bootstrap.sh)

aws ec2 create-launch-template \
  --launch-template-name web-template \
  --launch-template-data '{"ImageId": "'"$AMI_ID"'","InstanceType": "t3.micro","SecurityGroupIds": ["'"$SG_ID"'"],"UserData": "'"$USERDATA"'"}'

A load balancer gives clients a single address and spreads their requests across all your healthy servers. An ALB (Application Load Balancer) works at the HTTP level. Why: clients never talk to a server directly, so you can add, remove, or replace servers without anyone noticing.
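What "spreads requests across healthy servers" means can be sketched as a toy model. It assumes round-robin routing (the default for ALB target groups) and uses invented instance IDs and health states; it illustrates the idea and is not how the ALB is implemented.

```shell
#!/bin/sh
# Toy model: rotate through targets, skipping any marked unhealthy.
# Instance IDs and health states are invented for illustration.
TARGETS="i-aaa:healthy i-bbb:unhealthy i-ccc:healthy"

# nth K WORDS...: print the K-th word (1-based).
nth() {
  k=$1
  shift
  shift $(( k - 1 ))
  printf '%s\n' "$1"
}

# route N: print which target would receive each of N requests.
route() {
  requests=$1
  # Keep only the healthy targets, in their original order.
  healthy=""
  for t in $TARGETS; do
    case "$t" in
      *:healthy) healthy="$healthy ${t%%:*}" ;;
    esac
  done
  # Count the healthy targets.
  count=0
  for h in $healthy; do count=$(( count + 1 )); done
  if [ "$count" -eq 0 ]; then
    echo "no healthy targets" >&2
    return 1
  fi
  i=0
  while [ "$i" -lt "$requests" ]; do
    # Request i goes to healthy target number (i mod count) + 1.
    target=$(nth $(( i % count + 1 )) $healthy)
    echo "request $(( i + 1 )) -> $target"
    i=$(( i + 1 ))
  done
}

route 4
```

With the sample data, requests alternate between i-aaa and i-ccc, and i-bbb never receives one.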
# Create the load balancer across two public subnets (two AZs = resilient)
ALB_ARN=$(aws elbv2 create-load-balancer --name web-alb \
  --subnets "$PUBLIC_SUBNET" "$PUBLIC_SUBNET_B" \
  --security-groups "$SG_ID" \
  --query 'LoadBalancers[0].LoadBalancerArn' --output text)

# The public DNS name clients will hit
aws elbv2 describe-load-balancers --load-balancer-arns "$ALB_ARN" \
  --query 'LoadBalancers[0].DNSName' --output text

A target group is the pool of servers the load balancer sends traffic to, plus a health check it runs against each. Why health checks: the ALB requests a URL on every server (say /health; this lesson uses /); if a server stops replying, it is pulled out of rotation so users never hit a broken instance.
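The "pulled out of rotation" behaviour can be pictured as a small state machine. This sketch is a simplification under stated assumptions: the thresholds (2 consecutive failures, 5 consecutive successes) are illustrative values that are configurable on a real target group, only HTTP 200 counts as success, and a target is assumed to start out healthy.

```shell
#!/bin/sh
# Toy health-check state machine. Thresholds are assumptions for illustration.
UNHEALTHY_THRESHOLD=2   # consecutive failures before leaving rotation
HEALTHY_THRESHOLD=5     # consecutive successes before rejoining

# final_state "CODES": feed the HTTP status codes of successive checks
# (space-separated) and print the state after the last one.
final_state() {
  state=healthy
  ok_streak=0
  fail_streak=0
  for code in $1; do
    if [ "$code" -eq 200 ]; then
      ok_streak=$(( ok_streak + 1 ))
      fail_streak=0
      if [ "$ok_streak" -ge "$HEALTHY_THRESHOLD" ]; then
        state=healthy
      fi
    else
      fail_streak=$(( fail_streak + 1 ))
      ok_streak=0
      if [ "$fail_streak" -ge "$UNHEALTHY_THRESHOLD" ]; then
        state=unhealthy
      fi
    fi
  done
  echo "$state"
}

final_state "200 503 200"           # a single blip: stays healthy
final_state "200 503 503"           # two failures in a row: unhealthy
final_state "503 503 200 200 200"   # only three successes: still unhealthy
```

Requiring a streak in both directions keeps one slow response from ejecting a server, and keeps a flapping server from rejoining too early.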
# Create a target group that health-checks port 80 at "/"
TG_ARN=$(aws elbv2 create-target-group --name web-targets \
  --protocol HTTP --port 80 --vpc-id "$VPC_ID" \
  --health-check-path / \
  --query 'TargetGroups[0].TargetGroupArn' --output text)

# Tell the load balancer to forward incoming HTTP to that target group
aws elbv2 create-listener --load-balancer-arn "$ALB_ARN" \
  --protocol HTTP --port 80 \
  --default-actions Type=forward,TargetGroupArn="$TG_ARN"

An Auto Scaling Group (ASG) launches instances from your template and maintains a count within bounds: a minimum, a maximum, and a "desired" number. Why: if a server fails its health check the ASG replaces it; if you raise the desired count it adds servers, all hands-off.
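The relationship between minimum, maximum, desired, and the number of healthy servers can be sketched as one reconciliation step. The function name and output wording are hypothetical, and clamping the desired count into the bounds is a simplification made for this sketch; it is a mental model, not the ASG's actual algorithm.

```shell
#!/bin/sh
# Toy reconciliation step: compare the healthy instance count with the
# desired count (kept within [min, max]) and print the corrective action.
# reconcile MIN MAX DESIRED HEALTHY
reconcile() {
  min=$1 max=$2 desired=$3 healthy=$4
  # Simplification: force desired into the allowed range.
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi

  if [ "$healthy" -lt "$desired" ]; then
    echo "launch $(( desired - healthy ))"
  elif [ "$healthy" -gt "$desired" ]; then
    echo "terminate $(( healthy - desired ))"
  else
    echo "steady"
  fi
}

reconcile 2 6 2 2   # group is at its desired size
reconcile 2 6 2 1   # one server failed its health check
reconcile 2 6 4 2   # desired count was raised to 4
```

With the lesson's bounds of 2 and 6, a failed server leads to "launch 1" and raising the desired count to 4 leads to "launch 2".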
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name web-asg \
  --launch-template LaunchTemplateName=web-template \
  --min-size 2 --max-size 6 --desired-capacity 2 \
  --vpc-zone-identifier "$PUBLIC_SUBNET,$PUBLIC_SUBNET_B" \
  --target-group-arns "$TG_ARN" \
  --health-check-type ELB

A scaling policy is the rule that changes the desired count based on a metric. "Target tracking" is the easy one: name a target (e.g. average CPU 50%) and AWS adds or removes servers to hold it. Why: you pay for capacity only when traffic justifies it.
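For intuition about how a target turns into a server count, here is a rough proportional model. It is an assumption made for illustration, not AWS's published algorithm (the real policy also uses CloudWatch alarms and cooldown behaviour): scale the current count by metric/target, round up, then keep the result within the group's bounds. The function name is hypothetical.

```shell
#!/bin/sh
# Rough proportional model of target tracking (an assumption, not AWS's
# actual algorithm): new = ceil(current * metric / target), then clamp.
# tracked_capacity CURRENT METRIC_PCT TARGET_PCT MIN MAX
tracked_capacity() {
  current=$1 metric=$2 target=$3 min=$4 max=$5
  # Integer ceiling division: ceil(a / b) = (a + b - 1) / b
  new=$(( (current * metric + target - 1) / target ))
  if [ "$new" -lt "$min" ]; then new=$min; fi
  if [ "$new" -gt "$max" ]; then new=$max; fi
  echo "$new"
}

tracked_capacity 2 80 50 2 6   # 2 servers at 80% CPU -> 4
tracked_capacity 4 20 50 2 6   # 4 servers at 20% CPU -> 2
tracked_capacity 5 95 50 2 6   # would need 10, capped at max -> 6
```

Rounding up errs on the side of spare capacity, and the bounds keep a metric spike from launching more than the group's maximum.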
# Keep average CPU across the group at 50%
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name web-asg \
  --policy-name cpu50 \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{"PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},"TargetValue": 50.0}'