App Scaling
Learn about scaling your Apps' CPU, RAM, and Containers, manually or automatically
Overview
Aptible Apps are scaled at the Service level, meaning each App Service is scaled independently.
App Services can be scaled by adding more CPU/RAM (vertical scaling) or by adding more Containers (horizontal scaling). App Services can be scaled manually via the CLI or UI, automatically with Autoscaling, or programmatically with Terraform.
Apps with two or more Containers are deployed in a high-availability configuration, ensuring redundancy across different zones.
When Apps are scaled, a new set of containers will be launched to replace the existing ones for each of your App’s Services.
High-availability Apps
Apps scaled to 2 or more Containers are automatically deployed in a high-availability configuration, with Containers deployed in separate AWS Availability Zones.
Horizontal Scaling
Scale Apps horizontally by adding more Containers to a given Service. Each App Service can scale up to 32 Containers.
Manual Horizontal Scaling
App Services can be manually scaled via the Dashboard or the aptible apps:scale CLI command. Example:
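A minimal sketch of scripting a horizontal scale from Python. It assumes the Aptible CLI accepts the --app and --container-count flags on apps:scale (confirm with aptible help apps:scale). The App handle my-app, the Service name web, and the helper name are placeholders, not values from this page.

```python
import shlex

MAX_CONTAINERS = 32  # per-Service limit stated in the Horizontal Scaling section


def horizontal_scale_command(app: str, service: str, count: int) -> list[str]:
    """Build the argv for scaling one Service to `count` Containers.

    Flag names are assumptions about the Aptible CLI; verify before use.
    """
    if not 0 <= count <= MAX_CONTAINERS:
        raise ValueError(f"container count must be between 0 and {MAX_CONTAINERS}")
    return [
        "aptible", "apps:scale", service,
        "--app", app,
        "--container-count", str(count),
    ]


# Print the command instead of running it, so it can be reviewed first.
print(shlex.join(horizontal_scale_command("my-app", "web", 3)))
```

To run the command rather than print it, the returned list can be passed to subprocess.run with check=True.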
Horizontal Autoscaling
When Horizontal Autoscaling is enabled on a Service, the autoscaler evaluates the Service every 5 minutes and generates scaling adjustments based on CPU usage (as a percentage of total cores). Metrics are evaluated at the 99th percentile, aggregated across all of the Service's containers over the past 30 minutes. Post-scaling cooldowns are 5 minutes for scaling down and 1 minute for scaling up. After any release, an additional 5-minute cooldown applies.
Guide for Configuring Horizontal Autoscaling
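The evaluation described above can be sketched as a small decision function. This is an illustration only, not Aptible's implementation: the thresholds (up_at, down_at), the one-Container step, the minimum and maximum counts, and the reading of "cooldown" as a quiet period after each event are all assumptions made for the example.

```python
import math
from dataclasses import dataclass
from typing import Optional

# Cooldowns from the text, in seconds.
COOLDOWN_AFTER_SCALE_DOWN = 5 * 60
COOLDOWN_AFTER_SCALE_UP = 1 * 60
COOLDOWN_AFTER_RELEASE = 5 * 60


def p99(samples):
    """Nearest-rank 99th percentile of a non-empty sequence."""
    ordered = sorted(samples)
    rank = math.ceil(0.99 * len(ordered))
    return ordered[rank - 1]


@dataclass
class ServiceState:
    containers: int
    last_scale_up: Optional[float] = None    # event timestamps, in seconds
    last_scale_down: Optional[float] = None
    last_release: Optional[float] = None


def in_cooldown(state: ServiceState, now: float) -> bool:
    """True if any recent event still blocks a new scaling action."""
    checks = [
        (state.last_scale_up, COOLDOWN_AFTER_SCALE_UP),
        (state.last_scale_down, COOLDOWN_AFTER_SCALE_DOWN),
        (state.last_release, COOLDOWN_AFTER_RELEASE),
    ]
    return any(t is not None and now - t < window for t, window in checks)


def decide(state, cpu_samples, now, up_at=0.8, down_at=0.2, min_c=2, max_c=32):
    """Return the target Container count for one 5-minute evaluation.

    cpu_samples: CPU usage as a fraction of total cores, collected from all
    of the Service's containers over the past 30 minutes.
    up_at / down_at / min_c / max_c are hypothetical settings.
    """
    if in_cooldown(state, now) or not cpu_samples:
        return state.containers
    usage = p99(cpu_samples)
    if usage > up_at:
        return min(state.containers + 1, max_c)
    if usage < down_at:
        return max(state.containers - 1, min_c)
    return state.containers
```

Because the 99th percentile is used rather than the mean, a short burst inside the 30-minute window is enough to hold a Service at its current size or push it up.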
Configuration Options
Vertical Scaling
Scale Apps vertically by changing the size of Containers, i.e., changing their Memory Limits and CPU Limits. The available sizes are determined by the Container Profile.
Manual Vertical Scaling
App Services can be manually scaled via the Dashboard or the aptible apps:scale CLI command. Example:
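A minimal sketch of scripting a vertical scale from Python. It assumes the Aptible CLI accepts --app and a --container-size flag expressed in MB on apps:scale (confirm with aptible help apps:scale). The App handle, Service name, and size below are placeholders, and valid sizes depend on the Service's Container Profile.

```python
import shlex


def vertical_scale_command(app: str, service: str, size_mb: int) -> list[str]:
    """Build the argv for resizing the Containers of one Service.

    Flag names and the MB unit are assumptions about the Aptible CLI.
    """
    if size_mb <= 0:
        raise ValueError("container size must be a positive number of MB")
    return [
        "aptible", "apps:scale", service,
        "--app", app,
        "--container-size", str(size_mb),
    ]


# Print the command for review; 2048 MB is an arbitrary illustrative size.
print(shlex.join(vertical_scale_command("my-app", "web", 2048)))
```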
Vertical Autoscaling
When Vertical Autoscaling is enabled on a Service, the autoscaler also evaluates the Service every 5 minutes and generates scaling recommendations based on:
- RSS usage in GB divided by the CPU
- RSS usage levels
Metrics are evaluated at the 99th percentile, aggregated across all of the Service's containers over a 30-minute lookback period. Post-scaling cooldowns are 5 minutes for scaling down and 1 minute for scaling up. An additional 5-minute cooldown applies after a Service release.
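The two signals listed above can be sketched as follows. This is an illustration under stated assumptions, not Aptible's algorithm: "the CPU" is taken to mean CPU usage in cores, and the candidate sizes and the headroom fraction are hypothetical values chosen for the example.

```python
import math


def p99(samples):
    """Nearest-rank 99th percentile of a non-empty sequence."""
    ordered = sorted(samples)
    rank = math.ceil(0.99 * len(ordered))
    return ordered[rank - 1]


def vertical_signals(rss_gb_samples, cpu_core_samples):
    """Return (p99 RSS in GB, ratio of p99 RSS in GB to p99 CPU cores).

    Both sample lists cover all containers of the Service over the
    30-minute lookback period.
    """
    rss = p99(rss_gb_samples)
    cpu = p99(cpu_core_samples)
    ratio = rss / cpu if cpu > 0 else math.inf
    return rss, ratio


def recommend_size_gb(rss_gb, sizes_gb=(0.5, 1, 2, 4, 8), headroom=0.75):
    """Pick the smallest size that keeps observed RSS under the headroom.

    sizes_gb and headroom are hypothetical; real sizes come from the
    Container Profile.
    """
    for size in sorted(sizes_gb):
        if rss_gb <= size * headroom:
            return size
    return max(sizes_gb)
```

In this sketch the ratio indicates whether a workload is memory-heavy or CPU-heavy, while the RSS level alone drives the recommended size.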
Configuration Options
FAQ