Nomad
Dynamic Application Sizing concepts
What is Dynamic Application Sizing in Nomad?
Dynamic Application Sizing (DAS) was designed with the following goals in mind:
Reduce toil: running a Nomad job requires knowing how much CPU and memory to allocate. These values are often unknown, which leads to a frustrating loop of trial-and-error “guesstimations”. Nor are they set-and-forget settings: as your user base grows or the job is updated with new code, the resource usage profile of the job is likely to change as well, which requires even more work to monitor and track new limits. DAS monitors your jobs and provides you with recommendations for new limit values automatically.
Maximize infrastructure usage: overestimating job limits can result in resource waste, as servers sit idle while Nomad is unable to schedule other jobs on them due to resource constraints. DAS detects overprovisioned jobs and recommends lower limits based on actual resource usage.
Improve reliability: underprovisioned jobs can suffer from problems like Out of Memory (OOM) errors and CPU throttling, causing reliability issues and requiring additional SRE attention. DAS recommendations are based on actual usage and can detect when an application is starting to require more resources.
Prometheus Required
Currently, Prometheus is the only APM supported for Dynamic Application Sizing. You can test-drive DAS using Vagrant.
Enterprise Only
The functionality described here is available only in Nomad Enterprise with the Multi-Cluster & Efficiency module. To explore Nomad Enterprise features, you can sign up for a free 30-day trial.
How Dynamic Application Sizing works
Nomad's Dynamic Application Sizing (DAS) feature consists of three new components:
- Recommendations API
- Nomad vertical autoscaling policies
- DAS-specific plugins in nomad-autoscaler
Using the new DAS plugins, the Nomad autoscaler pulls a list of vertical scaling policies from Nomad. These policies indicate which jobs should be subject to DAS and specify the strategy for making resource recommendations. The autoscaler pulls historical resource utilization data from the configured APM and then continues to collect utilization data on an ongoing basis. The configured DAS strategy plugin consumes these metrics, determines a new resource value, and submits that value to the Recommendations API in Nomad. The UI displays these recommendations, along with the per-task statistics that the autoscaler computed. After reviewing a recommendation in the Nomad UI, users can choose to dismiss it or apply it; applying the recommendation updates the job.
Configure a cluster for DAS
Configuration for DAS occurs in two places:
Configuration of the autoscaler:
- which plugins to load
- where to find Nomad and the APM
- number of workers
Job specification with scaling blocks under a task:
- which jobs to analyze and which strategy to apply
- policy-specific settings, such as cooldown interval and strategy configuration
Configure Nomad for telemetry
Nomad needs to be configured to enable telemetry publishing. You need to enable
allocation and node metrics. Since this tutorial also uses Prometheus as its APM,
you need to set prometheus_metrics to true. Add this telemetry block to the
configuration for every Nomad node in your cluster.
telemetry {
  publish_allocation_metrics = true
  publish_node_metrics       = true
  prometheus_metrics         = true
}
Run and configure an APM
The autoscaler uses an application performance monitor or metrics platform to retrieve historical metrics when starting to track a new target. In this beta, the APM is also used for ongoing monitoring metrics, but this is currently being shifted to using Nomad's metrics API.
Run the autoscaler
The next step is to run the Nomad autoscaler. For the beta, an enterprise version of the Nomad Autoscaler is provided that includes the DAS plugins.
The quickest approach is to run the autoscaler as a Nomad job; however, you can download the Nomad Autoscaler and run it as a standalone process. You need to configure the Nomad autoscaler with the information necessary to connect to your Nomad cluster and the information necessary to connect to your APM.
Upon starting, the autoscaler loads the DAS-specific plugin and launches workers to evaluate vertical policies:
[INFO] agent.plugin_manager: successfully launched and dispensed plugin: plugin_name=app-sizing-percentile
[INFO] agent.plugin_manager: successfully launched and dispensed plugin: plugin_name=nomad-target
[INFO] agent.plugin_manager: successfully launched and dispensed plugin: plugin_name=app-sizing-nomad
[INFO] agent.plugin_manager: successfully launched and dispensed plugin: plugin_name=prometheus
[INFO] agent.plugin_manager: successfully launched and dispensed plugin: plugin_name=app-sizing-avg
[INFO] agent.plugin_manager: successfully launched and dispensed plugin: plugin_name=app-sizing-max
[INFO] policy_eval.worker: starting worker: id=f6d205b3-9e48-ba9d-a230-9d3e8f2bdf81 queue=vertical_cpu
[INFO] policy_eval.worker: starting worker: id=750bcea7-47af-94b3-820c-1770c757ed07 queue=vertical_mem
If there are already jobs configured with vertical policies, the autoscaler begins dispatching policy evaluations from the broker to the workers; otherwise, this occurs when vertical policies are added to a job specification:
[DEBUG] policy_eval.broker: dequeue eval: queue=vertical_mem
Note
The autoscaler does not immediately register recommendations.
The evaluate_after field in the autoscaler configuration indicates the
amount of historical metrics that must be available before a recommendation
is made for a task. The purpose is to prevent recommendations with
insufficient historical information; without representative data,
appropriate recommendations cannot be made, which could result in
under-provisioning a task. For the purpose of evaluating the feature, this
can be reduced. For more production-like environments, this interval should
be long enough to capture a representative sample of metrics. The default
interval is 24 hours.
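The gating described in this note can be pictured with a short sketch. This is not the autoscaler's implementation: the function name `has_enough_history` and the idea of passing in the timestamp of the oldest available sample are assumptions made for illustration. Only the 24-hour default comes from the text above.

```python
from datetime import datetime, timedelta, timezone

# Default taken from the note above; the autoscaler's internals may differ.
DEFAULT_EVALUATE_AFTER = timedelta(hours=24)


def has_enough_history(oldest_sample, now, evaluate_after=DEFAULT_EVALUATE_AFTER):
    """Return True once the metric history spans at least evaluate_after.

    oldest_sample and now are timezone-aware datetimes (hypothetical inputs).
    """
    return now - oldest_sample >= evaluate_after


if __name__ == "__main__":
    now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
    # Six hours of history: too little for the default interval.
    print(has_enough_history(now - timedelta(hours=6), now))
    # Thirty hours of history: a recommendation could be made.
    print(has_enough_history(now - timedelta(hours=30), now))
```

Shortening `evaluate_after` in an evaluation environment corresponds to passing a smaller `timedelta` here.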
Add DAS to a job
To enable sizing recommendations for a Nomad job task, add one task scaling stanza for CPU and one for memory. These stanzas, when placed within a job specification's task stanza, configure the task for both CPU and memory recommendations.
Once the job has been registered with its updated specification, the Nomad autoscaler automatically detects the new scaling policies and starts the required internal processes.
To enable application-sizing for multiple tasks with DAS, you need to add this scaling block to every new or additional task in the job spec. Below is a section on how you can further customize the application-sizing block to your needs (percentile, cooldown periods, sizing strategies).
Vertical autoscaling policies
In order to accommodate vertical application scaling, the Nomad job specification has been enhanced to allow scaling stanzas at the task level. The policy object follows the same syntax as horizontal policies with the exception that only a single check is allowed within a task scaling stanza. This validation is provided by the Nomad Autoscaler, and any validation error is displayed in the autoscaler's log output.
An example of a Dynamic Application Sizing policy targeting the CPU resource looks as follows:
task "app" {
  scaling "cpu" {
    enabled = true
    min     = 50
    max     = 2000

    policy {
      evaluation_interval = "30s"
      cooldown            = "5m"

      check "95pct" {
        strategy "app-sizing-percentile" {
          percentile = "95"
        }
      }
    }
  }
}
The fields available are:
- enabled: indicates if the policy should be evaluated. It can be used to turn off recommendations for a task without having to remove the scaling stanza.
- min: defines the minimum value for a recommendation. In this example, recommendations for the CPU resource value for this task must be at least 50 MHz. If this field is omitted, this defaults to 10 MHz for CPU and 1 MB for memory.
- max: defines the maximum value for a recommendation. In this example, recommendations for the CPU resource value for this task are capped at 2000 MHz.
- evaluation_interval: specifies how often the autoscaler should consult the policy to see if a new recommendation is necessary. The autoscaler consults the policy above once every 30 seconds.
- cooldown: indicates the Nomad autoscaler should not attempt a new recommendation within 5 minutes of a previous submission.
- strategy: use the app-sizing-percentile strategy plugin and use a percentile value of 95.
The autoscaler only provides recommendations for a given task resource if it includes a scaling stanza targeting that resource. The CPU stanza in the preceding example enables DAS for the task containing the block; enabling DAS for memory requires an explicit scaling block targeting memory, for example:
scaling "mem" {
  # ...
}
Application sizing strategies
Nomad Autoscaler Enterprise delivers three new strategy plugins designed for
the Dynamic Application Sizing feature. The app-sizing-max, app-sizing-avg,
and app-sizing-percentile plugins are all launched automatically when running
Nomad Autoscaler Enterprise; further details of each plugin can be found below.
Each of these strategies is used to compute a recommendation. Before submitting
the recommendation to Nomad, the computed value is further post-processed to be:
- increased by 15% as a safety margin
- capped between the min and max values present in the policy
- capped by any Nomad minimum/maximum resource values.
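The first two post-processing steps above can be sketched as a small function. This is an illustration under stated assumptions, not the autoscaler's code: the name `post_process` and its parameters are hypothetical, the 15% margin and the policy min/max come from the text, and the final cap by Nomad's own resource limits is left out because the source does not give those values.

```python
def post_process(raw_value, policy_min, policy_max, safety_margin=0.15):
    """Pad a computed value by the safety margin, then clamp it to the policy bounds.

    raw_value is the strategy's output (MHz for CPU, MB for memory).
    """
    if policy_min > policy_max:
        raise ValueError("policy_min must not exceed policy_max")
    padded = raw_value * (1.0 + safety_margin)
    # Clamp to the min and max fields of the scaling policy.
    return max(policy_min, min(policy_max, padded))


if __name__ == "__main__":
    # With min = 50 and max = 2000, as in the CPU policy shown earlier:
    print(post_process(1000, 50, 2000))  # padded to roughly 1150 MHz
    print(post_process(1900, 50, 2000))  # padded past the cap, so 2000
    print(post_process(10, 50, 2000))    # raised to the policy minimum, 50
```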
App-Sizing-Max
The app-sizing-max plugin calculates the maximum value seen for the target
resource within the available dataset. This plugin is ideally suited for memory
resources since workloads don’t release their memory too often and
under-provisioning could cause OOM errors.
App-Sizing-Percentile
The app-sizing-percentile plugin calculates its result based on a desired
percentile value from the dataset. The percentile value defaults to 99 but is
configurable via the strategy config block and supports any value between 1 and
100:
strategy "app-sizing-percentile" {
  percentile = "90"
}
The plugin applies an exponentially decaying weight to the data, in order to give more significance to recent values over older ones. It also adjusts its calculation based on the amount of resources used per unit of time. This load-adjusted calculation results in values that are more likely to actually meet the usage needs of the workload when compared to the traditional time-based percentile calculation.
This plugin is the most versatile, since the percentile level can be fine-tuned
as needed. If your workload can withstand occasional OOM errors gracefully,
using a 98th percentile for memory instead of app-sizing-max could result in
smaller recommendations and subsequently more resource availability for other
tasks. A 90th to 95th percentile for CPU could have the same effect.
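To make the idea of an exponentially decaying weight concrete, here is a minimal sketch of a weighted percentile. It is not the plugin's algorithm: the half-life parameter, the `(age, value)` sample shape, and the function name are all assumptions, and the load-adjusted part of the plugin's calculation is not modeled at all.

```python
def decayed_percentile(samples, percentile, half_life):
    """Weighted percentile where a sample's weight halves every half_life.

    samples is a list of (age, value) pairs; age and half_life share a unit.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if not 1 <= percentile <= 100:
        raise ValueError("percentile must be between 1 and 100")
    # Sort by value and attach the decayed weight of each sample.
    weighted = sorted((value, 0.5 ** (age / half_life)) for age, value in samples)
    threshold = sum(weight for _, weight in weighted) * percentile / 100.0
    running = 0.0
    for value, weight in weighted:
        running += weight
        if running >= threshold:
            return value
    return weighted[-1][0]  # guards against floating-point shortfall


if __name__ == "__main__":
    # One fresh high reading outweighs three old low readings.
    history = [(0, 100), (10, 10), (10, 10), (10, 10)]
    print(decayed_percentile(history, 50, half_life=1.0))
```

In this toy dataset an unweighted median would be 10, while the decayed version follows the most recent reading.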
App-Sizing-Avg
The app-sizing-avg plugin calculates the average value seen across the
dataset. The plugin applies an exponentially decaying weight to the data, in
order to give more significance to recent values over older ones.
This plugin is only recommended for CPU values of workloads with very stable resource usage levels, such as batch jobs.
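As a rough illustration of an average with exponentially decaying weights, consider the sketch below. The half-life parameter, the `(age, value)` sample shape, and the function name are assumptions for the example; the source does not state how the plugin parameterizes the decay.

```python
def decayed_average(samples, half_life):
    """Average of values where a sample's weight halves every half_life.

    samples is a list of (age, value) pairs; age and half_life share a unit.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    weights = [0.5 ** (age / half_life) for age, _ in samples]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, samples))
    return weighted_sum / sum(weights)


if __name__ == "__main__":
    # The newer sample (age 0) counts twice as much as the one a half-life old.
    print(decayed_average([(0, 200), (1, 100)], half_life=1.0))
```

For a workload with stable usage the decayed and plain averages stay close, which is consistent with the guidance to reserve this strategy for such workloads.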
Maximize your DAS outcomes
Maximizing the benefit from DAS involves:
- tuning the strategy to get good recommendations
- effectively reviewing job recommendations
- understanding the impact of DAS on automation
Strategy suggestions
- CPU Batch: usually will use the app-sizing-avg strategy as this calculates the most efficient limit.
- CPU Service/System: most commonly will use the app-sizing-percentile strategy with a percentile value between 90-99. The more latency sensitive the application is, the higher the percentile value should be.
- Memory: the strategy to use for memory largely depends on the OOM tolerance of the application. Tasks with low tolerance will commonly use the app-sizing-percentile strategy with a percentile value of 98 or 99. Tasks with minimal tolerance, or in situations where you are unsure, should use the app-sizing-max strategy.
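The suggestions above amount to a small decision table, which could be encoded as follows. The function name, the argument labels, and the specific percentile values picked within the suggested ranges (95 for CPU, 98 for memory) are assumptions for illustration; treat them as starting points to tune rather than as fixed rules.

```python
def suggest_strategy(resource, job_type="service", oom_tolerance="unknown"):
    """Map the guidance above to a (strategy name, percentile or None) pair.

    resource: "cpu" or "mem"; job_type: "batch", "service", or "system";
    oom_tolerance: "low", "minimal", or "unknown" (labels assumed here).
    """
    if resource == "cpu":
        if job_type == "batch":
            return ("app-sizing-avg", None)
        # 95 sits inside the suggested 90-99 band; raise it for
        # latency-sensitive applications.
        return ("app-sizing-percentile", 95)
    if resource == "mem":
        if oom_tolerance == "low":
            return ("app-sizing-percentile", 98)
        # Minimal tolerance, or unsure: fall back to the safest option.
        return ("app-sizing-max", None)
    raise ValueError("resource must be 'cpu' or 'mem'")


if __name__ == "__main__":
    print(suggest_strategy("cpu", job_type="batch"))
    print(suggest_strategy("mem"))
```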
Review DAS recommendations
Once the autoscaler has generated recommendations, you can review them in the Nomad UI or using the Nomad API and accept or dismiss the recommendations.
Because manual review is time-consuming, Nomad includes information in the recommendations API to support automatic notification of significant recommendations. The Nomad API has been enhanced to accommodate Dynamic Application Sizing with updates and new endpoints, which allow recommendations to be programmatically reviewed and applied/dismissed.
While building confidence in DAS strategies and their tuning, operators may prefer to manually review the recommendations. Two sources of information are useful in reviewing a given recommendation: the historical metrics in your APM and the statistics collected in the recommendation.
The Nomad server emits metrics to the configured APM about the number of outstanding recommendations, as well as the effect of applying the recommendations. Monitoring these metrics can indicate when human review is appropriate. These are grouped on a per-namespace basis, and take the following form:
[
  {
    "Labels": {
      "host": "server1.localdomain",
      "namespace": "default"
    },
    "Name": "nomad.nomad.recommendations.num_recommendations",
    "Value": 2.0
  },
  {
    "Labels": {
      "host": "server1.localdomain",
      "namespace": "default"
    },
    "Name": "nomad.nomad.recommendations.total_diff_cpu_ticks",
    "Value": -500.0
  },
  {
    "Labels": {
      "host": "server1.localdomain",
      "namespace": "default"
    },
    "Name": "nomad.nomad.recommendations.total_diff_memory_bytes",
    "Value": -1024.0
  }
]
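One way to act on metrics of this shape is to group them per namespace and flag the namespaces whose pending recommendations would shift CPU by a meaningful amount. The sketch below assumes the JSON layout shown above; the function names and the 250-tick review threshold are hypothetical choices, not Nomad defaults.

```python
import json

PREFIX = "nomad.nomad.recommendations."

SAMPLE = json.loads("""
[
  {"Labels": {"host": "server1.localdomain", "namespace": "default"},
   "Name": "nomad.nomad.recommendations.num_recommendations", "Value": 2.0},
  {"Labels": {"host": "server1.localdomain", "namespace": "default"},
   "Name": "nomad.nomad.recommendations.total_diff_cpu_ticks", "Value": -500.0},
  {"Labels": {"host": "server1.localdomain", "namespace": "default"},
   "Name": "nomad.nomad.recommendations.total_diff_memory_bytes", "Value": -1024.0}
]
""")


def summarize_by_namespace(metrics):
    """Group recommendation metrics into {namespace: {short_name: value}}."""
    summary = {}
    for metric in metrics:
        name = metric.get("Name", "")
        if not name.startswith(PREFIX):
            continue  # ignore unrelated metrics
        namespace = metric.get("Labels", {}).get("namespace", "default")
        summary.setdefault(namespace, {})[name[len(PREFIX):]] = metric["Value"]
    return summary


def namespaces_needing_review(summary, cpu_threshold=250.0):
    """List namespaces with pending recommendations and a large CPU delta."""
    flagged = []
    for namespace, values in sorted(summary.items()):
        pending = values.get("num_recommendations", 0) > 0
        large = abs(values.get("total_diff_cpu_ticks", 0)) >= cpu_threshold
        if pending and large:
            flagged.append(namespace)
    return flagged


if __name__ == "__main__":
    print(namespaces_needing_review(summarize_by_namespace(SAMPLE)))
```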
The recommendations API can be used to list outstanding recommendations, on a global or per-namespace basis.
Dismissing the recommendation causes it to disappear. However, the autoscaler continues to monitor and eventually makes additional recommendations for the job until the vertical scaling policy is removed from the job specification.
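Listing outstanding recommendations programmatically might look like the sketch below. It assumes the list endpoint lives at `/v1/recommendations`, accepts a `namespace` query parameter, and that ACL tokens are sent in the `X-Nomad-Token` header; confirm these details against the Nomad API documentation for your version. The helper names and the example address are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def recommendations_url(nomad_addr, namespace=None):
    """Build the list URL; the path and query parameter are assumptions."""
    url = nomad_addr.rstrip("/") + "/v1/recommendations"
    if namespace is not None:
        url += "?" + urllib.parse.urlencode({"namespace": namespace})
    return url


def list_recommendations(nomad_addr, namespace=None, token=None):
    """Fetch outstanding recommendations as parsed JSON."""
    request = urllib.request.Request(recommendations_url(nomad_addr, namespace))
    if token:
        request.add_header("X-Nomad-Token", token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only builds the URL; call list_recommendations against a live cluster.
    print(recommendations_url("http://127.0.0.1:4646", namespace="default"))
```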