Nomad
spread Block
Placement | job -> spread
          | job -> group -> spread
The spread block allows operators to increase the failure tolerance of their
applications by specifying a node attribute that allocations should be spread
over. This allows operators to spread allocations over attributes such as
datacenter, availability zone, or even rack in a physical datacenter.
By default, when spread is omitted, the scheduler will attempt to place
allocations from the same job on different nodes, while bin packing
allocations from different jobs together. When using spread, the scheduler
will attempt to place allocations equally among the available values of the
given target.
job "docs" {
  # Spread allocations over all datacenters
  spread {
    attribute = "${node.datacenter}"
  }

  group "example" {
    # Spread allocations over each rack based on desired percentage
    spread {
      attribute = "${meta.rack}"

      target "r1" {
        percent = 60
      }

      target "r2" {
        percent = 40
      }
    }
  }
}
Nodes are scored according to how closely they match the desired target percentage defined in the spread block. Spread scores are combined with other scoring factors such as bin packing.
A job or task group can have more than one spread criterion, with weights to express relative preference.
Spread criteria are treated as a soft preference by the Nomad scheduler. If no nodes match a given spread criterion, placement is still successful. Because the scheduler does not score every node for every placement, allocations may not be perfectly spread. Spread works best on attributes whose values each cover a similar number of nodes: identically configured racks or similarly configured datacenters.
Spread may be expressed on attributes or client metadata. Additionally, spread may be specified at the job and group levels for flexibility. Job-level spread criteria are inherited by all task groups in the job.
spread Parameters
- attribute (string: "") - Specifies the name or reference of the attribute to use. This can be any of the Nomad interpolated values.
- target (target: <optional>) - Specifies one or more target percentages for each value of the attribute in the spread block. If this is omitted, Nomad will spread allocations evenly across all values of the attribute.
- weight (integer: 0) - Specifies a weight for the spread block. The weight is used during scoring and must be an integer between 0 and 100. Weights can be used when there is more than one spread or affinity block to express relative preference across them.
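As an illustration of how weights express relative preference, the following minimal sketch places two spread blocks with unequal weights in one task group. The 80/20 split and the count are arbitrary illustrative values, and the rack attribute assumes that clients define a rack key in their metadata.

```hcl
group "example" {
  count = 10

  # Stronger preference: balance allocations across datacenters.
  spread {
    attribute = "${node.datacenter}"
    weight    = 80
  }

  # Weaker preference: balance allocations across racks.
  # Assumes each client sets meta.rack; the values are site-specific.
  spread {
    attribute = "${meta.rack}"
    weight    = 20
  }
}
```

With these weights, a node that improves the datacenter balance should generally score better than one that only improves the rack balance. Both remain soft preferences.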
target Parameters
- value (string: "") - Specifies a target value of the attribute from a spread block.
- percent (integer: 0) - Specifies the percentage associated with the target value.
Comparison to spread Scheduling Algorithm
The spread block is not the same concept as setting the scheduler algorithm to
"spread" instead of "binpack". Setting the scheduler algorithm impacts all jobs
on a cluster (or node pool), and adjusts the tendency of the scheduler to place
workloads from different jobs on the same set of nodes or not. The spread block
impacts how the scheduler places allocations for a given job.
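For contrast, the scheduler algorithm is set in the server agent configuration, not in a job file. The fragment below is a sketch that assumes the default_scheduler_config block of the server configuration. That block is understood to apply only when the cluster is first bootstrapped, so an existing cluster would need to be changed through the operator API or CLI instead.

```hcl
# Agent configuration (server), not a job specification.
server {
  enabled = true

  # Initial cluster-wide scheduler settings. Assumed to be ignored once
  # the cluster has been bootstrapped.
  default_scheduler_config {
    scheduler_algorithm = "spread" # the default is "binpack"
  }
}
```

This setting changes how workloads from all jobs are packed onto nodes. It does not replace a per-job spread block over a specific attribute.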
Scheduling Performance
Using the spread block can have a significant impact on scheduling
performance. For each allocation in service and batch jobs, the scheduler
iterates over nodes until it finds a small number of feasible nodes. Those
feasible nodes are then scored to find the best placement.
When spread is omitted, this limit is 2 for batch jobs and the log2 of the
total number of nodes in the datacenter and node pool (with a minimum of 2) for
service jobs. When the spread block is present, the scheduler instead scores a
number of nodes in the datacenter and node pool equal to the task group count
(with a maximum of 100) per allocation. This can result in order-of-magnitude
increases in scheduling times.
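The limits described above can be written out as a small worked example. The symbols are introduced here for illustration only: N is the number of nodes in the datacenter and node pool, and count is the task group count. N is chosen as a power of two because the text does not say how log2 is rounded for other values.

```latex
\begin{align*}
\text{limit}_{\text{batch}}   &= 2 \\
\text{limit}_{\text{service}} &= \max(2,\ \log_2 N) \\
\text{limit}_{\text{spread}}  &= \min(\text{count},\ 100) \\[6pt]
\text{For } N = 1024,\ \text{count} = 100: \quad
\text{limit}_{\text{service}} &= \max(2,\ 10) = 10 \\
\text{limit}_{\text{spread}}  &= \min(100,\ 100) = 100
\end{align*}
```

In this case each allocation is scored against 100 nodes instead of 10. Across all 100 allocations of the task group, that is about 10,000 node scorings instead of about 1,000.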
To monitor scheduling times potentially impacted by spread blocks, examine the
nomad.nomad.worker.invoke_scheduler.* metrics listed in the Key Metrics table.
You can reduce scheduling times by avoiding spread and instead relying on the
default distribution of a job across multiple nodes. If this is not possible,
you may consider reducing the size of the node pool or datacenter to reduce the
number of nodes available for the scheduler to consider.
spread Examples
The following examples show different ways to use the spread block.
Even Spread Across Data Center
This example shows a spread block across the node's datacenter attribute. If we
have two datacenters us-east1 and us-west1, and a task group of count = 10,
Nomad will attempt to place 5 allocations in each datacenter.
spread {
  attribute = "${node.datacenter}"
  weight    = 100
}
Spread With Target Percentages
This example shows a spread block that specifies one target percentage. If we
have three datacenters us-east1, us-east2, and us-west1, and a task group of
count = 10, Nomad will attempt to place 5 of the allocations in us-east1, and
will spread the remaining 5 among the other two datacenters.
spread {
  attribute = "${node.datacenter}"
  weight    = 100

  target "us-east1" {
    percent = 50
  }
}
This example shows a spread block that specifies target percentages for two
different datacenters. If we have two datacenters us-east1 and us-west1, and a
task group of count = 10, Nomad will attempt to place 6 allocations in us-east1
and 4 in us-west1.
spread {
  attribute = "${node.datacenter}"
  weight    = 100

  target "us-east1" {
    percent = 60
  }

  target "us-west1" {
    percent = 40
  }
}
Spread Across Multiple Attributes
This example shows spread blocks with multiple attributes. Consider a Nomad
cluster where there are two datacenters us-east1 and us-west1, and each
datacenter has nodes with ${meta.rack} being r1 or r2. With the following spread
blocks used on a job whose task group has count = 12, Nomad will attempt to
place 6 allocations in each datacenter. Within a datacenter, Nomad will attempt
to place 3 allocations in nodes on rack r1, and 3 allocations in nodes on rack
r2.
spread {
  attribute = "${node.datacenter}"
  weight    = 50
}

spread {
  attribute = "${meta.rack}"
  weight    = 50
}