Nomad
Resource quotas
Enterprise Only
The functionality described here is available only in Nomad Enterprise with the Governance & Policy module. To explore Nomad Enterprise features, you can sign up for a free 30-day trial from here.
Nomad Enterprise provides support for resource quotas, which allow operators to restrict the aggregate resource usage of namespaces. Once a quota specification is attached to a namespace, the Nomad cluster will count all resource usage by jobs in that namespace toward the quota limits. If the resource is exhausted, allocations within the namespaces will be queued until resources become available—by other jobs finishing or the quota being expanded.
In this guide, you'll create a quota that limits resources in the global region. You will then assign the quota to a namespace where you will deploy a job. Finally, you'll learn how to secure the quota with ACLs.
Define and create a quota
You can manage resource quotas by using the nomad quota
subcommand. To
get started with creating a quota specification, run nomad quota init
which
produces an example quota specification.
$ nomad quota init
Example quota specification written to spec.hcl
The file spec.hcl
defines a limit for the global region. Additional limits may
be specified in order to limit other regions.
# spec.hcl
name = "default-quota"
description = "Limit the shared default namespace"
# Create a limit for the global region. Additional limits may
# be specified in-order to limit other regions.
limit {
region = "global"
region_limit {
cores = 0
cpu = 2500
memory = 1000
memory_max = 1000
device "nvidia/gpu/1080ti" {
count = 1
}
}
variables_limit = 1000
}
Resource limits
When specifying resource limits the following enforcement behaviors are defined:
limit < 0
: A limit less than zero disallows any access to the resource.limit == 0
: A limit of zero allows unlimited access to the resource.limit > 0
: A limit greater than zero enforces that the consumption is less than or equal to the given limit.
Note: Limits that specify the count of devices work differently. A count > 0
enforces that the consumption is less than or equal to the given limit, and setting
it to 0
disallows any usage of the given device. Unlike other limits, device count
cannot be set to a negative value. To allow unlimited access to a device
resource, remove it from the region limit.
A quota specification is composed of one or more resource limits. Each limit applies to a particular Nomad region. Within the limit object, operators can specify the allowed CPU and memory usage.
Create the quota
To create the quota, run nomad quota apply
with the filename of your quota specification.
$ nomad quota apply spec.hcl
Successfully applied quota specification "default-quota"!
Check for success with nomad quota list
.
$ nomad quota list
Name Description
default-quota Limit the shared default namespace
Attach a quota to a namespace
In order for a quota to be enforced, you have to attach the quota specification
to a namespace. This can be done using the nomad namespace apply
command. Add
the new quota specification to the default
namespace as follows:
$ nomad namespace apply -quota default-quota default
Successfully applied namespace "default"!
View quota information
Now that you have attached a quota to a namespace, you can run a job in the default namespace.
Initialize the job.
$ nomad job init
Example job file written to example.nomad.hcl
Run the job in the default namespace.
$ nomad job run -detach example.nomad.hcl
Job registration successful
Evaluation ID: 985a1df8-0221-b891-5dc1-4d31ad4e2dc3
Check the status.
$ nomad quota status default-quota
Name = default-quota
Description = Limit the shared default namespace
Limits = 1
Quota Limits
Region CPU Usage Core Usage Memory Usage Memory Max Usage Variables Usage
global 0 / 2500 0 / inf 0 / 1000 0 / 1000 0 / 1000
Quota Device Limits
Region Device Name Device Usage
global nvidia/gpu/1080ti 1 / 1
Notice that the newly created job is accounted against the quota specification since it is being run in the namespace attached to the "default-quota" quota.
$ nomad job run -detach example.nomad.hcl
Job registration successful
Evaluation ID: ce8e1941-0189-b866-3dc4-7cd92dc38a69
Now, add more instances of the job by changing count = 1
to count = 4
and
check to see if additional allocations were added.
$ nomad status example
ID = example
Name = example
Submit Date = 10/16/17 10:51:32 PDT
Type = service
Priority = 50
Datacenters = dc1
Status = running
Periodic = false
Parameterized = false
Summary
Task Group Queued Starting Running Failed Complete Lost
cache 1 0 3 0 0 0
Placement Failure
Task Group "cache":
* Quota limit hit "memory exhausted (1024 needed > 1000 limit)"
Latest Deployment
ID = 7cd98a69
Status = running
Description = Deployment is running
Deployed
Task Group Desired Placed Healthy Unhealthy
cache 4 3 0 0
Allocations
ID Node ID Task Group Version Desired Status Created At
6d735236 81f72d90 cache 1 run running 10/16/17 10:51:32 PDT
ce8e1941 81f72d90 cache 1 run running 10/16/17 10:51:32 PDT
9b8e185e 81f72d90 cache 1 run running 10/16/17 10:51:24 PDT
The output indicates that Nomad created two more allocations but did not place the fourth allocation, which would have caused the quota to be oversubscribed on memory.
Secure quota with ACLs
Access to quotas can be restricted using ACLs. As an example, you could create an ACL policy that allows read-only access to quotas.
# Allow read only access to quotas.
quota {
policy = "read"
}
Proper ACLs are necessary to prevent users from bypassing quota enforcement by increasing or removing the quota specification.
Design for federated clusters
Nomad makes working with quotas in a federated cluster simple by replicating quota specifications from the authoritative Nomad region. This allows operators to interact with a single cluster but create quota specifications that apply to all Nomad clusters.
As an example, you can create a quota specification that applies to two regions:
name = "federated-example"
description = "A single quota spec affecting multiple regions"
# Create a limits for two regions
limit {
region = "europe"
region_limit {
cpu = 20000
memory = 10000
}
}
limit {
region = "asia"
region_limit {
cpu = 10000
memory = 5000
}
}
Apply the specification.
$ nomad quota apply spec.hcl
Successfully applied quota specification "federated-example"!
Now that the specification is applied and attached to a namespace with jobs in each region,
use the nomad quota status
command to observe how the enforcement applies
across federated clusters.
$ nomad quota status federated-example
Name = federated-example
Description = A single quota spec affecting multiple regions
Limits = 2
Quota Limits
Region CPU Usage Memory Usage
asia 2500 / 10000 1000 / 5000
europe 8800 / 20000 6000 / 10000
Learn more about quotas
For specific details about working with quotas, consult the quota commands and HTTP API documentation.