---
layout: docs
page_title: resources block in the job specification
description: |-
  The "resources" block describes the requirements a task needs to execute.
  Resource requirements include memory, cpu, and more.
---

# `resources` block in the job specification

<Placement groups={['job', 'group', 'task', 'resources']} />

The `resources` block describes the requirements a task needs to execute.
Resource requirements include memory, CPU, and more.

```hcl
job "docs" {
  group "example" {
    task "server" {
      resources {
        cpu    = 100
        memory = 256

        device "nvidia/gpu" {
          count = 2
        }
      }
    }
  }
}
```

## Parameters

- `cpu` `(int: 100)` - Specifies the CPU required to run this task in MHz.

- `cores` <code>(`int`: <optional>)</code> - Specifies the number of CPU cores
  to reserve specifically for the task. This may not be used with `cpu`. The behavior
  of setting `cores` is specific to each task driver (e.g. [docker][docker_cpu], [exec][exec_cpu]).

- `memory` `(int: 300)` - Specifies the memory required in MB.

- `memory_max` <code>(`int`: <optional>)</code> - Optionally, specifies the
  maximum memory the task may use, if the client has excess memory capacity, in MB.
  See [Memory Oversubscription](#memory-oversubscription) for more details.

- `numa` <code>([Numa][]: <optional>)</code> - Specifies the NUMA scheduling
  preference for the task. Requires the use of `cores`.

- `device` <code>([Device][]: <optional>)</code> - Specifies the device
  requirements. This may be repeated to request multiple device types.

- `secrets` <code>(`int`: <optional>)</code> - Specifies the size of the
  [`secrets/`][] directory in MB, on platforms where the directory is a
  tmpfs. If set, the scheduler adds the `secrets` value to the `memory` value
  when allocating resources on a client, and this value will be included in the
  allocated resources shown by the `nomad alloc status` and `nomad node status`
  commands. If unset, the client will allocate 1 MB of tmpfs space and it will
  not be counted for scheduling purposes or included in allocated resources. You
  should not set this value if the workload will be placed on a platform where
  tmpfs is unsupported, because it will still be counted for scheduling
  purposes. The sketch after this list illustrates this accounting.

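A minimal sketch of that accounting (the values here are illustrative
assumptions, not taken from the original examples):

```hcl
resources {
  memory = 256

  # The scheduler counts memory + secrets, so this task is placed
  # against 272 MB of client memory, and both values appear in
  # `nomad alloc status` and `nomad node status` output.
  secrets = 16
}
```
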
## Examples

The following examples only show the `resources` blocks. Remember that the
`resources` block is only valid in the placements listed above.

### Cores

This example specifies that the task requires 2 reserved cores. With this block,
Nomad finds a client with enough spare capacity to reserve 2 cores exclusively
for the task. Unlike the `cpu` field, the task does not share CPU time with any
other tasks managed by Nomad on the client.

```hcl
resources {
  cores = 2
}
```

If `cores` and `cpu` are both defined in the same resource block, validation of
the job fails.

Refer to [How Nomad Uses CPU][concepts-cpu] for more details on Nomad's
reservation of CPU resources.

### Memory

This example specifies the task requires 2 GB of RAM to operate. 2 GB is the
equivalent of 2048 MB:

```hcl
resources {
  memory = 2048
}
```

### Devices

This example shows a device constraint as specified in the [device][] block,
which requires two NVIDIA GPUs to be made available:

```hcl
resources {
  device "nvidia/gpu" {
    count = 2
  }
}
```

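### NUMA

The `numa` parameter can pin the reserved cores to a single NUMA node. This is
a minimal sketch based on the [Numa][] block specification; the `affinity`
value shown is an assumed typical setting, not part of the original example
set:

```hcl
resources {
  # NUMA scheduling requires the use of cores.
  cores = 4

  numa {
    # Ask the scheduler to place all 4 cores on the same NUMA node.
    affinity = "require"
  }
}
```
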
## Memory oversubscription

Setting task memory limits requires balancing the risk of interrupting tasks
against the risk of wasting resources. If a task memory limit is set too low,
the task may exceed the limit and be interrupted; if the task memory is too
high, the cluster is left underutilized.

To help maximize cluster memory utilization while allowing a safety margin for
unexpected load spikes, Nomad allows job authors to set two separate memory
limits:

* `memory`: the reserve limit that represents the task's typical memory usage.
  The Nomad scheduler uses this number to reserve resources and place the task.

* `memory_max`: the maximum memory the task may use, if the client has excess
  available memory. The task may be terminated if it exceeds this limit. The
  sketch after this list shows both limits set together.

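A minimal sketch with both limits set (the values are illustrative):

```hcl
resources {
  # Reserve limit: the scheduler places the task against 256 MB.
  memory     = 256

  # Oversubscription limit: the task may grow up to 512 MB when the
  # client has excess memory, and may be killed beyond that.
  memory_max = 512
}
```
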
If a client's memory becomes contended or low, the operating system will
pressure the running tasks to free up memory. If the contention persists, Nomad
may kill oversubscribed tasks and reschedule them to other clients. The exact
mechanism for memory pressure is specific to the task driver, operating system,
and application runtime.

The `memory_max` limit attribute is currently supported by the official
`raw_exec`, `exec2`, `exec`, `docker`, `podman`, and `java` task drivers. Consult the
documentation of community-supported task drivers for their memory
oversubscription support.

Memory oversubscription is opt-in. Nomad operators can enable [Memory
Oversubscription in the scheduler configuration][api_sched_config]. Enterprise
customers can use [Resource Quotas][quota_spec] to limit the memory
oversubscription and enable or disable memory oversubscription per [node
pool][np_sched_config].

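As a sketch of one way to turn it on, assuming a freshly bootstrapped cluster:
the server agent's `default_scheduler_config` block can seed the scheduler
configuration. This only sets the initial value; the [scheduler configuration
API][api_sched_config] remains the way to change it on a running cluster:

```hcl
# Server agent configuration (sketch). The default_scheduler_config block
# seeds the scheduler configuration when the cluster is first bootstrapped.
server {
  enabled = true

  default_scheduler_config {
    memory_oversubscription_enabled = true
  }
}
```
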
To avoid degrading the cluster experience, we recommend examining and monitoring
resource utilization and considering the following suggestions:

* Set `oom_score_adj` for Linux host services that aren't managed by Nomad, such as
  Docker, logging services, and the Nomad agent itself. For systemd services, you
  can use the [`OOMScoreAdj` field](https://github.com/hashicorp/nomad/blob/v1.0.0/dist/systemd/nomad.service#L25).

* Monitor hosts for memory utilization and set alerts on out-of-memory errors.

* Set the [client `reserved`](/nomad/docs/configuration/client#reserved) with enough
  memory for host services that aren't managed by Nomad as well as a buffer
  for the memory excess. For example, if the client reserved memory is 1 GB,
  the allocations on the host may exceed their soft memory limit by almost
  1 GB in aggregate before the memory becomes contended and allocations get
  killed. The sketch after this list shows such a configuration.

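A minimal sketch of a client configuration reserving the 1 GB from the example
above:

```hcl
# Client agent configuration (sketch). The reserved block withholds
# resources from what the client offers to the scheduler.
client {
  enabled = true

  reserved {
    # 1 GB of headroom for host services and oversubscription excess.
    memory = 1024
  }
}
```
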
[api_sched_config]: /nomad/api-docs/operator/scheduler#update-scheduler-configuration
[device]: /nomad/docs/job-specification/device 'Nomad device Job Specification'
[docker_cpu]: /nomad/docs/deploy/task-driver/docker#cpu
[exec_cpu]: /nomad/docs/deploy/task-driver/exec#cpu
[np_sched_config]: /nomad/docs/other-specifications/node-pool#memory_oversubscription_enabled
[quota_spec]: /nomad/docs/other-specifications/quota
[numa]: /nomad/docs/job-specification/numa 'Nomad NUMA Job Specification'
[`secrets/`]: /nomad/docs/reference/runtime-environment-settings#secrets
[concepts-cpu]: /nomad/docs/architecture/cpu