diff --git a/website/source/docs/devices/community.html.md b/website/source/docs/devices/community.html.md
new file mode 100644
index 000000000..514cc2878
--- /dev/null
+++ b/website/source/docs/devices/community.html.md
@@ -0,0 +1,16 @@
+---
+layout: "docs"
+page_title: "Drivers: Custom"
+sidebar_current: "docs-devices-community"
+description: |-
+  Create custom task drivers for Nomad.
+---
+
+# Custom Drivers
+
+Nomad does not currently support pluggable task drivers, however the
+interface that a task driver must implement is minimal. In the short term,
+custom drivers can be implemented in Go and compiled into the binary,
+however in the long term we plan to expose a plugin interface such that
+task drivers can be dynamically registered without recompiling the Nomad binary.
+
diff --git a/website/source/docs/devices/index.html.md b/website/source/docs/devices/index.html.md
new file mode 100644
index 000000000..9d2831c33
--- /dev/null
+++ b/website/source/docs/devices/index.html.md
@@ -0,0 +1,24 @@
+---
+layout: "docs"
+page_title: "Device Plugins"
+sidebar_current: "docs-devices"
+description: |-
+  Device Plugins are used to expose devices to tasks in Nomad.
+---
+
+# Device Plugins
+
+Task drivers are used by Nomad clients to execute a task and provide resource
+isolation. By having extensible task drivers, Nomad has the flexibility to
+support a broad set of workloads across all major operating systems.
+
+The list of supported task drivers is provided on the left of this page.
+Each task driver documents the configuration available in a
+[job specification](/docs/job-specification/index.html), the environments it
+can be used in, and the resource isolation mechanisms available.
+
+Nomad strives to mask the details of running a task from users and instead
+provides a clean abstraction. It is possible for the same task to be executed
+with different isolation levels depending on the client running the task.
+The goal is to use the strictest isolation available and gracefully degrade
+protections where necessary.
diff --git a/website/source/docs/devices/nvidia.html.md b/website/source/docs/devices/nvidia.html.md
new file mode 100644
index 000000000..b8187c9eb
--- /dev/null
+++ b/website/source/docs/devices/nvidia.html.md
@@ -0,0 +1,118 @@
+---
+layout: "docs"
+page_title: "Drivers: Raw Exec"
+sidebar_current: "docs-devices-nvidia"
+description: |-
+  The Raw Exec task driver simply fork/execs and provides no isolation.
+---
+
+# Raw Fork/Exec Driver
+
+Name: `raw_exec`
+
+The `raw_exec` driver is used to execute a command for a task without any
+isolation. Further, the task is started as the same user as the Nomad process.
+As such, it should be used with extreme care and is disabled by default.
+
+## Task Configuration
+
+```hcl
+task "webservice" {
+  driver = "raw_exec"
+
+  config {
+    command = "my-binary"
+    args    = ["-flag", "1"]
+  }
+}
+```
+
+The `raw_exec` driver supports the following configuration in the job spec:
+
+* `command` - The command to execute. Must be provided. If executing a binary
+  that exists on the host, the path must be absolute. If executing a binary that
+  is downloaded from an [`artifact`](/docs/job-specification/artifact.html), the
+  path can be relative from the allocation's root directory.
+
+* `args` - (Optional) A list of arguments to the `command`. References
+  to environment variables or any [interpretable Nomad
+  variables](/docs/runtime/interpolation.html) will be interpreted before
+  launching the task.
+
+## Examples
+
+To run a binary present on the Node:
+
+```
+task "example" {
+  driver = "raw_exec"
+
+  config {
+    # When running a binary that exists on the host, the path must be absolute.
+    command = "/bin/sleep"
+    args    = ["1"]
+  }
+}
+```
+
+To execute a binary downloaded from an [`artifact`](/docs/job-specification/artifact.html):
+
+```
+task "example" {
+  driver = "raw_exec"
+
+  config {
+    command = "name-of-my-binary"
+  }
+
+  artifact {
+    source = "https://internal.file.server/name-of-my-binary"
+    options {
+      checksum = "sha256:abd123445ds4555555555"
+    }
+  }
+}
+```
+
+## Client Requirements
+
+The `raw_exec` driver can run on all supported operating systems. For security
+reasons, it is disabled by default. To enable raw exec, the Nomad client
+configuration must explicitly enable the `raw_exec` driver in the client's
+[options](/docs/configuration/client.html#options):
+
+```
+client {
+  options = {
+    "driver.raw_exec.enable" = "1"
+  }
+}
+```
+
+## Client Options
+
+* `driver.raw_exec.enable` - Specifies whether the driver should be enabled or
+  disabled.
+
+* `driver.raw_exec.no_cgroups` - Specifies whether the driver should not use
+  cgroups to manage the process group launched by the driver. By default,
+  cgroups are used to manage the process tree to ensure full cleanup of all
+  processes started by the task. The driver only uses cgroups when Nomad is
+  launched as root, on Linux and when cgroups are detected.
+
+## Client Attributes
+
+The `raw_exec` driver will set the following client attributes:
+
+* `driver.raw_exec` - This will be set to "1", indicating the driver is available.
+
+## Resource Isolation
+
+The `raw_exec` driver provides no isolation.
+
+If the launched process creates a new process group, it is possible that Nomad
+will leak processes on shutdown unless the application forwards signals
+properly. Nomad will not leak any processes if cgroups are being used to manage
+the process tree. Cgroups are used on Linux when Nomad is being run with
+appropriate privileges, the cgroup system is mounted and the operator hasn't
+disabled cgroups for the driver.
diff --git a/website/source/docs/job-specification/device.html.md b/website/source/docs/job-specification/device.html.md
new file mode 100644
index 000000000..783edc6fd
--- /dev/null
+++ b/website/source/docs/job-specification/device.html.md
@@ -0,0 +1,255 @@
+---
+layout: "docs"
+page_title: "device Stanza - Job Specification"
+sidebar_current: "docs-job-specification-device"
+description: |-
+  The "device" stanza is used to require a certain device be made available
+  to the task.
+---
+
+# `device` Stanza
+
+| Placement | job -> group -> task -> resources -> **device** |
+|---|---|
+
+- `constraint` ([Constraint][]: nil) - Constraints to restrict
+ which devices are eligible. This can be provided multiple times to define
+ additional constraints. See below for available attributes.
+
+- `affinity` ([Affinity][]: nil) - Affinity to specify a preference
+ for which devices get selected. This can be provided multiple times to define
+ additional affinities. See below for available attributes.
+
+## `device` Constraint and Affinity Attributes
+
+The set of attributes available for use in a `constraint` or `affinity` are as
+follows:
+
+| Variable | Description | Example Value |
+|---|---|---|
+| `${device.type}` | The type of device | "gpu", "tpu", "fpga" |
+| `${device.vendor}` | The device's vendor | "amd", "nvidia", "intel" |
+| `${device.model}` | The device's model | "1080ti" |
+| `${device.attr.<property>}` | Property of the device | `${device.attr.memory} => 8 GiB` |
+
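+
+As a minimal sketch of how these attributes might be used, the following
+`device` stanza requests GPUs and filters them with a `constraint` and an
+`affinity`. The device name `nvidia/gpu`, the `count`, the thresholds, and the
+`weight` are illustrative assumptions rather than values taken from this page;
+only `${device.attr.memory}` appears in the table above.
+
+```hcl
+resources {
+  # Assumed device name and count, for illustration only.
+  device "nvidia/gpu" {
+    count = 2
+
+    # Only consider devices with at least 2 GiB of memory.
+    constraint {
+      attribute = "${device.attr.memory}"
+      operator  = ">="
+      value     = "2 GiB"
+    }
+
+    # Prefer, but do not require, devices with at least 4 GiB of memory.
+    affinity {
+      attribute = "${device.attr.memory}"
+      operator  = ">="
+      value     = "4 GiB"
+      weight    = 75
+    }
+  }
+}
+```
+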
+| Base Unit | Values |
+|---|---|
+| Byte | **Base 2**: KiB, MiB, GiB, TiB, PiB, EiB; **Base 10**: kB, KB (equivalent to kB), MB, GB, TB, PB, EB |
+| Byte Rates | **Base 2**: KiB/s, MiB/s, GiB/s, TiB/s, PiB/s, EiB/s; **Base 10**: kB/s, KB/s (equivalent to kB/s), MB/s, GB/s, TB/s, PB/s, EB/s |
+| Hertz | MHz, GHz |
+| Watts | mW, W, kW, MW, GW |
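+
+To illustrate the units above, and assuming Nomad converts values that share a
+base unit before comparing them, the two constraints below would be
+equivalent, since 4 GiB = 4096 MiB. The thresholds are hypothetical.
+
+```hcl
+# Threshold expressed in GiB.
+constraint {
+  attribute = "${device.attr.memory}"
+  operator  = ">="
+  value     = "4 GiB"
+}
+
+# The same threshold expressed in MiB (4 * 1024 = 4096).
+constraint {
+  attribute = "${device.attr.memory}"
+  operator  = ">="
+  value     = "4096 MiB"
+}
+```
+
+Base 2 and base 10 units are not interchangeable: 4 GiB is 4,294,967,296
+bytes, while 4 GB is 4,000,000,000 bytes.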