nomad/website/content/plugins/author/task-driver.mdx
Aimee Ukasick d305f32017 Docs: Plugin authoring guide (#26395)
2025-08-08 14:55:58 -05:00


---
layout: docs
page_title: Create task driver plugin for Nomad
description: Learn how to create a Nomad task driver plugin to extend Nomad's workload execution functionality.
---
# Create a task driver plugin
This page provides conceptual information for creating a task driver plugin to
extend Nomad's workload execution functionality.
Task drivers in Nomad are the runtime components that execute workloads. For a
real-world example of a Nomad task driver plugin implementation, refer to the
[exec2 driver][].
## Authoring a task driver plugin
Authoring a task driver consists of implementing the [BasePlugin][base-plugin]
and [DriverPlugin][driverplugin] interfaces and adding a main package to launch
the plugin.
The [nomad-skeleton-driver-plugin project][skeletonproject] exists to help
bootstrap the development of new task driver plugins. It provides most of the
boilerplate necessary for a task driver plugin, along with detailed comments.
### Lifecycle and state
A task driver plugin is long-lived and its lifetime is not bound to the Nomad
client. This means that the Nomad client can restart without restarting the
driver. Nomad ensures that one instance of the task driver is running. If
the task driver crashes or otherwise terminates, Nomad launches another instance
of it.
Task drivers should maintain as little state as possible. State for a task is
stored by the Nomad client on task creation. This enables a pattern where the
task driver can maintain an in-memory state of the running tasks, and if
necessary the Nomad client can recover tasks into the task driver state.
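
The in-memory state pattern described above can be sketched as a small, concurrency-safe map keyed by task ID. This is a minimal illustration that uses only the Go standard library. The `taskStore` and `taskHandle` names are hypothetical stand-ins for a driver's own bookkeeping types, not part of the Nomad API.

```go
package main

import (
	"fmt"
	"sync"
)

// taskHandle is a hypothetical stand-in for the driver's per-task bookkeeping.
type taskHandle struct {
	pid int
}

// taskStore is a concurrency-safe, in-memory map of task ID to handle.
// Nothing here is persisted; the Nomad client can repopulate it on recovery.
type taskStore struct {
	mu    sync.RWMutex
	store map[string]*taskHandle
}

func newTaskStore() *taskStore {
	return &taskStore{store: make(map[string]*taskHandle)}
}

// Set records or replaces the handle for a task ID.
func (ts *taskStore) Set(id string, h *taskHandle) {
	ts.mu.Lock()
	defer ts.mu.Unlock()
	ts.store[id] = h
}

// Get returns the handle for a task ID and whether it is tracked.
func (ts *taskStore) Get(id string) (*taskHandle, bool) {
	ts.mu.RLock()
	defer ts.mu.RUnlock()
	h, ok := ts.store[id]
	return h, ok
}

// Delete forgets a task ID. Deleting an unknown ID is a no-op.
func (ts *taskStore) Delete(id string) {
	ts.mu.Lock()
	defer ts.mu.Unlock()
	delete(ts.store, id)
}

func main() {
	tasks := newTaskStore()
	tasks.Set("task-1", &taskHandle{pid: 4242})
	if h, ok := tasks.Get("task-1"); ok {
		fmt.Println("tracking pid", h.pid)
	}
	tasks.Delete("task-1")
	_, ok := tasks.Get("task-1")
	fmt.Println("still tracked:", ok)
}
```

Because the store holds no durable data, losing it in a driver crash is acceptable as long as each task can be rebuilt from the handle that the client saved.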
## Base plugin API
@include 'plugins/base.mdx'
## Task driver plugin API
### `TaskConfigSchema() (*hclspec.Spec, error)`
This function returns the schema for the task driver configuration of the task.
For more information on `hclspec.Spec`, refer to the [HCL specifications
section](#hcl-specifications).
### `Capabilities() (*Capabilities, error)`
Capabilities define what features the task driver implements.
```go
type Capabilities struct {
	// SendSignals marks the task driver as being able to send signals.
	SendSignals bool

	// Exec marks the task driver as being able to execute arbitrary commands
	// such as health checks. Used by the ScriptExecutor interface.
	Exec bool

	// FSIsolation indicates what kind of filesystem isolation the task driver supports.
	FSIsolation fsisolation.Mode

	// NetIsolationModes lists the set of isolation modes supported by the task driver.
	NetIsolationModes []NetIsolationMode

	// MustInitiateNetwork tells Nomad that the task driver must create the network
	// namespace and that the CreateNetwork and DestroyNetwork RPCs are implemented.
	MustInitiateNetwork bool

	// MountConfigs tells Nomad which mounting config options the task driver supports.
	MountConfigs MountConfigSupport

	// DisableLogCollection indicates this driver has disabled log collection
	// and the client should not start a logmon process.
	DisableLogCollection bool

	// DynamicWorkloadUsers indicates this driver is capable (but not required)
	// of making use of a UID/GID not backed by a user known to the operating
	// system. The allocation of a unique, not-in-use UID/GID is managed by the
	// Nomad client ensuring no overlap.
	DynamicWorkloadUsers bool
}
```
The file system isolation options are the following:
- `fsisolation.Image`: The task driver isolates tasks as machine images.
- `fsisolation.Chroot`: The task driver isolates tasks with `chroot` or
`pivot_root`.
- `fsisolation.Unveil`: The task driver isolates tasks with the
[Landlock LSM][landlock] or another [`unveil`][unveil]-like system.
- `fsisolation.None`: The task driver has no filesystem isolation.
The network isolation modes are the following:
- `NetIsolationModeHost`: The task driver supports disabling network isolation
and using the host network.
- `NetIsolationModeGroup`: The task driver supports using the task group
network namespace.
- `NetIsolationModeTask`: The task driver supports isolating the network to
just the task.
- `NetIsolationModeNone`: There is no network to isolate. This is used for
tasks that the client manages remotely.
### `Fingerprint(context.Context) (<-chan *Fingerprint, error)`
This function is called by the client when the plugin is started. It allows the
driver to indicate its health to the client. The channel returned should
immediately send an initial Fingerprint, then send periodic updates at an
interval that is appropriate for the task driver until the context is canceled.
The fingerprint consists of a `HealthState` and a `HealthDescription` that
inform the client about the driver's health. Additionally, an `Attributes` field
is available for the task driver to add attributes to the client node. The
fingerprint `HealthState` can be one of the following states:
- `HealthStateUndetected`: Indicates that the necessary dependencies for the
driver are not detected on the system. For example, the Java runtime is missing
for the Java driver.
- `HealthStateUnhealthy`: Indicates that something is wrong with the task driver
runtime. For example, the Docker daemon is stopped for the Docker driver.
- `HealthStateHealthy`: Indicates that the task driver is healthy and can run
tasks.
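
The send-immediately-then-periodically behavior can be sketched with a goroutine and a `time.Ticker`. The example below uses only the standard library. The `fingerprint` and `healthState` types are simplified stand-ins for the real `drivers` package types, and `detect`, `fingerprintLoop`, and `collect` are hypothetical helpers introduced for illustration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// healthState and fingerprint are simplified stand-ins for the real types.
type healthState string

const (
	healthStateUndetected healthState = "undetected"
	healthStateHealthy    healthState = "healthy"
)

type fingerprint struct {
	Health            healthState
	HealthDescription string
	Attributes        map[string]string
}

// buildFingerprint runs a hypothetical detection probe and reports the result.
func buildFingerprint(detect func() bool) *fingerprint {
	if !detect() {
		return &fingerprint{
			Health:            healthStateUndetected,
			HealthDescription: "runtime not found",
		}
	}
	return &fingerprint{
		Health:            healthStateHealthy,
		HealthDescription: "ready",
		Attributes:        map[string]string{"driver.example": "1"},
	}
}

// fingerprintLoop sends one fingerprint right away, then one per interval,
// and closes the channel once ctx is canceled.
func fingerprintLoop(ctx context.Context, interval time.Duration, detect func() bool) <-chan *fingerprint {
	ch := make(chan *fingerprint)
	go func() {
		defer close(ch)
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case ch <- buildFingerprint(detect):
			case <-ctx.Done():
				return
			}
			select {
			case <-ticker.C:
			case <-ctx.Done():
				return
			}
		}
	}()
	return ch
}

// collect reads n fingerprints from a fresh loop, then cancels it.
func collect(n int, interval time.Duration, detect func() bool) []*fingerprint {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	ch := fingerprintLoop(ctx, interval, detect)
	out := make([]*fingerprint, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, <-ch)
	}
	return out
}

func main() {
	for _, fp := range collect(3, 50*time.Millisecond, func() bool { return true }) {
		fmt.Println(fp.Health, fp.HealthDescription)
	}
}
```

Every send also selects on `ctx.Done()`, so the goroutine cannot block forever if the client stops reading.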
### `StartTask(*TaskConfig) (*TaskHandle, *DriverNetwork, error)`
This function takes a [`TaskConfig`][taskconfig] that includes all of the
configuration needed to launch the task. Additionally, the task driver
configuration can be decoded from the `TaskConfig` by calling
`*TaskConfig.DecodeDriverConfig(t interface{})`, passing in a pointer to the
task driver specific configuration struct. The `TaskConfig` includes an `ID`
field that all future operations on the task use to reference it.
Drivers return a [`*TaskHandle`][taskhandle] that contains the required
information for the task driver to reattach to the running task in the case of
plugin crashes or restarts. Some of this required state is specific to the
task driver implementation, thus a `DriverState` field exists to allow the task
driver to encode custom state into the struct. The `TaskHandle` provides the
`GetDriverState` and `SetDriverState` helper methods, which remove the need for
the task driver to handle serialization.
A `*DriverNetwork` can optionally be returned to describe the network of the
task if it is modified by the task driver. An example of this is in the Docker
driver where tasks can be attached to a specific Docker network.
If an error occurs, it is expected that the task driver cleans up any created
resources prior to returning the error.
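
The overall shape of a start function (decode the driver configuration, create resources, clean up on failure, return a handle that carries reattach state) can be sketched as follows. This is an illustration under stated assumptions: `taskConfig` and `taskHandle` are simplified stand-ins for the real types, JSON replaces the real encoding, and `driverConfig`, `driverState`, `launch`, and the `cleaned` field are hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// taskConfig and taskHandle are simplified stand-ins for the real types.
type taskConfig struct {
	ID           string
	DriverConfig []byte // encoded driver-specific configuration
}

type taskHandle struct {
	Version     int
	TaskID      string
	DriverState []byte // opaque state used to reattach later
}

// driverConfig and driverState are hypothetical driver-specific structs.
type driverConfig struct {
	Command string `json:"command"`
}

type driverState struct {
	Pid int `json:"pid"`
}

type driver struct {
	tasks   map[string]*taskHandle
	cleaned []string // records cleanup calls so the sketch is observable
}

func newDriver() *driver {
	return &driver{tasks: make(map[string]*taskHandle)}
}

// launch is a hypothetical helper that pretends to start a process.
func (d *driver) launch(cmd string) (int, error) {
	if cmd == "" {
		return 0, errors.New("command is required")
	}
	return 4242, nil
}

func (d *driver) startTask(cfg *taskConfig) (*taskHandle, error) {
	if _, ok := d.tasks[cfg.ID]; ok {
		return nil, fmt.Errorf("task %q already started", cfg.ID)
	}
	var dc driverConfig
	if err := json.Unmarshal(cfg.DriverConfig, &dc); err != nil {
		return nil, fmt.Errorf("failed to decode driver config: %w", err)
	}

	// Pretend to create a resource that must not leak on failure.
	workDir := "work-" + cfg.ID
	cleanup := func() { d.cleaned = append(d.cleaned, workDir) }

	pid, err := d.launch(dc.Command)
	if err != nil {
		cleanup() // release resources before returning the error
		return nil, fmt.Errorf("failed to launch task: %w", err)
	}
	state, err := json.Marshal(driverState{Pid: pid})
	if err != nil {
		cleanup()
		return nil, fmt.Errorf("failed to encode driver state: %w", err)
	}
	h := &taskHandle{Version: 1, TaskID: cfg.ID, DriverState: state}
	d.tasks[cfg.ID] = h
	return h, nil
}

func main() {
	d := newDriver()
	h, err := d.startTask(&taskConfig{ID: "t1", DriverConfig: []byte(`{"command":"sleep"}`)})
	fmt.Println(h.TaskID, string(h.DriverState), err)
	_, err = d.startTask(&taskConfig{ID: "t2", DriverConfig: []byte(`{}`)})
	fmt.Println(err, d.cleaned)
}
```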
#### Logging
Nomad handles all rotation and plumbing of task logs. In order for task stdout
and stderr to be received by Nomad, they must be written to the correct
location. Prior to starting the task through the task driver, the Nomad client
creates FIFOs for stdout and stderr. These paths are given to the task driver in
the `TaskConfig`. Use the [`fifo` package][fifopackage] to support
cross-platform writing to these paths.
#### Dynamic workload users
Nomad is capable of dynamically allocating unused UID/GID values for use by
task drivers when launching a task. These UID/GID values are deallocated when
the task is destroyed. The pool of available UID/GID values can be controlled
in client config via the [users][users] block.
#### TaskHandle schema versioning
A `Version` field is available on the `TaskHandle` struct to facilitate
backwards-compatible recovery of tasks. This field is opaque to Nomad, but it
allows the driver to recover tasks that were created by an older version of the
plugin.
### `RecoverTask(*TaskHandle) error`
When a driver is restarted, it is not expected to persist any internal state to
disk. To support this, Nomad attempts to recover a task that was previously
started if the task driver does not recognize the task ID. During task recovery,
Nomad calls `RecoverTask`, passing the `TaskHandle` that was returned by the
`StartTask` function. If no error is returned, it is expected that the task
driver can now operate on the task by referencing the task ID. If an error
occurs, the Nomad client marks the task as `lost`.
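
Recovery, including handling of older handle versions, can be sketched as follows. The types are simplified stand-ins, JSON replaces the real state encoding, and the two state layouts (`stateV1`, `stateV2`) are invented for illustration. A real driver would also verify that the process still exists and reattach to it.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// taskHandle is a simplified stand-in for the real handle type.
type taskHandle struct {
	Version     int
	TaskID      string
	DriverState []byte
}

// stateV1 and stateV2 are hypothetical state layouts from two plugin versions.
type stateV1 struct {
	ProcessID int `json:"process_id"`
}

type stateV2 struct {
	Pid int `json:"pid"`
}

type driver struct {
	tasks map[string]int // task ID to pid
}

func newDriver() *driver {
	return &driver{tasks: make(map[string]int)}
}

func (d *driver) recoverTask(h *taskHandle) error {
	if h == nil {
		return errors.New("handle cannot be nil")
	}
	// The task is already tracked, so there is nothing to recover.
	if _, ok := d.tasks[h.TaskID]; ok {
		return nil
	}
	var pid int
	switch h.Version {
	case 1: // state written by the older plugin version
		var s stateV1
		if err := json.Unmarshal(h.DriverState, &s); err != nil {
			return fmt.Errorf("failed to decode v1 state: %w", err)
		}
		pid = s.ProcessID
	case 2:
		var s stateV2
		if err := json.Unmarshal(h.DriverState, &s); err != nil {
			return fmt.Errorf("failed to decode v2 state: %w", err)
		}
		pid = s.Pid
	default:
		return fmt.Errorf("unsupported handle version %d", h.Version)
	}
	if pid <= 0 {
		return errors.New("handle does not reference a process")
	}
	d.tasks[h.TaskID] = pid
	return nil
}

func main() {
	d := newDriver()
	old := &taskHandle{Version: 1, TaskID: "old", DriverState: []byte(`{"process_id":7}`)}
	cur := &taskHandle{Version: 2, TaskID: "new", DriverState: []byte(`{"pid":8}`)}
	fmt.Println(d.recoverTask(old), d.recoverTask(cur), d.tasks)
	fmt.Println(d.recoverTask(&taskHandle{Version: 9, TaskID: "x"}))
}
```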
### `WaitTask(context.Context, id string) (<-chan *ExitResult, error)`
The `WaitTask` function is expected to return a channel that sends an
`*ExitResult` when the task exits, or that closes when the context is
canceled. It is also expected that calling `WaitTask` on an exited task
immediately sends an `*ExitResult` on the returned channel.
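
A minimal sketch of these channel semantics follows, using only the standard library. The `task` and `exitResult` types and the `waitWithTimeout` helper are hypothetical. The task's `done` channel is closed once, after the result is stored, so every waiter, including one that arrives after the task exits, observes the same result.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// exitResult is a simplified stand-in for the real exit result type.
type exitResult struct {
	ExitCode int
}

// task is a hypothetical record of a running task. done is closed after
// result is set, which makes the result safe to read by any waiter.
type task struct {
	done   chan struct{}
	result *exitResult
}

func newTask() *task {
	return &task{done: make(chan struct{})}
}

// exit records the result and wakes all waiters. Call it at most once.
func (t *task) exit(code int) {
	t.result = &exitResult{ExitCode: code}
	close(t.done)
}

// waitTask sends the exit result when the task exits, or closes the channel
// without sending when ctx is canceled first.
func waitTask(ctx context.Context, t *task) <-chan *exitResult {
	ch := make(chan *exitResult, 1) // buffered so the send never blocks
	go func() {
		defer close(ch)
		select {
		case <-t.done:
			ch <- t.result
		case <-ctx.Done():
		}
	}()
	return ch
}

// waitWithTimeout is a demo helper. It reports false when the channel
// closed without delivering a result.
func waitWithTimeout(t *task, timeout time.Duration) (*exitResult, bool) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	res, ok := <-waitTask(ctx, t)
	return res, ok
}

func main() {
	finished := newTask()
	finished.exit(3)
	res, ok := waitWithTimeout(finished, time.Second)
	fmt.Println(res.ExitCode, ok)

	running := newTask()
	_, ok = waitWithTimeout(running, 10*time.Millisecond)
	fmt.Println("result delivered:", ok)
}
```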
### `StopTask(taskID string, timeout time.Duration, signal string) error`
The `StopTask` function is expected to stop a running task by sending the given
signal to it. If the task does not stop during the given timeout, the task
driver must forcefully kill the task.
`StopTask` does not clean up resources of the task or remove it from the
driver's internal state. A call to `WaitTask` after `StopTask` is valid and
should be handled.
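
The signal-then-escalate behavior can be sketched as follows. The `process` interface and `fakeProcess` type are hypothetical abstractions over whatever the driver actually runs. Note that the sketch leaves the task in the driver's state, as described above.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// process is a hypothetical abstraction over the workload a driver runs.
type process interface {
	Signal(name string) error
	Kill() error
	Done() <-chan struct{}
}

// stopTask sends the requested signal, waits up to timeout for the process
// to exit, and forcefully kills it if it is still running.
func stopTask(p process, timeout time.Duration, signal string) error {
	if err := p.Signal(signal); err != nil {
		return fmt.Errorf("failed to send %s: %w", signal, err)
	}
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case <-p.Done():
		return nil
	case <-timer.C:
		return p.Kill()
	}
}

// fakeProcess exits on a signal only when graceful is true.
type fakeProcess struct {
	graceful bool
	killed   bool
	once     sync.Once
	done     chan struct{}
}

func newFakeProcess(graceful bool) *fakeProcess {
	return &fakeProcess{graceful: graceful, done: make(chan struct{})}
}

// exit closes done exactly once, no matter how the process stops.
func (f *fakeProcess) exit() {
	f.once.Do(func() { close(f.done) })
}

func (f *fakeProcess) Signal(name string) error {
	if f.graceful {
		f.exit()
	}
	return nil
}

func (f *fakeProcess) Kill() error {
	f.killed = true
	f.exit()
	return nil
}

func (f *fakeProcess) Done() <-chan struct{} {
	return f.done
}

func main() {
	polite := newFakeProcess(true)
	fmt.Println(stopTask(polite, time.Second, "SIGTERM"), "killed:", polite.killed)

	stubborn := newFakeProcess(false)
	fmt.Println(stopTask(stubborn, 10*time.Millisecond, "SIGTERM"), "killed:", stubborn.killed)
}
```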
### `DestroyTask(taskID string, force bool) error`
The `DestroyTask` function cleans up and removes a task that has terminated. If
force is set to true, the task driver must destroy the task even if it is still
running. If `WaitTask` is called after `DestroyTask`, it should return
`drivers.ErrTaskNotFound` as no task state should exist after `DestroyTask` is
called.
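
The interaction between `force`, running tasks, and later lookups can be sketched as follows. `errTaskNotFound` is a local stand-in for `drivers.ErrTaskNotFound`, and the `driver` and `task` types are hypothetical. The error returned when destroying a running task without `force` is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// errTaskNotFound is a local stand-in for drivers.ErrTaskNotFound.
var errTaskNotFound = errors.New("task not found")

// task is a hypothetical record of a task known to the driver.
type task struct {
	running bool
}

type driver struct {
	tasks map[string]*task
}

// destroyTask removes a terminated task, or a running one when force is set.
func (d *driver) destroyTask(id string, force bool) error {
	t, ok := d.tasks[id]
	if !ok {
		return errTaskNotFound
	}
	if t.running && !force {
		return errors.New("cannot destroy running task without force")
	}
	if t.running {
		t.running = false // stand-in for forcefully killing the workload
	}
	delete(d.tasks, id) // no task state remains after a successful destroy
	return nil
}

// waitTask only demonstrates the lookup that a real WaitTask would perform.
func (d *driver) waitTask(id string) error {
	if _, ok := d.tasks[id]; !ok {
		return errTaskNotFound
	}
	return nil
}

func main() {
	d := &driver{tasks: map[string]*task{"web": {running: true}}}
	fmt.Println(d.destroyTask("web", false))
	fmt.Println(d.destroyTask("web", true))
	fmt.Println(errors.Is(d.waitTask("web"), errTaskNotFound))
}
```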
### `InspectTask(taskID string) (*TaskStatus, error)`
The `InspectTask` function returns detailed status information for the
referenced `taskID`.
### `TaskStats(context.Context, id string, time.Duration) (<-chan *cstructs.TaskResourceUsage, error)`
The `TaskStats` function returns a channel which the task driver should send
stats to at the given interval. The driver must send stats at the given interval
until the given context is canceled or the task terminates.
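
A stats loop that honors both stop conditions can be sketched as follows. `resourceUsage` is a simplified stand-in for `cstructs.TaskResourceUsage`, and the `done` channel, `sample` callback, and `collectStats` helper are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// resourceUsage is a simplified stand-in for the real usage type.
type resourceUsage struct {
	MemoryBytes uint64
}

// taskStats sends one sample per interval until ctx is canceled or done is
// closed (the task terminated), then closes the returned channel.
func taskStats(ctx context.Context, done <-chan struct{}, interval time.Duration, sample func() *resourceUsage) <-chan *resourceUsage {
	ch := make(chan *resourceUsage)
	go func() {
		defer close(ch)
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-done:
				return
			case <-ticker.C:
			}
			select {
			case ch <- sample():
			case <-ctx.Done():
				return
			case <-done:
				return
			}
		}
	}()
	return ch
}

// collectStats reads n samples, simulates task termination, and drains the
// channel until the loop closes it.
func collectStats(n int, interval time.Duration) []uint64 {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	done := make(chan struct{})
	var calls uint64
	ch := taskStats(ctx, done, interval, func() *resourceUsage {
		calls++ // only the stats goroutine touches calls
		return &resourceUsage{MemoryBytes: calls * 1024}
	})
	out := make([]uint64, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, (<-ch).MemoryBytes)
	}
	close(done) // the task terminated
	for range ch {
		// Discard any sample that was already in flight.
	}
	return out
}

func main() {
	fmt.Println(collectStats(3, 20*time.Millisecond))
}
```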
### `TaskEvents(context.Context) (<-chan *TaskEvent, error)`
The Nomad client publishes events associated with an allocation. The
`TaskEvents` function allows the task driver to publish driver specific events
about tasks and the Nomad client associates them with the correct
allocation.
An `Eventer` utility, available in the
`github.com/hashicorp/nomad/drivers/shared/eventer` package, implements an event
loop and publishing mechanism for use in the `TaskEvents` function.
### `SignalTask(taskID string, signal string) error`
> Optional - can be skipped by embedding `drivers.DriverSignalTaskNotSupported`
The `SignalTask` function is used by drivers that support sending OS signals
(`SIGHUP`, `SIGKILL`, `SIGUSR1`, and so on) to the task. It is an optional
function and is listed as a capability in the task driver `Capabilities` struct.
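
Because the signal arrives as a string, a driver has to translate it to an OS signal before delivering it. The sketch below uses a small hand-written lookup table. The `supportedSignals` table and `lookupSignal` function are hypothetical, and the table is deliberately limited to signals that Go's `syscall` package defines on all major platforms.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// supportedSignals is a hypothetical, intentionally small lookup table.
var supportedSignals = map[string]os.Signal{
	"SIGHUP":  syscall.SIGHUP,
	"SIGINT":  syscall.SIGINT,
	"SIGKILL": syscall.SIGKILL,
	"SIGTERM": syscall.SIGTERM,
}

// lookupSignal translates a signal name into an os.Signal, rejecting names
// that the driver does not support.
func lookupSignal(name string) (os.Signal, error) {
	sig, ok := supportedSignals[name]
	if !ok {
		return nil, fmt.Errorf("unsupported signal %q", name)
	}
	return sig, nil
}

func main() {
	sig, err := lookupSignal("SIGHUP")
	fmt.Println(sig, err)

	_, err = lookupSignal("SIGBOGUS")
	fmt.Println(err)
}
```

A driver would then deliver the resolved signal to the workload, for example with `(*os.Process).Signal` when the task is a local process.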
### `ExecTask(taskID string, cmd []string, timeout time.Duration) (*ExecTaskResult, error)`
> Optional - can be skipped by embedding `drivers.DriverExecTaskNotSupported`
The `ExecTask` function is used by the Nomad client to execute commands inside
the task execution context. For example, the Docker driver executes commands
inside the running container. `ExecTask` is called for Consul script checks.
@include 'plugins/hcl-specifications.mdx'
[exec2 driver]: https://github.com/hashicorp/nomad-driver-exec2
[base-plugin]: https://github.com/hashicorp/nomad/blob/main/plugins/base/base.go#L17
[driverplugin]: https://github.com/hashicorp/nomad/blob/main/plugins/drivers/driver.go#L51
[skeletonproject]: https://github.com/hashicorp/nomad-skeleton-driver-plugin
[taskconfig]: https://godoc.org/github.com/hashicorp/nomad/plugins/drivers#TaskConfig
[taskhandle]: https://godoc.org/github.com/hashicorp/nomad/plugins/drivers#TaskHandle
[fifopackage]: https://godoc.org/github.com/hashicorp/nomad/client/lib/fifo
[landlock]: https://docs.kernel.org/userspace-api/landlock.html
[unveil]: https://man.openbsd.org/unveil
[users]: /nomad/docs/configuration/client#users-block