mirror of
https://github.com/kemko/nomad.git
synced 2026-01-03 00:45:43 +03:00
* create plugin author guide; remove concepts/plugins * style guide; update links * update cni redirect * move host-volume plugin to /plugins/. Add arch host volume content. * Apply Jeff's style guide updates Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * Create Base plugin API section, link to BasePlugin interface --------- Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
259 lines
12 KiB
Plaintext
259 lines
12 KiB
Plaintext
---
|
|
layout: docs
|
|
page_title: Create task driver plugin for Nomad
|
|
description: Learn how to create a Nomad task driver plugin to extend Nomad's workload execution functionality.
|
|
---
|
|
|
|
# Create a task driver plugin
|
|
|
|
This page provides conceptual information for creating a task driver plugin to
|
|
extend Nomad's workload execution functionality.
|
|
|
|
Task drivers in Nomad are the runtime components that execute workloads. For a
|
|
real world example of a Nomad task driver plugin implementation, refer to the
|
|
[exec2 driver][].
|
|
|
|
## Authoring a task driver plugin
|
|
|
|
Authoring a task driver consists of implementing the [BasePlugin][base-plugin]
|
|
and [DriverPlugin][driverplugin] interfaces and adding a main package to launch
|
|
the plugin.
|
|
|
|
The [nomad-skeleton-driver-plugin project][skeletonproject] exists to help
|
|
bootstrap the development of new task driver plugins. It provides most of the
|
|
boilerplate necessary for a task driver plugin, along with detailed comments.
|
|
|
|
### Lifecycle and state
|
|
|
|
A task driver plugin is long-lived and its lifetime is not bound to the Nomad
|
|
client. This means that the Nomad client can restart without restarting the
|
|
driver. Nomad ensures that one instance of the task driver is running. If
|
|
the task driver crashes or otherwise terminates, Nomad launches another instance
|
|
of it.
|
|
|
|
Task drivers should maintain as little state as possible. State for a task is
|
|
stored by the Nomad client on task creation. This enables a pattern where the
|
|
task driver can maintain an in-memory state of the running tasks, and if
|
|
necessary the Nomad client can recover tasks into the task driver state.
|
|
|
|
## Base plugin API
|
|
|
|
@include 'plugins/base.mdx'
|
|
|
|
## Task driver plugin API
|
|
|
|
### `TaskConfigSchema() (*hclspec.Spec, error)`
|
|
|
|
This function returns the schema for the task driver configuration of the task.
|
|
For more information on `hclspec.Spec`, refer to the [HCL specifications
|
|
section](#hcl-specifications).
|
|
|
|
### `Capabilities() (*Capabilities, error)`
|
|
|
|
Capabilities define what features the task driver implements.
|
|
|
|
```go
|
|
type Capabilities struct {
|
|
// SendSignals marks the task driver as being able to send signals
|
|
SendSignals bool
|
|
|
|
// Exec marks the task driver as being able to execute arbitrary commands
|
|
// such as health checks. Used by the ScriptExecutor interface.
|
|
Exec bool
|
|
|
|
//FSIsolation indicates what kind of filesystem isolation the task driver supports.
|
|
FSIsolation fsisolation.Mode
|
|
|
|
//NetIsolationModes lists the set of isolation modes supported by the task driver
|
|
NetIsolationModes []NetIsolationMode
|
|
|
|
// MustInitiateNetwork tells Nomad that the task driver must create the network
|
|
// namespace and that the CreateNetwork and DestroyNetwork RPCs are implemented.
|
|
MustInitiateNetwork bool
|
|
|
|
// MountConfigs tells Nomad which mounting config options the task driver supports.
|
|
MountConfigs MountConfigSupport
|
|
|
|
// DisableLogCollection indicates this driver has disabled log collection
|
|
// and the client should not start a logmon process.
|
|
DisableLogCollection bool
|
|
|
|
// DynamicWorkloadUsers indicates this driver is capable (but not required)
|
|
// of making use of a UID/GID not backed by a user known to the operating
|
|
// system. The allocation of a unique, not-in-use UID/GID is managed by the
|
|
// Nomad client ensuring no overlap.
|
|
DynamicWorkloadUsers bool
|
|
}
|
|
```
|
|
|
|
The file system isolation options are the following:
|
|
|
|
- `fsisolation.Image`: The task driver isolates tasks as machine images.
|
|
- `fsisolation.Chroot`: The task driver isolates tasks with `chroot` or
|
|
`pivot_root`.
|
|
- `fsisolation.Unveil`: The task driver isolates tasks with the
|
|
[Landlock LSM][landlock] or other [`unveil`][unveil] like system.
|
|
- `fsisolation.None`: The task driver has no filesystem isolation.
|
|
|
|
The network isolation modes are the following:
|
|
|
|
- `NetIsolationModeHost`: The task driver supports disabling network isolation
|
|
and using the host network.
|
|
- `NetIsolationModeGroup`: The task driver supports using the task group
|
|
network namespace.
|
|
- `NetIsolationModeTask`: The task driver supports isolating the network to
|
|
just the task.
|
|
- `NetIsolationModeNone`: There is no network to isolate. This is used for
|
|
task that the client manages remotely.
|
|
|
|
### `Fingerprint(context.Context) (<-chan *Fingerprint, error)`
|
|
|
|
This function is called by the client when the plugin is started. It allows the
|
|
driver to indicate its health to the client. The channel returned should
|
|
immediately send an initial Fingerprint, then send periodic updates at an
|
|
interval that is appropriate for the task driver until the context is canceled.
|
|
|
|
The fingerprint consists of a `HealthState` and `HealthDescription` to inform
|
|
the client about its health. Additionally an `Attributes` field is available for
|
|
the task driver to add additional attributes to the client node. The fingerprint
|
|
`HealthState` can be one of the following states:
|
|
|
|
- `HealthStateUndetected`: Indicates that the necessary dependencies for the
|
|
driver are not detected on the system. Ex. java runtime for the java driver
|
|
- `HealthStateUnhealthy`: Indicates that something is wrong with the task driver
|
|
runtime. Ex. docker daemon stopped for the Docker driver
|
|
- `HealthStateHealthy`: All systems go
|
|
|
|
### `StartTask(*TaskConfig) (*TaskHandle, *DriverNetwork, error)`
|
|
|
|
This function takes a [`TaskConfig`][taskconfig] that includes all of the
|
|
configuration needed to launch the task. Additionally, the task driver
|
|
configuration can be decoded from the `TaskConfig` by calling
|
|
`*TaskConfig.DecodeDriverConfig(t interface{})`, passing in a pointer to the
|
|
task driver specific configuration struct. The `TaskConfig` includes an `ID`
|
|
field which future operations on the task are referenced by.
|
|
|
|
Drivers return a [`*TaskHandle`][taskhandle] that contains the required
|
|
information for the task driver to reattach to the running task in the case of
|
|
plugin crashes or restarts. Some of this required state is specific to the
|
|
task driver implementation, thus a `DriverState` field exists to allow the task
|
|
driver to encode custom state into the struct. Helper fields exist on the
|
|
`TaskHandle` to `GetDriverState` and `SetDriverState` removing the need for the
|
|
task driver to handle serialization.
|
|
|
|
A `*DriverNetwork` can optionally be returned to describe the network of the
|
|
task if it is modified by the task driver. An example of this is in the Docker
|
|
driver where tasks can be attached to a specific Docker network.
|
|
|
|
If an error occurs, it is expected that the task driver cleans up any created
|
|
resources prior to returning the error.
|
|
|
|
#### Logging
|
|
|
|
Nomad handles all rotation and plumbing of task logs. In order for task stdout
|
|
and stderr to be received by Nomad, they must be written to the correct
|
|
location. Prior to starting the task through the task driver, the Nomad client
|
|
creates FIFOs for stdout and stderr. These paths are given to the task driver in
|
|
the `TaskConfig`. Use the [`fifo` package][fifopackage] to support cross
|
|
platform writing to these paths.
|
|
|
|
#### Dynamic workload users
|
|
|
|
Nomad is capable of dynamically allocating unused UID/GID values for use by
|
|
task drivers when launching a task. These UID/GID values are deallocated when
|
|
the task is destroyed. The pool of available UID/GID values can be controlled
|
|
in client config via the [users][users] block.
|
|
|
|
#### TaskHandle schema versioning
|
|
|
|
A `Version` field is available on the TaskHandle struct to facilitate backwards
|
|
compatible recovery of tasks. This field is opaque to Nomad, but it allows the
|
|
driver to handle recover tasks that were created by an older version of the
|
|
plugin.
|
|
|
|
### `RecoverTask(*TaskHandle) error`
|
|
|
|
When a driver is restarted, it is not expected to persist any internal state to
|
|
disk. To support this, Nomad attempts to recover a task that was previously
|
|
started if the task driver does not recognize the task ID. During task recovery,
|
|
Nomad calls `RecoverTask`, passing the `TaskHandle` that was returned by the
|
|
`StartTask` function. If no error is returned, it is expected that the task
|
|
driver can now operate on the task by referencing the task ID. If an error
|
|
occurs, the Nomad client marks the task as `lost`.
|
|
|
|
### `WaitTask(context.Context, id string) (<-chan *ExitResult, error)`
|
|
|
|
The `WaitTask` function is expected to return a channel that sends an
|
|
`*ExitResult` when the task exits or close the channel when the context is
|
|
canceled. It is also expected that calling `WaitTask` on an exited task
|
|
immediately sends an `*ExitResult` on the returned channel.
|
|
|
|
### `StopTask(taskID string, timeout time.Duration, signal string) error`
|
|
|
|
The `StopTask` function is expected to stop a running task by sending the given
|
|
signal to it. If the task does not stop during the given timeout, the task
|
|
driver must forcefully kill the task.
|
|
|
|
`StopTask` does not clean up resources of the task or remove it from the
|
|
driver's internal state. A call to `WaitTask` after `StopTask` is valid and
|
|
should be handled.
|
|
|
|
### `DestroyTask(taskID string, force bool) error`
|
|
|
|
The `DestroyTask` function cleans up and removes a task that has terminated. If
|
|
force is set to true, the task driver must destroy the task even if it is still
|
|
running. If `WaitTask` is called after `DestroyTask`, it should return
|
|
`drivers.ErrTaskNotFound` as no task state should exist after `DestroyTask` is
|
|
called.
|
|
|
|
### `InspectTask(taskID string) (*TaskStatus, error)`
|
|
|
|
The `InspectTask` function returns detailed status information for the
|
|
referenced `taskID`.
|
|
|
|
### `TaskStats(context.Context, id string, time.Duration) (<-chan *cstructs.TaskResourceUsage, error)`
|
|
|
|
The `TaskStats` function returns a channel which the task driver should send
|
|
stats to at the given interval. The driver must send stats at the given interval
|
|
until the given context is canceled or the task terminates.
|
|
|
|
### `TaskEvents(context.Context) (<-chan *TaskEvent, error)`
|
|
|
|
The Nomad client publishes events associated with an allocation. The
|
|
`TaskEvents` function allows the task driver to publish driver specific events
|
|
about tasks and the Nomad client associates them with the correct
|
|
allocation.
|
|
|
|
An `Eventer` utility, available in the
|
|
`github.com/hashicorp/nomad/drivers/shared/eventer` package, implements an event
|
|
loop and publishing mechanism for use in the `TaskEvents` function.
|
|
|
|
### `SignalTask(taskID string, signal string) error`
|
|
|
|
> Optional - can be skipped by embedding `drivers.DriverSignalTaskNotSupported`
|
|
|
|
The `SignalTask` function is used by drivers that support sending OS signals
|
|
(`SIGHUP`, `SIGKILL`, `SIGUSR1` etc.) to the task. It is an optional function
|
|
and is listed as a capability in the task driver `Capabilities` struct.
|
|
|
|
### `ExecTask(taskID string, cmd []string, timeout time.Duration) (*ExecTaskResult, error)`
|
|
|
|
> Optional - can be skipped by embedding `drivers.DriverExecTaskNotSupported`
|
|
|
|
The `ExecTask` function is used by the Nomad client to execute commands inside
|
|
the task execution context. For example, the Docker driver executes commands
|
|
inside the running container. `ExecTask` is called for Consul script checks.
|
|
|
|
@include 'plugins/hcl-specifications.mdx'
|
|
|
|
[exec2 driver]: https://github.com/hashicorp/nomad-driver-exec2
|
|
[base-plugin]: https://github.com/hashicorp/nomad/blob/main/plugins/base/base.go#L17
|
|
[driverplugin]: https://github.com/hashicorp/nomad/blob/main/plugins/drivers/driver.go#L51
|
|
[skeletonproject]: https://github.com/hashicorp/nomad-skeleton-driver-plugin
|
|
[taskconfig]: https://godoc.org/github.com/hashicorp/nomad/plugins/drivers#TaskConfig
|
|
[taskhandle]: https://godoc.org/github.com/hashicorp/nomad/plugins/drivers#TaskHandle
|
|
[fifopackage]: https://godoc.org/github.com/hashicorp/nomad/client/lib/fifo
|
|
[landlock]: https://docs.kernel.org/userspace-api/landlock.html
|
|
[unveil]: https://man.openbsd.org/unveil
|
|
[users]: /nomad/docs/configuration/client#users-block
|