---
layout: docs
page_title: Considerations for Stateful Workloads
description: |-
Learn about persistent storage options for stateful workloads on Nomad. Compare AWS, Azure, and Google Cloud Platform (GCP) storage services. Review the advantages and disadvantages of using Container Storage Interface (CSI) volumes, dynamic host volumes, static host volumes, and ephemeral disks.
---
# Considerations for Stateful Workloads
By default, Nomad's allocation storage is ephemeral. Nomad can discard it during
new deployments, when rescheduling jobs, or if it loses a client. This is
undesirable when running persistent workloads such as databases.
This document explores the options for persistent storage of workloads running
in Nomad. The information provided is for practitioners who are familiar with
Nomad and have a foundational understanding of storage.
## Considerations
Consider access patterns, performance, reliability and availability needs, and
maintenance to choose the most appropriate storage strategy.
Local storage is performant and readily available. If it has enough capacity,
it does not need much maintenance. However, it is not redundant: if a single
node, disk, or group of disks fails, data loss and service interruption will
occur.
Geographically distributed networked storage with multiple redundancies,
including disks, controllers, and network paths, provides higher availability
and resilience, and can tolerate multiple hardware failures before risking data
loss. However, the performance and reliability of networked storage depend on
the network. It can have higher latency and lower throughput than local
storage, and may require more maintenance.
Consider whether Nomad is running in the public cloud or on-premises, and what
storage options are available in that environment. From there, the optimal
choice depends on your organizational and application needs.
### Public cloud
Public cloud providers offer different storage services with various tradeoffs.
These usually consist of local disks, network-attached block devices, and
networked shared storage.
### AWS
| AWS service | Availability | Persistence | Performance | Suitability |
|----------------------------------------------------------------------------------------------|-------------------------------------------------------------------|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------|
| [Instance Storage](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html) | Locally on some instance types | Limited, not persistent across instance stops/terminations or hardware failures | High throughput and low latency | Temporary storage of information that changes frequently, such as buffers, caches, scratch data, and other temporary content |
| [Elastic Block Store](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AmazonEBS.html) | Zonal block devices attached to one or more instances | Persistent, with an independent lifecycle | [Configurable](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSPerformance.html), but higher latency than Instance Store | General purpose persistent storage |
| [Elastic File System](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AmazonEFS.html) | Regional/Multi-regional file storage that can be available to multiple instances | Persistent, with an independent lifecycle | [Configurable](https://docs.aws.amazon.com/efs/latest/ug/performance.html), but with less throughput and higher latency than Instance Store or EBS | File storage that needs to be available to multiple instances in multiple zones (even only as a failover) |
### Azure
| Azure service | Availability | Persistence | Performance | Suitability |
|---------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------|----------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------|
| [Ephemeral OS disks](https://learn.microsoft.com/en-us/azure/virtual-machines/ephemeral-os-disks) | Locally on some instance types | Limited, not persistent across instance stops/terminations or hardware failures | High throughput and low latency | Temporary storage of information that changes frequently, such as buffers, caches, scratch data, and other temporary content |
| [Managed Disks](https://docs.microsoft.com/en-us/azure/virtual-machines/disks-types) | [Zonal or regional](https://learn.microsoft.com/en-us/azure/virtual-machines/disks-redundancy) block devices attached to one or more VMs | Persistent, with an independent lifecycle | [Configurable](https://learn.microsoft.com/en-us/azure/virtual-machines/disks-types#disk-type-comparison) | General purpose persistent storage |
| [Azure Files](https://docs.microsoft.com/en-us/azure/storage/files/storage-files-introduction) | Zonal/Regional/Multi-regional file storage that can be available to multiple VMs | Persistent, with an independent lifecycle | [Configurable](https://learn.microsoft.com/en-us/azure/storage/files/storage-files-planning#storage-tiers) | File storage that needs to be available to multiple VMs in multiple zones (even only as a failover) |
### GCP
| GCP service | Availability | Persistence | Performance | Suitability |
|--------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------|-------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------|
| [Local SSD](https://cloud.google.com/compute/docs/disks/local-ssd) | Locally on some instance types | Limited, not persistent across instance stops/terminations or hardware failures | High throughput and low latency | Temporary storage of information that changes frequently, such as buffers, caches, scratch data, and other temporary content |
| [Persistent Disk](https://cloud.google.com/compute/docs/disks) | [Zonal or regional](https://cloud.google.com/compute/docs/disks#repds) block devices attached to one or more instances | Persistent, with an independent lifecycle | [Configurable](https://cloud.google.com/compute/docs/disks/performance) | General purpose persistent storage |
| [Filestore](https://cloud.google.com/filestore) | Zonal/Regional file storage that can be available to multiple instances | Persistent, with an independent lifecycle | [Configurable](https://cloud.google.com/filestore/docs/performance) | File storage that needs to be available to multiple VMs in multiple zones (even only as a failover) |
### Private cloud or on-premises
When running workloads on-premises in a self-managed private cloud, SAN/NAS
systems or software-defined storage such as Ceph usually provide non-local
storage. Compute instances can access the storage using a block protocol such
as iSCSI, FC, or NVMe-oF, a file protocol such as NFS or CIFS, or both.
Dedicated storage teams manage these systems in most organizations.
## Consuming persistent storage from Nomad
Since environments differ depending on application requirements, consider
performance, reliability, availability, and maintenance when choosing the most
appropriate storage driver.
### CSI
[Container Storage
Interface](https://github.com/container-storage-interface/spec) is a
vendor-neutral specification that allows storage providers to develop plugins
that orchestrators such as Nomad can use. Some CSI plugins can dynamically
provision and manage volume lifecycles, including snapshots, deletion, and
dynamic resizing. The exact feature set supported depends on the plugin and
the underlying storage platform.
Find a list of plugins and their feature set in the [Kubernetes CSI Developer
Documentation](https://kubernetes-csi.github.io/docs/drivers.html).
While Nomad follows the CSI specification, some plugins may implement
orchestrator-specific logic that makes them incompatible with Nomad. You should
validate that your chosen plugin works with Nomad before using it. Refer to the
plugin documentation from the storage provider for more information.
There are three CSI plugin subtypes:
- **Controller**: Communicates with the storage provider to manage the volume
lifecycle.
- **Node**: Runs on all Nomad clients and handles all local operations (for
example, mounting/unmounting volumes in allocations). The node must be
`privileged` to perform those operations.
- **Monolithic**: Combines both the above roles.
All plugin types can and should be run as Nomad jobs: `system` jobs for Node
and Monolithic plugins, and `service` jobs for Controller plugins. Refer to the
[Container Storage Interface (CSI) plugins page][csi-concepts] for more
information.
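
For illustration, the following is a minimal sketch of a Node plugin run as a
`system` job. The image, plugin `id`, and plugin-specific arguments are
placeholders; substitute the values documented by your CSI plugin vendor.

```hcl
job "csi-node-plugin" {
  # Node plugins run on every Nomad client, so use a system job.
  type = "system"

  group "nodes" {
    task "plugin" {
      driver = "docker"

      config {
        # Placeholder image; real plugins also require plugin-specific
        # arguments, such as the CSI endpoint, which are omitted here.
        image = "example.org/example-csi-driver:latest"

        # Node plugins must run privileged to mount volumes on the host.
        privileged = true
      }

      csi_plugin {
        id        = "example-csi"
        type      = "node"
        mount_dir = "/csi"
      }
    }
  }
}
```
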
CSI plugins are useful when storage requirements are quickly and constantly
evolving. For example, an environment that sees new workloads with persistent
storage added or removed frequently is well suited for CSI. However, they
present some challenges in terms of maintenance: most notably, they need to run
continuously, be configured (including authentication and connectivity to the
storage platform), and be updated to keep pace with new features and bug fixes
and to stay compatible with the underlying storage platform. They also
introduce additional moving parts, can be difficult to troubleshoot, and have a
complex security profile (because they need to run as `privileged` containers
in order to mount volumes).
The [Stateful Workloads with CSI
tutorial](/nomad/tutorials/stateful-workloads/stateful-workloads-csi-volumes)
and the [Nomad CSI demo
repository](https://github.com/hashicorp/nomad/tree/main/demo/csi) offer
guidance and examples on how to use CSI plugins with Nomad and include job files
for running the plugins and configuration files for creating and consuming
volumes.
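
As a rough sketch, a volume specification passed to `nomad volume create` for a
CSI volume might look like the following; the `plugin_id` must match the `id`
of a registered plugin, and the capacity and capability values are illustrative.

```hcl
# volume.hcl, registered with `nomad volume create volume.hcl`
id        = "database"
name      = "database"
type      = "csi"
plugin_id = "example-csi"

capacity_min = "10GiB"
capacity_max = "20GiB"

capability {
  access_mode     = "single-node-writer"
  attachment_mode = "file-system"
}
```
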
### Host volumes
Host volumes mount paths from the host (the Nomad client) into allocations.
Nomad is aware of host volume availability and uses it for job scheduling.
However, Nomad does not know about the volume's underlying characteristics,
such as whether it is a standard folder on a local ext4 filesystem, backed by
distributed networked storage such as GlusterFS, or a mounted NFS/CIFS volume
from a NAS or a public cloud service such as AWS EFS. Therefore, you can use
host volumes both for somewhat persistent local storage and for highly
persistent networked storage.
Host volumes may be dynamic or static. Provision dynamic host volumes
with the [`volume create`](/nomad/docs/commands/volume/create) command or
API. [ACL policies](/nomad/docs/other-specifications/acl-policy#namespace-rules)
allow delegation of control for storage within a namespace to Nomad
Operators. The dynamic host volume [plugin
specification](/nomad/docs/concepts/plugins/storage/host-volumes) allows you to
develop plugins specific to your local storage environment. For example, in an
on-prem cluster you could write a plugin to perform LVM thin-provisioning.
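
As an illustrative sketch, a dynamic host volume specification passed to
`nomad volume create` could look like the following. The `lvm-thin` plugin ID
is hypothetical and stands in for a plugin written against the plugin
specification.

```hcl
# dynamic-volume.hcl, created with `nomad volume create dynamic-volume.hcl`
name = "postgres-data"
type = "host"

# Hypothetical plugin that performs LVM thin-provisioning on the client.
plugin_id = "lvm-thin"

capacity_min = "20GiB"
capacity_max = "50GiB"

capability {
  access_mode     = "single-node-writer"
  attachment_mode = "file-system"
}
```
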
You declare static host volumes in the Nomad agent's configuration file, and
you must restart the Nomad client to reconfigure them. This makes static host
volumes impractical if you frequently change your storage
configuration. Furthermore, it might require coordination between different
[personas](/nomad/docs/concepts/security#personas) to configure and consume host
volumes. For example, a Nomad Administrator must modify Nomad's configuration
file to add, update, and remove host volumes to make them available for consumption by
Nomad Operators. Or, with networked host volumes, a Storage Administrator
needs to provision the volumes and make them available to the Nomad clients. A
System Administrator then mounts them on the Nomad clients.
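
For reference, the following is a minimal sketch of a static host volume
declared in the client's agent configuration; the volume name and path are
placeholders for a directory or mount point that already exists on the client.

```hcl
# Nomad client agent configuration
client {
  host_volume "shared-data" {
    # Placeholder path; for networked storage this is typically an
    # NFS/CIFS mount point managed outside of Nomad.
    path      = "/srv/nomad/shared-data"
    read_only = false
  }
}
```
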
Host volumes backed by local storage help persist data that is not critical or
can be readily restored. For example, an on-disk cache that can be rebuilt, or a
clustered application where a single node can rebuild its state from the rest
of the cluster. When backed by networked storage such as NFS/CIFS-mounted
volumes or distributed storage such as GlusterFS or Ceph, host volumes provide
a quick option to consume highly available and reliable storage.
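
Regardless of the backing storage, jobs consume host volumes the same way: a
`volume` block in the group and a `volume_mount` block in the task. A minimal
sketch, reusing the hypothetical `shared-data` volume from the configuration
example above:

```hcl
job "app" {
  group "app" {
    volume "shared-data" {
      type      = "host"
      source    = "shared-data"
      read_only = false
    }

    task "app" {
      driver = "docker"

      config {
        # Placeholder image.
        image = "example.org/app:latest"
      }

      volume_mount {
        volume      = "shared-data"
        destination = "/srv/data"
      }
    }
  }
}
```
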
Refer to the [Stateful workloads with Nomad host
volumes][csi-tutorial]
tutorial to learn more about using host volumes with Nomad.
#### NFS caveats
A few caveats with NFS-backed host volumes include ACLs, reliability, and
performance. NFS mount options should be the same on all mounting Nomad clients.
Depending on your NFS version, the UID/GID (user/group IDs) can differ between
the different Nomad clients, leading to issues when an allocation on another
host tries to access the volume. The only way to ensure this isn't an issue is
to use NFS v4 with ID mapping based on Kerberos or to have a reliable
configuration management/image-building process that ensures UID/GIDs
synchronize between hosts. You should use hard mounts to prevent data loss,
optionally with `intr` to enable the option to interrupt NFS requests, which
prevents the whole system from locking up in case of NFS server unavailability.
A significant factor in the performance of NFS-backed storage is the `wsize`
and `rsize` mount options, which determine the maximum size of each read and
write operation. Smaller sizes mean larger operations are split into smaller
chunks, which can significantly impact performance. The underlying storage
system's vendor provides the optimal sizes. For example, [AWS
EFS](https://docs.aws.amazon.com/efs/latest/ug/mounting-fs-mount-cmd-general.html)
recommends a value of `1048576` bytes for both `rsize` and `wsize`.
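
When NFS is consumed through a CSI plugin instead of a host mount, you can pass
the same flags in the volume specification's `mount_options` block. A sketch,
assuming a hypothetical NFS CSI plugin registered with the ID `nfs`:

```hcl
# Excerpt of a CSI volume specification for an NFS-backed volume.
type      = "csi"
plugin_id = "nfs" # hypothetical NFS CSI plugin ID

mount_options {
  fs_type     = "nfs"
  # Hard mount with the AWS EFS recommended read/write sizes.
  mount_flags = ["hard", "rsize=1048576", "wsize=1048576"]
}
```
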
To learn more about NFS mount options, visit Red Hat's [NFS
documentation](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/deployment_guide/s1-nfs-client-config-options).
### Ephemeral disks
Nomad [ephemeral disks](/nomad/docs/job-specification/ephemeral_disk), describe
the best-effort persistence of a Nomad allocation's folder. They support data
migrations between hosts (which require network connectivity between the Nomad
client nodes) and are size-aware for scheduling. Since persistence is the best
effort, however, you will lose data if the client or underlying storage fails.
Ephemeral disks are perfect for data that you can rebuild if needed, such as an
in-progress cache or a local copy of data.
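
A minimal sketch of an `ephemeral_disk` block in a task group; the size and
options shown are illustrative.

```hcl
group "cache" {
  ephemeral_disk {
    # Size in MB; the scheduler accounts for it during placement.
    size = 500

    # Best effort only: prefer rescheduling on the same client and
    # migrate the data if the allocation moves to another client.
    sticky  = true
    migrate = true
  }
}
```
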
## Storage comparison
With the information laid out in this document, use the following table to
choose the storage option that best addresses your Nomad storage requirements.
| Storage option | Advantages | Disadvantages | Ideal for |
|---|---|---|---|
| CSI volumes | <ul><li>Wide ecosystem with many providers</li><li>Advanced features such as snapshots, cloning, and resizing</li><li>Dynamic, flexible, and self-service (anyone with the correct ACL policies can create volumes on-demand)</li></ul> | <ul><li>Some complexity and ongoing maintenance</li><li>Plugin upgrades have to follow the underlying storage provider's API changes/upgrades</li><li>Not all CSI plugins implement all features</li><li>Not all CSI plugins respect the CSI spec and are Nomad compatible</li><li>Node plugins need to run in privileged mode to be able to mount the volumes in allocations</li></ul> | <ul><li>Environments where Nomad cluster operators and consumers need to easily add/change storage, and where the storage provider of choice has a CSI plugin that respects the CSI spec</li></ul> |
| Dynamic host volumes backed by local storage | <ul><li>Readily available</li><li>Fast due to being local</li><li>Doesn't require ongoing maintenance</li></ul> | <ul><li>Not fault tolerant. In case of hardware failure on a single instance, the data will be lost unless the application can restore the data</li></ul> | <ul><li>Environments with high performance and low latency persistent storage requirements where the application is designed to tolerate node failures.</li></ul> |
| Dynamic host volumes backed by networked or clustered storage | <ul><li>Readily available</li><li>Require no ongoing maintenance on the Nomad side (but might on the storage provider) </li></ul> | <ul><li>The underlying networked storage and its limitations are decoupled from the consumer, but need to be understood. For example, is concurrent access possible</li></ul> | <ul><li>Environments that have an existing storage provider that can be consumed via NFS/CIFS.</li></ul> |
| Static host volumes backed by local storage | <ul><li>Readily available</li><li>Fast due to being local</li><li>Doesn't require ongoing maintenance</li></ul> | <ul><li>Requires coordination between multiple personas to configure and consume (operators running the Nomad clients need to configure them statically in the Nomad client's configuration file)</li><li>Not fault tolerant. In case of hardware failure on a single instance, the data will be lost</li></ul> | <ul><li> Environments with low persistent storage requirements that could tolerate some failure but prefer not to or have high performance and low latency needs.</li></ul> |
| Static host volumes backed by networked or clustered storage | <ul><li>Readily available</li><li>Require no ongoing maintenance on the Nomad side (but might on the storage provider) </li></ul> | <ul><li>Require coordination between multiple personas to configure and consume (storage admins need to provision volumes, operators running the Nomad clients need to configure them statically in the Nomad client's configuration file)</li><li>The underlying networked storage and its limitations are decoupled from the consumer, but need to be understood. For example, is concurrent access possible</li></ul> | <ul><li>Environments with low amounts or low frequency of change of storage that have an existing storage provider that can be consumed via NFS/CIFS.</li></ul> |
| Ephemeral disks | <ul><li>Fast due to being local</li><li>Basic best effort persistence, including optional migration across Nomad clients</li></ul> | <ul><li> Not fault tolerant. In case of hardware failure on a single instance, the data will be lost </li></ul> | <ul><li>Environments that need temporary caches, somewhere to store files undergoing processing, etc. Everything which is ephemeral and can be easily rebuilt.</li></ul> |
## Additional resources
To learn more about Nomad and the topics covered in this document, visit the
following resources:
### Allocations
- Monitoring your allocations and their storage with [Nomad's event
stream](/nomad/tutorials/integrate-nomad/event-stream)
- [Best practices for cluster
setup](/well-architected-framework/nomad/production-reference-architecture-vm-with-consul)
### CSI
- [Nomad CSI plugin concepts][csi-concepts]
- [Nomad CSI
tutorial][csi-tutorial]
- [Nomad CSI examples](https://github.com/hashicorp/nomad/tree/main/demo/csi)
- [Ceph RBD CSI with Nomad](https://docs.ceph.com/en/latest/rbd/rbd-nomad/)
- [Democratic CSI with
Nomad](https://github.com/democratic-csi/democratic-csi/blob/master/docs/nomad.md)
- [JuiceFS CSI with Nomad](https://juicefs.com/docs/csi/csi-in-nomad/)
- [Hetzner CSI](https://github.com/hetznercloud/csi-driver/blob/main/docs/nomad/README.md)
- [NFS CSI Plugin](https://gitlab.com/rocketduck/csi-plugin-nfs)
### Dynamic Host Volumes
- [Dynamic host volume plugins](/nomad/docs/concepts/plugins/storage/host-volumes)
- [Dynamic host volume tutorial](/nomad/tutorials/stateful-workloads/stateful-workloads-dynamic-host-volumes)
[csi-concepts]: /nomad/docs/concepts/plugins/storage/csi
[csi-tutorial]: /nomad/tutorials/stateful-workloads/stateful-workloads-csi-volumes