nomad/website/source/docs/internals/high-availability.html.md
2015-09-17 16:33:37 -07:00


layout: docs
page_title: High Availability
sidebar_current: docs-internals-ha
description: Learn about the high availability design of Nomad.

High Availability

Nomad is primarily used in production environments to manage secrets. As a result, any downtime of the Nomad service can affect downstream clients. Nomad is designed to support a highly available deployment so that a machine or process failure is minimally disruptive.

~> Advanced Topic! This page covers technical details of Nomad. You don't need to understand these details to effectively use Nomad. The details are documented here for those who wish to learn about them without having to go spelunking through the source code. However, if you're an operator of Nomad, we recommend learning about the architecture due to the importance of Nomad in an environment.

Design Overview

The primary design goal in making Nomad highly available (HA) was to minimize downtime, not to provide horizontal scalability. Nomad is typically bound by the I/O limits of the storage backend rather than by compute requirements. This simplifies the HA approach and avoids the need for more complex coordination.

Certain storage backends, such as Consul, provide additional coordination functions that enable Nomad to run in an HA configuration. When supported by the backend, Nomad will automatically run in HA mode without additional configuration.
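
As a rough illustration of "HA mode turns on when the backend supports it", the sketch below checks whether a backend exposes a coordination primitive. All names here (`LockingBackend`, `InMemoryBackend`, `CoordinatingBackend`, `try_lock`, `ha_enabled`) are hypothetical and chosen for this example; they are not Nomad's actual interfaces.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class LockingBackend(Protocol):
    """Hypothetical interface: a backend that can coordinate servers via a lock."""

    def try_lock(self, key: str, owner: str) -> bool: ...


class InMemoryBackend:
    """Plain storage only: no coordination primitive, so HA is unavailable."""

    def __init__(self):
        self.data = {}


class CoordinatingBackend(InMemoryBackend):
    """Storage plus a simple lock, standing in for a backend such as Consul."""

    def __init__(self):
        super().__init__()
        self.locks = {}

    def try_lock(self, key: str, owner: str) -> bool:
        # Grant the lock if it is free or already held by this owner.
        holder = self.locks.setdefault(key, owner)
        return holder == owner


def ha_enabled(backend: object) -> bool:
    """HA mode is enabled only when the backend offers coordination."""
    return isinstance(backend, LockingBackend)
```

With this shape, no separate configuration flag is needed: the capability of the backend alone decides whether servers run in HA mode.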

When running in HA mode, Nomad servers have two additional states they can be in: standby and active. For multiple Nomad servers sharing a storage backend, only a single instance will be active at any time while all other instances are hot standbys.

The active server operates in a standard fashion and processes all requests. The standby servers do not process requests and instead redirect them to the active Nomad server. Meanwhile, if the active server is sealed, fails, or loses network connectivity, then one of the standbys will take over and become the active instance.
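
The active/standby behavior described above can be sketched with a single shared lock: whoever holds it is active, and everyone else redirects. This is a minimal in-process model under stated assumptions, not the real implementation; `LeaderLock`, `Server`, `handle`, and `fail` are hypothetical names, and a real backend lock would also need expiry to detect a lost network connection.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeaderLock:
    """Hypothetical stand-in for a lock held in the shared storage backend."""

    holder: Optional[str] = None

    def acquire(self, name: str) -> bool:
        # Only one server can hold the lock at a time.
        if self.holder is None:
            self.holder = name
        return self.holder == name

    def release(self, name: str) -> None:
        if self.holder == name:
            self.holder = None


@dataclass
class Server:
    name: str
    lock: LeaderLock

    @property
    def active(self) -> bool:
        return self.lock.holder == self.name

    def handle(self, request: str) -> str:
        if self.active:
            return f"{self.name} processed {request}"
        if self.lock.holder is None:
            return "no active server"
        # Standbys never process requests; they point at the active server.
        return f"redirect to {self.lock.holder}"

    def fail(self) -> None:
        # Model a crash or seal by giving up the lock.
        self.lock.release(self.name)
```

A short usage example, showing a standby taking over after the active server fails:

```python
lock = LeaderLock()
a, b = Server("a", lock), Server("b", lock)
lock.acquire("a")
print(a.handle("read"))  # a processed read
print(b.handle("read"))  # redirect to a
a.fail()
lock.acquire("b")
print(b.handle("read"))  # b processed read
```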

It is important to note that only unsealed servers act as a standby. If a server is still in the sealed state, then it cannot act as a standby as it would be unable to serve any requests should the active server fail.
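
The eligibility rule can be written down as a small state function: a sealed server is neither active nor standby, so it is never a failover candidate. The names `HAState`, `ha_state`, and `failover_candidates` are assumptions made for this sketch.

```python
from enum import Enum


class HAState(Enum):
    SEALED = "sealed"
    STANDBY = "standby"
    ACTIVE = "active"


def ha_state(sealed: bool, holds_lock: bool) -> HAState:
    """Derive a server's HA state; a sealed server can be neither role."""
    if sealed:
        return HAState.SEALED
    return HAState.ACTIVE if holds_lock else HAState.STANDBY


def failover_candidates(states: dict) -> list:
    """Return the names of servers that could take over, in sorted order."""
    return sorted(name for name, state in states.items() if state is HAState.STANDBY)
```

For example, given one active, one standby, and one sealed server, only the standby is returned as a candidate.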