mirror of
https://github.com/kemko/nomad.git
synced 2026-01-06 18:35:44 +03:00
* Move commands from docs to its own root-level directory * temporarily use modified dev-portal branch with nomad ia changes * explicitly clone nomad ia exp branch * retrigger build, fixed dev-portal broken build * architecture, concepts and get started individual pages * fix get started section destinations * reference section * update repo comment in website-build.sh to show branch * docs nav file update capitalization * update capitalization to force deploy * remove nomad-vs-kubernetes dir; move content to what is nomad pg * job section * Nomad operations category, deploy section * operations category, govern section * operations - manage * operations/scale; concepts scheduling fix * networking * monitor * secure section * remote auth-methods folder and move up pages to sso; linkcheck * Fix install2deploy redirects * fix architecture redirects * Job section: Add missing section index pages * Add section index pages so breadcrumbs build correctly * concepts/index fix front matter indentation * move task driver plugin config to new deploy section * Finish adding full URL to tutorials links in nav * change SSO to Authentication in nav and file system * Docs NomadIA: Move tutorials into NomadIA branch (#26132) * Move governance and policy from tutorials to docs * Move tutorials content to job-declare section * run jobs section * stateful workloads * advanced job scheduling * deploy section * manage section * monitor section * secure/acl and secure/authorization * fix example that contains an unseal key in real format * remove images from sso-vault * secure/traffic * secure/workload-identities * vault-acl change unseal key and root token in command output sample * remove lines from sample output * fix front matter * move nomad pack tutorials to tools * search/replace /nomad/tutorials links * update acl overview with content from deleted architecture/acl * fix spelling mistake * linkcheck - fix broken links * fix link to Nomad variables tutorial * fix link to Prometheus tutorial * move who uses Nomad to use cases page; move spec/config shortcuts add dividers * Move Consul out of Integrations; move namespaces to govern * move integrations/vault to secure/vault; delete integrations * move ref arch to docs; rename Deploy Nomad back to Install Nomad * address feedback * linkcheck fixes * Fixed raw_exec redirect * add info from /nomad/tutorials/manage-jobs/jobs * update page content with newer tutorial * link updates for architecture sub-folders * Add redirects for removed section index pages. Fix links. * fix broken links from linkcheck * Revert to use dev-portal main branch instead of nomadIA branch * build workaround: add intro-nav-data.json with single entry * fix content-check error * add intro directory to get around Vercel build error * workound for emtpry directory * remove mdx from /intro/ to fix content-check and git snafu * Add intro index.mdx so Vercel build should work --------- Co-authored-by: Tu Nguyen <im2nguyen@gmail.com>
480 lines
16 KiB
Plaintext
480 lines
16 KiB
Plaintext
---
|
|
layout: docs
|
|
page_title: Enable TLS encryption
|
|
description: |-
|
|
Create mutual TLS (mTLS) certificates and configure Nomad to encrypt API and
|
|
RPC traffic.
|
|
---
|
|
|
|
# Enable TLS encryption
|
|
|
|
Securing Nomad's cluster communication is not only important for security but
|
|
can even ease operations by preventing mistakes and misconfigurations. Nomad
|
|
optionally uses mutual [TLS][tls] (mTLS) for all HTTP and RPC communication.
|
|
Nomad's use of mTLS provides the following properties:
|
|
|
|
- Prevent unauthorized Nomad access
|
|
- Prevent observing or tampering with Nomad communication
|
|
- Prevent client/server role or region misconfigurations
|
|
- Prevent other services from masquerading as Nomad agents
|
|
|
|
Preventing region misconfigurations is a property of Nomad's mTLS not commonly
|
|
found in the TLS implementations on the public Internet. While most uses of
|
|
TLS verify the identity of the server you are connecting to based on a domain
|
|
name such as `example.com`, Nomad verifies the node you are connecting to is in
|
|
the expected region and configured for the expected role (e.g.
|
|
`client.us-west.nomad`). This also prevents other services who may have access
|
|
to certificates signed by the same private CA from masquerading as Nomad
|
|
agents. If certificates were identified based on hostname/IP then any other
|
|
service on a host could masquerade as a Nomad agent.
|
|
|
|
Correctly configuring TLS can be a complex process, especially given the wide
|
|
range of deployment methodologies. If you use the sample
|
|
[Vagrantfile][vagrantfile] from the [Nomad GitHub repository][nomad-repo] - or
|
|
have Nomad installed - this guide will provide you with a production ready
|
|
TLS configuration.
|
|
|
|
~> Note that while Nomad's TLS configuration will be production ready, key
|
|
management and rotation is a complex subject not covered by this guide.
|
|
[Vault][vault] is the suggested solution for key generation and management.
|
|
|
|
## Creating certificates
|
|
|
|
The first step to configuring TLS for Nomad is generating certificates. In
|
|
order to prevent unauthorized cluster access, Nomad requires all certificates
|
|
be signed by the same Certificate Authority (CA). This should be a _private_ CA
|
|
and not a public one like [Let's Encrypt][letsencrypt] as any certificate
|
|
signed by this CA will be allowed to communicate with the cluster.
|
|
|
|
~> Nomad certificates may be signed by intermediate CAs as long as the root CA
|
|
is the same. Append all intermediate CAs to the `cert_file`.
|
|
|
|
### Certificate authority
|
|
|
|
There are a variety of tools for managing your own CA, [like the PKI secret
|
|
backend in Vault][vault-pki], but for the sake of simplicity this guide will
|
|
use the Nomad [`tls ca create`][] command.
|
|
|
|
Generate the CA's private key and certificate.
|
|
|
|
```shell-session
|
|
$ nomad tls ca create
|
|
```
|
|
|
|
The CA key (`nomad-agent-ca-key.pem`) will be used to sign certificates for Nomad
|
|
agents and must be kept private. The CA certificate (`nomad-agent-ca.pem`) contains
|
|
the public key necessary to validate Nomad certificates and therefore must be
|
|
distributed to every node that requires access.
|
|
|
|
### Agent certificates
|
|
|
|
Once you have a CA certificate and key you can generate and sign the
|
|
certificates Nomad will use directly. TLS certificates commonly use the
|
|
fully-qualified domain name of the system being identified as the certificate's
|
|
Common Name (CN). However, hosts (and therefore hostnames and IPs) are often
|
|
ephemeral in Nomad clusters. Not only would signing a new certificate per
|
|
Nomad node be difficult, but using a hostname provides no security or
|
|
functional benefits to Nomad. To fulfill the desired security properties
|
|
(above) Nomad certificates are signed with their region and role such as:
|
|
|
|
- `client.global.nomad` for a client node in the `global` region
|
|
- `server.us-west.nomad` for a server node in the `us-west` region
|
|
|
|
Generate a certificate for the Nomad server.
|
|
|
|
```shell-session
|
|
$ nomad tls cert create -server -region global
|
|
```
|
|
|
|
Generate a certificate for the Nomad client.
|
|
|
|
```shell-session
|
|
$ nomad tls cert create -client
|
|
```
|
|
|
|
Generate a certificate for the CLI.
|
|
|
|
```shell-session
|
|
$ nomad tls cert create -cli
|
|
```
|
|
|
|
Using `localhost` and `127.0.0.1` as subject alternate names (SANs) allows
|
|
tools like `curl` to be able to communicate with Nomad's HTTP API when run on
|
|
the same host. Other SANs may be added including a DNS resolvable hostname to
|
|
allow remote HTTP requests from third party tools.
|
|
|
|
You should now have the following files:
|
|
|
|
- `nomad-agent-ca-key.pem` - CA private key. **Keep safe.**
|
|
|
|
- `nomad-agent-ca.pem` - CA public certificate.
|
|
|
|
- `global-cli-nomad-key.pem` - Nomad CLI private key for the `global` region.
|
|
|
|
- `global-cli-nomad.pem` - Nomad CLI certificate for the `global` region.
|
|
|
|
- `global-client-nomad-key.pem` - Nomad client node private key for the
|
|
`global` region.
|
|
|
|
- `global-client-nomad.pem` - Nomad client node public certificate for the
|
|
`global` region.
|
|
|
|
- `global-server-nomad-key.pem` - Nomad server node private key for the
|
|
`global` region.
|
|
|
|
- `global-server-nomad.pem` - Nomad server node public certificate for the
|
|
`global` region.
|
|
|
|
Each Nomad node should have the appropriate key (`-key.pem`) and certificate
|
|
(`.pem`) file for its region and role. In addition each node needs the CA's
|
|
public certificate (`nomad-agent-ca.pem`).
|
|
|
|
## Configuring Nomad
|
|
|
|
Next Nomad must be configured to use the newly-created key and certificates for
|
|
mTLS. Starting with the [server configuration from the Getting Started
|
|
guide][guide-server] add the following TLS configuration options:
|
|
|
|
```hcl
|
|
# Increase log verbosity
|
|
log_level = "DEBUG"
|
|
|
|
# Setup data dir
|
|
data_dir = "/tmp/server1"
|
|
|
|
# Enable the server
|
|
server {
|
|
enabled = true
|
|
|
|
# Self-elect, should be 3 or 5 for production
|
|
bootstrap_expect = 1
|
|
}
|
|
|
|
# Require TLS
|
|
tls {
|
|
http = true
|
|
rpc = true
|
|
|
|
ca_file = "nomad-agent-ca.pem"
|
|
cert_file = "global-server-nomad.pem"
|
|
key_file = "global-server-nomad-key.pem"
|
|
|
|
verify_server_hostname = true
|
|
verify_https_client = true
|
|
}
|
|
```
|
|
|
|
The new [`tls`][tls_block] section is worth breaking down in more detail:
|
|
|
|
```hcl
|
|
tls {
|
|
http = true
|
|
rpc = true
|
|
# ...
|
|
}
|
|
```
|
|
|
|
This enables TLS for the HTTP and RPC protocols. Unlike web servers, Nomad
|
|
doesn't use separate ports for TLS and non-TLS traffic: your cluster should
|
|
either use TLS or not.
|
|
|
|
```hcl
|
|
tls {
|
|
# ...
|
|
|
|
ca_file = "nomad-agent-ca.pem"
|
|
cert_file = "global-server-nomad.pem"
|
|
key_file = "global-server-nomad-key.pem"
|
|
|
|
# ...
|
|
}
|
|
```
|
|
|
|
The file lines should point to wherever you placed the certificate files on
|
|
the node. This guide assumes they are in Nomad's current directory.
|
|
|
|
```hcl
|
|
tls {
|
|
# ...
|
|
|
|
verify_server_hostname = true
|
|
verify_https_client = true
|
|
}
|
|
```
|
|
|
|
These two settings are important for ensuring all of Nomad's mTLS security
|
|
properties are met. If [`verify_server_hostname`][verify_server_hostname] is
|
|
set to `false` the node's certificate will be checked to ensure it is signed by
|
|
the same CA, but its role and region will not be verified. This means any
|
|
service with a certificate signed by same CA as Nomad can act as a client or
|
|
server of any region.
|
|
|
|
[`verify_https_client`][verify_https_client] requires HTTP API clients to
|
|
present a certificate signed by the same CA as Nomad's certificate. It may be
|
|
disabled to allow HTTP API clients (e.g. Nomad CLI, Consul, or curl) to
|
|
communicate with the HTTPS API without presenting a client-side certificate. If
|
|
`verify_https_client` is enabled only HTTP API clients presenting a certificate
|
|
signed by the same CA as Nomad's certificate are allowed to access Nomad.
|
|
|
|
~> Enabling `verify_https_client` effectively protects Nomad from unauthorized
|
|
network access at the cost of losing Consul HTTPS health checks for agents.
|
|
|
|
### Client configuration
|
|
|
|
The Nomad client configuration is similar to the server configuration. The
|
|
biggest difference is in the certificate and key used for configuration.
|
|
|
|
```hcl
|
|
# Increase log verbosity
|
|
log_level = "DEBUG"
|
|
|
|
# Setup data dir
|
|
data_dir = "/tmp/client1"
|
|
|
|
# Enable the client
|
|
client {
|
|
enabled = true
|
|
|
|
# For demo assume you are talking to server1. For production,
|
|
# this should be like "nomad.service.consul:4647" and a system
|
|
# like Consul used for service discovery.
|
|
server_join {
|
|
retry_join = ["127.0.0.1:4647"]
|
|
}
|
|
}
|
|
|
|
# Modify our port to avoid a collision with server1
|
|
ports {
|
|
http = 5656
|
|
}
|
|
|
|
# Require TLS
|
|
tls {
|
|
http = true
|
|
rpc = true
|
|
|
|
ca_file = "nomad-agent-ca.pem"
|
|
cert_file = "global-client-nomad.pem"
|
|
key_file = "global-client-nomad-key.pem"
|
|
|
|
verify_server_hostname = true
|
|
verify_https_client = true
|
|
}
|
|
```
|
|
|
|
### Running with TLS
|
|
|
|
Now that you have certificates generated and configuration for a client and
|
|
server you can test our TLS-enabled cluster!
|
|
|
|
In separate terminals start a server and client agent:
|
|
|
|
In one terminal start a server process.
|
|
|
|
```shell-session
|
|
$ nomad agent -config server1.hcl
|
|
```
|
|
|
|
And in another terminal, start a client.
|
|
|
|
```shell-session
|
|
$ nomad agent -config client1.hcl
|
|
```
|
|
|
|
If you run `nomad node status` now, you'll get an error, like:
|
|
|
|
```plaintext
|
|
Error querying node status: Unexpected response code: 400 (Client sent an HTTP request to an HTTPS server.)
|
|
```
|
|
|
|
This is because the Nomad CLI defaults to communicating via HTTP instead of
|
|
HTTPS. You can configure the local Nomad client to connect using TLS and specify
|
|
our custom keys and certificates using the command line:
|
|
|
|
```shell-session
|
|
$ nomad node status \
|
|
-ca-cert=nomad-agent-ca.pem \
|
|
-client-cert=global-cli-nomad.pem \
|
|
-client-key=global-cli-nomad-key.pem \
|
|
-address=https://127.0.0.1:4646
|
|
```
|
|
|
|
This process can be cumbersome to type each time, so the Nomad CLI also
|
|
searches environment variables for default values. Use the following commands to
|
|
set environment variables in your shell.
|
|
|
|
`NOMAD_ADDR` is the URL of the Nomad agent and sets the default for `-addr`.
|
|
|
|
```shell-session
|
|
$ export NOMAD_ADDR=https://localhost:4646
|
|
```
|
|
|
|
`NOMAD_CACERT` is the location of your CA certificate and sets the default for `-ca-cert`.
|
|
|
|
```shell-session
|
|
$ export NOMAD_CACERT=nomad-agent-ca.pem
|
|
```
|
|
|
|
`NOMAD_CLIENT_CERT` is the location of your CLI certificate and sets the default for `-client-cert`.
|
|
|
|
```shell-session
|
|
$ export NOMAD_CLIENT_CERT=global-cli-nomad.pem
|
|
```
|
|
|
|
`NOMAD_CLIENT_KEY` is the location of your CLI key and sets the default for `-client-key`.
|
|
|
|
```shell-session
|
|
$ export NOMAD_CLIENT_KEY=global-cli-nomad-key.pem
|
|
```
|
|
|
|
After these environment variables are correctly configured, the CLI will respond as expected.
|
|
Run `nomad node status`.
|
|
|
|
```shell-session
|
|
$ nomad node status
|
|
ID DC Name Class Drain Eligibility Status
|
|
237cd4c5 dc1 nomad <none> false eligible ready
|
|
```
|
|
|
|
Or, generate and run a sample job.
|
|
|
|
```shell-session
|
|
$ nomad job init
|
|
Example job file written to example.nomad.hcl
|
|
```
|
|
|
|
```shell-session
|
|
$ nomad job run example.nomad.hcl
|
|
==> Monitoring evaluation "e9970e1d"
|
|
Evaluation triggered by job "example"
|
|
Allocation "a1f6c3e7" created: node "237cd4c5", group "cache"
|
|
Evaluation within deployment: "080460ce"
|
|
Evaluation status changed: "pending" -> "complete"
|
|
==> Evaluation "e9970e1d" finished with status "complete"
|
|
```
|
|
|
|
## Switching an existing cluster to TLS
|
|
|
|
Since Nomad does _not_ use different ports for TLS and non-TLS communication,
|
|
the use of TLS must be consistent across the cluster. Switching an existing
|
|
cluster to use TLS everywhere is operationally similar to upgrading between
|
|
versions of Nomad, but requires additional steps to preventing needlessly
|
|
rescheduling allocations.
|
|
|
|
1. Add the appropriate key and certificates to all nodes. Ensure the private
|
|
key file is only readable by the Nomad user.
|
|
1. Add the environment variables to all nodes where the CLI is used.
|
|
1. Add the appropriate [`tls`][tls_block] block to the configuration file on
|
|
all nodes.
|
|
1. Generate a gossip key and add it the Nomad server configuration.
|
|
|
|
~> Once a quorum of servers are TLS-enabled, clients will no longer be able to
|
|
communicate with the servers until their client configuration is updated and
|
|
reloaded.
|
|
|
|
At this point a rolling restart of the cluster will enable TLS everywhere.
|
|
However, once servers are restarted clients will be unable to heartbeat. This
|
|
means any client unable to restart with TLS enabled before their heartbeat TTL
|
|
expires will have their allocations marked as `lost` and rescheduled.
|
|
|
|
While the default heartbeat settings may be sufficient for concurrently
|
|
restarting a small number of nodes without any allocations being marked as
|
|
`lost`, most operators should raise the [`heartbeat_grace`][heartbeat_grace]
|
|
configuration setting before restarting their servers:
|
|
|
|
1. Set `heartbeat_grace = "1h"` or an appropriate duration on servers.
|
|
1. Restart servers, one at a time.
|
|
1. Restart clients, one or more at a time.
|
|
1. Set [`heartbeat_grace`][heartbeat_grace] back to its previous value (or
|
|
remove to accept the default).
|
|
1. Restart servers, one at a time.
|
|
|
|
~> In a future release Nomad will allow upgrading a cluster to use TLS by
|
|
allowing servers to accept TLS and non-TLS connections from clients during
|
|
the migration.
|
|
|
|
Jobs running in the cluster will _not_ be affected and will continue running
|
|
throughout the switch as long as all clients can restart within their heartbeat
|
|
TTL.
|
|
|
|
## Changing Nomad certificates on the fly
|
|
|
|
As of 0.7.1, Nomad supports dynamic certificate reloading via SIGHUP.
|
|
|
|
Given a prior TLS configuration as follows:
|
|
|
|
```hcl
|
|
tls {
|
|
http = true
|
|
rpc = true
|
|
|
|
ca_file = "nomad-ca.pem"
|
|
cert_file = "server.pem"
|
|
key_file = "server-key.pem"
|
|
|
|
verify_server_hostname = true
|
|
verify_https_client = true
|
|
}
|
|
```
|
|
|
|
Nomad's cert_file and key_file can be reloaded via SIGHUP by updating the TLS
|
|
stanza to:
|
|
|
|
```hcl
|
|
tls {
|
|
http = true
|
|
rpc = true
|
|
|
|
ca_file = "nomad-ca.pem"
|
|
cert_file = "new_server.pem"
|
|
key_file = "new_server_key.pem"
|
|
|
|
verify_server_hostname = true
|
|
verify_https_client = true
|
|
}
|
|
```
|
|
|
|
## Migrating a cluster to TLS
|
|
|
|
### Reloading TLS configuration via SIGHUP
|
|
|
|
Nomad supports dynamically reloading both client and server TLS configuration.
|
|
To reload an agent's TLS configuration, first update the TLS block in the
|
|
agent's configuration file and then send the Nomad agent a SIGHUP signal.
|
|
Note that this will only reload a subset of the configuration file,
|
|
including the TLS configuration.
|
|
|
|
The agent reloads all its network connections when there are changes to its TLS
|
|
configuration during a config reload via SIGHUP. Any new connections
|
|
established will use the updated configuration, and any outstanding old
|
|
connections will be closed. This process works when upgrading to TLS,
|
|
downgrading from it, as well as rolling certificates.
|
|
|
|
### RPC upgrade mode for Nomad servers
|
|
|
|
When migrating to TLS, the [`rpc_upgrade_mode`][rpc_upgrade_mode] option
|
|
(defaults to `false`) in the TLS configuration for a Nomad server can be set
|
|
to true. When set to true, servers will accept both TLS and non-TLS
|
|
connections. By accepting non-TLS connections, operators can upgrade clients
|
|
to TLS without the clients being marked as lost because the server is
|
|
rejecting the client connection due to the connection not being over TLS.
|
|
However, it is important to note that `rpc_upgrade_mode` should be used as a
|
|
temporary solution in the process of migration, and this option should be
|
|
re-set to false (meaning that the server will strictly accept only TLS
|
|
connections) once the entire cluster has been migrated.
|
|
|
|
[guide-install]: /nomad/tutorials/get-started/get-started-install
|
|
[guide-server]: https://raw.githubusercontent.com/hashicorp/nomad/master/demo/vagrant/server.hcl
|
|
[heartbeat_grace]: /nomad/docs/configuration/server#heartbeat_grace
|
|
[letsencrypt]: https://letsencrypt.org/
|
|
[nomad-repo]: https://github.com/hashicorp/nomad
|
|
[rpc_upgrade_mode]: /nomad/docs/configuration/tls#rpc_upgrade_mode
|
|
[tls]: https://en.wikipedia.org/wiki/Transport_Layer_Security
|
|
[tls_block]: /nomad/docs/configuration/tls
|
|
[vagrantfile]: https://raw.githubusercontent.com/hashicorp/nomad/master/demo/vagrant/Vagrantfile
|
|
[vault-pki]: /vault/docs/secrets/pki
|
|
[vault]: /vault/
|
|
[verify_https_client]: /nomad/docs/configuration/tls#verify_https_client
|
|
[verify_server_hostname]: /nomad/docs/configuration/tls#verify_server_hostname
|
|
[`tls ca create`]: /nomad/commands/tls/ca-create
|