When the Nomad client restarts and restores allocations, the network namespace
for an allocation may exist but no longer be correctly configured. For example,
if the host is rebooted and the task was a Docker task using a pause container,
the network namespace may be recreated by the docker daemon.
When we restore an allocation, use the CNI "check" command to verify that any
existing network namespace matches the expected configuration. This requires CNI
plugins of at least version 1.2.0 to avoid a bug in older plugin versions that
would cause the check to fail.
If the check fails, destroy the network namespace and try to recreate it from
scratch once. If that fails in the second pass, fail the restore so that the
allocation can be recreated (rather than silently having networking fail).
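
The flow above can be sketched roughly as follows. This is a minimal illustration, not Nomad's actual code: the `netns` interface, its method names, and the `fakeNetns` test double are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// netns is a hypothetical stand-in for the component that manages an
// allocation's network namespace. The method names are assumptions.
type netns interface {
	Check() error   // e.g. run the CNI "check" command
	Destroy() error // tear down the existing namespace
	Create() error  // build the namespace from scratch
}

// restoreNetwork keeps a namespace that passes the check. Otherwise it
// destroys and recreates the namespace once; if that second pass fails,
// the error is returned so the caller can fail the restore.
func restoreNetwork(n netns) error {
	checkErr := n.Check()
	if checkErr == nil {
		return nil
	}
	if err := n.Destroy(); err != nil {
		return fmt.Errorf("destroy after failed check (%v): %w", checkErr, err)
	}
	if err := n.Create(); err != nil {
		return fmt.Errorf("recreate after failed check (%v): %w", checkErr, err)
	}
	return nil
}

// fakeNetns is a test double that counts calls.
type fakeNetns struct {
	checkErr, createErr error
	destroyed, created  int
}

func (f *fakeNetns) Check() error   { return f.checkErr }
func (f *fakeNetns) Destroy() error { f.destroyed++; return nil }
func (f *fakeNetns) Create() error  { f.created++; return f.createErr }

var errStale = errors.New("namespace does not match expected configuration")

func main() {
	stale := &fakeNetns{checkErr: errStale}
	fmt.Println(restoreNetwork(stale), stale.destroyed, stale.created)
}
```

There is deliberately no retry loop here: a single recreate attempt keeps the restore path bounded, and a returned error lets the allocation be replaced.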
This should fix the gap left by #24650 for Docker task drivers and any other
drivers with the `MustInitiateNetwork` capability.
Fixes: https://github.com/hashicorp/nomad/issues/24292
Ref: https://github.com/hashicorp/nomad/pull/24650
This PR adds Consul Template's executeTemplate function to the denylist by
default, in order to prevent accidental or malicious infinitely recursive
execution.
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
A more comprehensive env.denylist that now includes more token, token file and
license variables.
---------
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
The core scheduler relies on a special table in the state store, the TimeTable,
to figure out which objects can be GC'd. The TimeTable correlates Raft indices
with object insertion times, a solution we used before most of the objects we
store in the state contained timestamps. This introduced some memory overhead
and complexity, but most importantly meant that any GC threshold users set
greater than timeTableLimit = 72 * time.Hour was ignored. This PR removes the
TimeTable and relies on object timestamps to determine whether objects can be
GC'd.
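
A timestamp-based eligibility check might look like the sketch below. The `object` type, its `ModifyTime` field, and `eligibleForGC` are hypothetical names used only for illustration; the real state store objects differ.

```go
package main

import (
	"fmt"
	"time"
)

const day = 24 * time.Hour

// object is a hypothetical stored object that carries its own timestamp.
type object struct {
	ID         string
	ModifyTime time.Time
}

// eligibleForGC reports whether obj was last modified before
// now minus threshold. No Raft index lookup is involved, so the
// threshold is not capped by any table retention window.
func eligibleForGC(obj object, now time.Time, threshold time.Duration) bool {
	cutoff := now.Add(-threshold)
	return obj.ModifyTime.Before(cutoff)
}

// exampleNow is a fixed reference time so the example is deterministic.
var exampleNow = time.Date(2024, time.October, 1, 0, 0, 0, 0, time.UTC)

func main() {
	threshold := 7 * day // longer than the old 72h limit
	for _, obj := range []object{
		{ID: "old", ModifyTime: exampleNow.Add(-10 * day)},
		{ID: "recent", ModifyTime: exampleNow.Add(-5 * day)},
	} {
		fmt.Println(obj.ID, eligibleForGC(obj, exampleNow, threshold))
	}
}
```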
Nomad v1.9.0 (finally!) removes support for HCL1 and the `-hcl1` flag.
See #23912 for details.
One of the uses of HCL1 over HCL2 was that HCL1 allowed quoted keys in
blocks such as env, meta, and Docker's labels:
```hcl
some_block {
  "foo.bar" = "baz"
}
```
This works in HCL1 but is invalid HCL2. In HCL2 you must use a map
instead of a block:
```hcl
some_map = {
  "eggs.spam" = "works!"
}
```
This was such a hassle for users that we special-cased the `env` and `meta`
blocks to be accepted as blocks or maps in #9936.
However Docker `labels`, being a task config option, is much harder to
special case and commonly needs dots-in-keys for things like DataDog
autodiscovery via Docker container labels:
https://docs.datadoghq.com/containers/docker/integrations/?tab=labels
Luckily `labels` can be specified as a list-of-maps instead:
```hcl
labels = [
  {
    "com.datadoghq.ad.check_names"  = "[\"openmetrics\"]"
    "com.datadoghq.ad.init_configs" = "[{}]"
  }
]
```
So instead of adding more awkward hcl1/2 backward compat code to Nomad,
I just updated the docs to hopefully help people hit by this.
The only other known workaround is dropping HCL in favor of JSON
jobspecs altogether, but that forces a huge migration and maintenance
burden on users:
https://discuss.hashicorp.com/t/docker-based-autodiscovery-with-datadog-how-can-we-make-it-work/18870
As of Nomad 1.6.0, Nomad client agents send their secret with all the
RPCs (other than registration). But for backwards compatibility we had to keep
a legacy auth method that didn't require the node secret. We've previously
announced that this legacy auth method would be removed and that nodes older
than 1.6.0 would not be supported with Nomad 1.9.0.
This changeset removes the legacy auth method.
Ref: https://developer.hashicorp.com/nomad/docs/release-notes/nomad/upcoming#nomad-1-9-0
Add a section to the docs describing planned upcoming deprecations and
removals. Also added some missing upgrade guide sections missed during the last
release.
When a root key is rotated, the servers immediately start signing Workload
Identities with the new active key. But workloads may be using those WI tokens
to sign into external services, which may not have had time to fetch the new
public key and which might try to fetch new keys as needed.
Add support for prepublishing keys. Prepublished keys will be visible in the
JWKS endpoint but will not be used for signing or encryption until their
`PublishTime`. Update the periodic key rotation to prepublish keys at half the
`root_key_rotation_threshold` window, and promote prepublished keys to active
after the `PublishTime`.
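
As a rough sketch of the timing rules above: the `rootKey` struct and both helper functions are simplified, hypothetical stand-ins, and setting the successor's `PublishTime` to the end of the rotation window is an assumption made for the example.

```go
package main

import (
	"fmt"
	"time"
)

const day = 24 * time.Hour

// rootKey is a simplified, hypothetical view of a root key.
type rootKey struct {
	ID          string
	CreateTime  time.Time
	PublishTime time.Time // zero value: the key was never prepublished
}

// canSign reports whether the key may be used for signing at time now.
// A prepublished key is visible for verification but cannot sign until
// its PublishTime has been reached.
func canSign(k rootKey, now time.Time) bool {
	return !now.Before(k.PublishTime)
}

// shouldPrepublish reports whether the active key has lived for at least
// half of the rotation threshold, the point at which a successor key is
// prepublished.
func shouldPrepublish(active rootKey, now time.Time, threshold time.Duration) bool {
	return !now.Before(active.CreateTime.Add(threshold / 2))
}

// keyEpoch is a fixed reference time so the example is deterministic.
var keyEpoch = time.Date(2024, time.January, 1, 0, 0, 0, 0, time.UTC)

func main() {
	threshold := 30 * day // 720h
	active := rootKey{ID: "active", CreateTime: keyEpoch}
	next := rootKey{
		ID:          "next",
		CreateTime:  keyEpoch.Add(15 * day),
		PublishTime: keyEpoch.Add(threshold), // assumption for illustration
	}
	now := keyEpoch.Add(20 * day)
	fmt.Println(shouldPrepublish(active, now, threshold), canSign(next, now))
}
```

Between the prepublish point and `PublishTime`, external services have time to fetch the new public key before any token is signed with it.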
This changeset also fixes two bugs in periodic root key rotation and garbage
collection, both of which can't be safely fixed without implementing
prepublishing:
* Periodic root key rotation would never happen because the default
`root_key_rotation_threshold` of 720h exceeds the 72h maximum window of the FSM
time table. We now compare the `CreateTime` against the wall clock time instead
of the time table. (We expect to remove the time table in future work, ref
https://github.com/hashicorp/nomad/issues/16359)
* Root key garbage collection could GC keys that were used to sign
identities. We now wait until `root_key_rotation_threshold` +
`root_key_gc_threshold` before GC'ing a key.
* When rekeying a root key, the core job did not mark the key as inactive after
the rekey was complete.
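
The wall-clock comparisons described in the first two bullets can be sketched as below. The function names are hypothetical, and the one-hour GC threshold in `main` is just an example value, not a statement about the default.

```go
package main

import (
	"fmt"
	"time"
)

// rotationDue compares the key's CreateTime against the wall clock rather
// than a time table, so thresholds longer than 72h work.
func rotationDue(createTime, now time.Time, rotationThreshold time.Duration) bool {
	return !now.Before(createTime.Add(rotationThreshold))
}

// gcEligible waits for rotationThreshold + gcThreshold after creation
// before a key may be garbage collected.
func gcEligible(createTime, now time.Time, rotationThreshold, gcThreshold time.Duration) bool {
	return !now.Before(createTime.Add(rotationThreshold + gcThreshold))
}

// gcEpoch is a fixed reference time so the example is deterministic.
var gcEpoch = time.Date(2024, time.January, 1, 0, 0, 0, 0, time.UTC)

const hour = time.Hour

func main() {
	rotation := 720 * hour
	gc := 1 * hour // example value only
	now := gcEpoch.Add(720 * hour)
	fmt.Println(rotationDue(gcEpoch, now, rotation), gcEligible(gcEpoch, now, rotation, gc))
}
```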
Ref: https://hashicorp.atlassian.net/browse/NET-10398
Ref: https://hashicorp.atlassian.net/browse/NET-10280
Fixes: https://github.com/hashicorp/nomad/issues/19669
Fixes: https://github.com/hashicorp/nomad/issues/23528
Fixes: https://github.com/hashicorp/nomad/issues/19368
This enables checks for the ContainerAdmin user on Docker images on Windows.
It's only checked if users run Docker with process isolation and not Hyper-V,
because Hyper-V provides its own, proper sandboxing.
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
This PR adds a job mutator which injects constraints on the job taskgroups
that make use of bridge networking. Creating a bridge network makes use of the
CNI plugins: bridge, firewall, host-local, loopback, and portmap. Starting
with Nomad 1.5 these plugins are fingerprinted on each node, and as such we
can ensure jobs are correctly scheduled only on nodes where they are available,
when needed.
Nomad has always placed an extremely high priority on backward
compatibility. We have always aimed to support N-2 major releases and
have usually gone above and beyond that.
The new https://www.hashicorp.com/long-term-support policy also mentions
that N-2 is what we have always supported, so it's probably time for our
docs to reflect that reality.
Nomad loads all plugins from `plugin_dir` regardless of whether they are
listed in the agent configuration file. This can cause unexpected binaries
to be executed.
This commit begins the deprecation process of this behaviour. The Nomad
agent will emit a warning log for every plugin binary found without a
corresponding agent configuration block.
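
A minimal sketch of the check, assuming the set of binary names and the set of configured plugin names have already been collected. The function name and the log line format are hypothetical, not Nomad's actual output.

```go
package main

import (
	"fmt"
	"sort"
)

// unconfiguredPlugins returns, in sorted order, the binaries found in the
// plugin directory that have no matching entry in the agent configuration.
func unconfiguredPlugins(found []string, configured map[string]bool) []string {
	var missing []string
	for _, name := range found {
		if !configured[name] {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	found := []string{"nomad-driver-podman", "mystery-binary"}
	configured := map[string]bool{"nomad-driver-podman": true}
	for _, name := range unconfiguredPlugins(found, configured) {
		// Hypothetical log format, for illustration only.
		fmt.Printf("[WARN] plugin %q found without a configuration block\n", name)
	}
}
```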
---------
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
* Update distinct_host feasibility checking to honor the job's namespace. Fixes #9792
* Added test to verify original condition and that fix resolved it.
* Added documentation
The Consul and Vault integrations work shipping in Nomad 1.7 will deprecate the
existing token-based workflows. These will be removed in Nomad 1.9, so add a
note describing this to the upgrade guide.
An ACL policy with an unlabeled block generates unexpected results.
For example, a policy such as this:
```
namespace {
  policy = "read"
}
```
is applied to a namespace called `policy` instead of the documented
behaviour of applying it to the `default` namespace.
This happens because of the way HCL1 decodes blocks. Since it doesn't
know if a block is expected to have a label it applies the `key` tag to
the content of the block and, in the example above, the first key is
`policy`, so it sets that as the `namespace` block label.
Since this happens internally in the HCL decoder it's not possible to
detect the problem externally.
Fixing the problem inside the decoder is challenging because the JSON
and HCL parsers generate different ASTs, which makes it impossible to
differentiate a JSON tree from an invalid HCL tree within the decoder.
The fix in this commit consists of manually parsing the policy after
decoding to clear labels that were not set in the file. This allows the
validation rules to consistently catch and return any errors, whether
the policy is invalid HCL or invalid JSON.
The 32-bit Intel builds (aka "386") are not tested and likely have bugs
involving platform-sized integers when operated at any non-trivial scale. Remove
these builds from the upcoming Nomad 1.6.0 and provide recommendations in the
upgrade notes for those users who might have hobbyist boards running 32-bit
ARM (this will primarily be the Raspberry Pi Zero or older models of the
Raspberry Pi).
DO NOT BACKPORT TO 1.5.x OR EARLIER!
The `nomad tls cert` command did not create certificates with the correct SANs
for them to work with non-default domain and region names. This changeset
updates the code to support non-default domains and regions in the certificates.
Nomad 1.5.4 shipped with a logmon bug that we rolled out a fix for in Nomad
1.5.5. Unfortunately we can't yank the release but we should leave a note in the
upgrade guide telling users to avoid it.
Adds a new configuration to clients to optionally allow them to drain their
workloads on shutdown. The client sends the `Node.UpdateDrain` RPC targeting
itself and then monitors the drain state as seen by the server until the drain
is complete or the deadline expires. If it loses connection with the server, it
will monitor local client status instead to ensure allocations are stopped
before exiting.
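
The monitoring loop could be sketched as follows. The `drainStatus` type and its two probe functions are hypothetical stand-ins for the server RPC and the local client state; this is an illustration under those assumptions rather than the actual implementation.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

const tick = time.Millisecond

var errNoServer = errors.New("no connection to server")

// drainStatus holds two hypothetical probes; the names are assumptions.
type drainStatus struct {
	server func() (bool, error) // drain state as seen by the server
	local  func() bool          // true once all local allocations have stopped
}

// complete prefers the server's view and falls back to local status when
// the server cannot be reached.
func (d drainStatus) complete() bool {
	done, err := d.server()
	if err != nil {
		return d.local()
	}
	return done
}

// waitForDrain polls until the drain completes or the deadline expires.
func waitForDrain(deadline, interval time.Duration, d drainStatus) error {
	ctx, cancel := context.WithTimeout(context.Background(), deadline)
	defer cancel()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if d.complete() {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

func main() {
	polls := 0
	d := drainStatus{
		server: func() (bool, error) { polls++; return polls >= 3, nil },
		local:  func() bool { return false },
	}
	fmt.Println(waitForDrain(500*tick, tick, d), polls)
}
```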
The job evaluate endpoint creates a new evaluation for the job which is
a write operation. This change modifies the necessary capability from
`read-job` to `submit-job` to better reflect this.
* client: disable running artifact downloader as nobody
This PR reverts a change from Nomad 1.5 where artifact downloads were
executed as the nobody user on Linux systems. This was done as an attempt
to improve the security model of artifact downloading where third party
tools such as git or mercurial would be run as the root user with all
the security implications thereof.
However, doing so conflicts with Nomad's own advice for securing the
Client data directory, which when set up with the recommended directory
permissions structure prevents artifact downloads from working as intended.
Artifact downloads are still executed as a child process of the Nomad
agent, and on modern Linux systems make use of the kernel Landlock
feature for limiting filesystem access of the child process.
* docs: update upgrade guide for 1.5.1 sandboxing
* docs: add cl
* docs: add title to upgrade guide fix
The panic bug for upgrades with older servers that shipped in 1.4.0 was fixed in
1.4.1, which makes the versions described in the warning in the upgrade guide
misleading. Clarify the upgrade guide.
* artifact: protect against unbounded artifact decompression
Starting with 1.5.0, set default values for artifact decompression limits.
artifact.decompression_size_limit (default "100GB") - the maximum amount of
data that will be decompressed before triggering an error and cancelling
the operation
artifact.decompression_file_count_limit (default 4096) - the maximum number
of files that will be decompressed before triggering an error and
cancelling the operation.
* artifact: assert limits cannot be nil in validation
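
To illustrate how the two decompression limits above interact, here is a minimal sketch of a running budget that is charged once per extracted file. The `budget` type and error values are hypothetical; the real enforcement happens inside the artifact downloader.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errSizeLimit = errors.New("decompressed size limit exceeded")
	errFileLimit = errors.New("decompressed file count limit exceeded")
)

// budget tracks how much has been decompressed so far against
// hypothetical byte and file-count limits.
type budget struct {
	maxBytes int64
	maxFiles int
	bytes    int64
	files    int
}

// addFile accounts for one extracted file of the given size and returns
// an error as soon as either limit is exceeded, so the caller can cancel
// the operation.
func (b *budget) addFile(size int64) error {
	b.files++
	if b.files > b.maxFiles {
		return errFileLimit
	}
	b.bytes += size
	if b.bytes > b.maxBytes {
		return errSizeLimit
	}
	return nil
}

func main() {
	// Small limits for demonstration; they are not the defaults.
	b := &budget{maxBytes: 100, maxFiles: 2}
	fmt.Println(b.addFile(60))
	fmt.Println(b.addFile(50))
}
```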