The agent syslog write handler was unable to handle JSON log
lines correctly, meaning all syslog entries showed as NOTICE
level when the JSON log format was in use.
This change adds a new handler to the Nomad agent which can
parse JSON log lines and correctly determine the log level of
each entry.
The change also removes the use of a filter from the default log
format handler. This is not needed as the logs are fed into the
syslog handler via hclog, which is responsible for level
filtering.
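
As a rough sketch of the idea (the `priorityForJSONLine` helper and the
level-to-priority mapping below are illustrative assumptions, not the actual
handler; the `@level` key follows hclog's JSON output):

```go
package main

import (
	"encoding/json"
	"fmt"
	"log/syslog"
)

// levelToPriority maps an hclog level name to a syslog priority. The
// mapping here is illustrative rather than Nomad's actual implementation.
func levelToPriority(level string) syslog.Priority {
	switch level {
	case "trace", "debug":
		return syslog.LOG_DEBUG
	case "info":
		return syslog.LOG_INFO
	case "warn":
		return syslog.LOG_WARNING
	case "error":
		return syslog.LOG_ERR
	default:
		return syslog.LOG_NOTICE
	}
}

// priorityForJSONLine inspects a JSON-formatted log line and returns the
// syslog priority to use when forwarding it.
func priorityForJSONLine(line []byte) syslog.Priority {
	var entry struct {
		Level string `json:"@level"`
	}
	if err := json.Unmarshal(line, &entry); err != nil {
		// not valid JSON: fall back to the previous behaviour
		return syslog.LOG_NOTICE
	}
	return levelToPriority(entry.Level)
}

func main() {
	line := []byte(`{"@level":"error","@message":"agent failed to start"}`)
	fmt.Println(priorityForJSONLine(line) == syslog.LOG_ERR) // true
}
```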
We can reduce the volume specification boilerplate many users will need by
setting a default capability on a dynamic host volume when none is set. The
default capability will allow using the volume in read/write mode on its node,
with no further restrictions except those that might be set in the jobspec.
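
A minimal sketch of the defaulting, with illustrative type and field names
rather than the real structs:

```go
package main

import "fmt"

// HostVolumeCapability is a stand-in for the real capability type; the
// field names and mode strings here are assumptions for illustration.
type HostVolumeCapability struct {
	AttachmentMode string // e.g. "file-system"
	AccessMode     string // e.g. "single-node-writer"
}

type HostVolume struct {
	Name                  string
	RequestedCapabilities []*HostVolumeCapability
}

// applyDefaultCapability fills in a read/write, single-node capability
// when the volume spec doesn't provide one.
func applyDefaultCapability(v *HostVolume) {
	if len(v.RequestedCapabilities) == 0 {
		v.RequestedCapabilities = []*HostVolumeCapability{{
			AttachmentMode: "file-system",
			AccessMode:     "single-node-writer",
		}}
	}
}

func main() {
	v := &HostVolume{Name: "scratch"}
	applyDefaultCapability(v)
	fmt.Println(v.RequestedCapabilities[0].AccessMode) // single-node-writer
}
```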
Enterprise governance checks happen after dynamic host volumes are placed, so if
node pool governance is active and you don't set a node pool or node ID for a
volume, it's possible to get a placement that fails node pool governance even
though there might be other nodes in the cluster that would be valid placements.
Move the node pool governance for host volumes into the placement path, so that
we check a specific node pool when a node pool or node ID is set, but otherwise
filter candidate nodes by node pool.
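
A sketch of the placement-path filtering described above; `Node` and
`filterByNodePool` are stand-ins, not the real scheduler code:

```go
package main

import "fmt"

// Node is a stand-in for the server's node struct; only the fields
// needed for this sketch are included.
type Node struct {
	ID       string
	NodePool string
}

// filterByNodePool keeps only candidate nodes whose pool is allowed by
// governance. When the volume pins a specific node pool (or a node ID
// that implies one), the allowed set contains just that pool.
func filterByNodePool(candidates []*Node, allowed map[string]bool) []*Node {
	out := []*Node{}
	for _, n := range candidates {
		if allowed[n.NodePool] {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	nodes := []*Node{
		{ID: "a", NodePool: "default"},
		{ID: "b", NodePool: "prod"},
	}
	fmt.Println(len(filterByNodePool(nodes, map[string]bool{"prod": true}))) // 1
}
```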
This changeset is the CE version of ENT/2200.
Ref: https://hashicorp.atlassian.net/browse/NET-11549
Ref: https://github.com/hashicorp/nomad-enterprise/pull/2200
Update dynamic host volume validation and update logic to allow for changes to
the node pool and plugin ID. If the client's node pool changes we'll sync up the
correct node pool for the volumes already placed on that client. We'll also
allow the plugin ID to be changed so that new versions of a plugin can support
the same volume over time.
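
A hedged sketch of the update validation; the struct fields and the exact rule
set (rejecting node ID changes while permitting node pool and plugin ID
changes) are assumptions for illustration:

```go
package main

import (
	"errors"
	"fmt"
)

// HostVolume is a minimal stand-in for the server's volume struct.
type HostVolume struct {
	ID       string
	NodeID   string
	NodePool string
	PluginID string
}

// validateUpdate sketches the rules described above: plugin ID and node
// pool may change across updates, while moving a placed volume to a
// different node is rejected.
func validateUpdate(existing, updated *HostVolume) error {
	if existing == nil {
		return nil // new volume, nothing to compare against
	}
	if updated.NodeID != "" && updated.NodeID != existing.NodeID {
		return errors.New("cannot change node ID of an existing volume")
	}
	// node pool and plugin ID changes are allowed
	return nil
}

func main() {
	existing := &HostVolume{ID: "vol1", NodeID: "n1", NodePool: "default", PluginID: "mkdir-v1"}
	updated := &HostVolume{ID: "vol1", NodeID: "n1", NodePool: "prod", PluginID: "mkdir-v2"}
	fmt.Println(validateUpdate(existing, updated)) // <nil>
}
```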
The nightly runs for E2E have been failing the recently added dynamic host
volumes tests for a number of reasons:
* Adding timing logs to the tests shows that it can take over 5s (the original
test timeout) for the client fingerprint to show up on the client. This seems
like a lot, but it appears to be host-dependent because it's much faster locally.
Extend the timeout and leave in the timing logs so that we can keep an eye on
this problem in the future.
* The register test doesn't wait for the dispatched job to complete, and the
dispatched job was actually broken when TLS was in use because we weren't using
the Task API socket. Fix the jobspec for the dispatched job and add waiting
for the dispatched allocation to be marked complete before checking for the
volume on the server.
I've also changed both the mounter jobs to batch workloads, so that we don't
have to wait 10s for the deployment to complete.
In #24694 we did a major refactoring of the E2E Terraform configuration. After
deploying a cluster this morning, I noticed a few moved/removed files were not
reflected in the .gitignore files. This changeset updates the .gitignore to have
no unstaged files after applying.
When using the register workflow, `capacity_max` is ignored and so is likely
unset. If the volume is later updated, the check we had for valid updates
assumes that the value was previously set. Only perform this check if the value
is set.
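
A small sketch of the fix, with illustrative parameter names:

```go
package main

import "fmt"

// validateCapacityUpdate sketches the change: skip the comparison when the
// existing volume never set capacity_max (the register workflow leaves it
// at zero).
func validateCapacityUpdate(existingCapacityMaxBytes, updatedCapacityBytes int64) error {
	if existingCapacityMaxBytes == 0 {
		return nil // capacity_max was never set, so there's nothing to check
	}
	if updatedCapacityBytes > existingCapacityMaxBytes {
		return fmt.Errorf("updated capacity %d exceeds capacity_max %d",
			updatedCapacityBytes, existingCapacityMaxBytes)
	}
	return nil
}

func main() {
	fmt.Println(validateCapacityUpdate(0, 1<<30))     // <nil>: no max was set
	fmt.Println(validateCapacityUpdate(1<<20, 1<<30)) // error: exceeds max
}
```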
We changed the list of access modes available for dynamic host volumes in #24705
but neglected to change them in the API package. Update the API package to
match.
Ref: https://github.com/hashicorp/nomad/pull/24705
* func: move infra provisioning to a module and remove providers
* func: update paths
* func: update more paths
* func: update path inside bootstrap script
* style: remove debug prints on bootstrap scripts
* Delete e2e/terraform/csi/input/volume-efs.hcl
* fix: update keys path to use module path instead of root
* fix: add missing headers
* fix: update keys directory inside provision-nomad
* style: format hcl files
* Update compute.tf
* Update e2e/terraform/main.tf
Co-authored-by: Tim Gross <tgross@hashicorp.com>
* Update e2e/terraform/provision-infra/compute.tf
Co-authored-by: Tim Gross <tgross@hashicorp.com>
* fix: update more paths
* fix: fmt hcl files
* func: final paths revision for running e2e locally
* fix: make path of certs relative to module for the bootstrap
* func: final paths revision for running e2e locally
* Update network.tf
* fix: fix typo and add success message
* fix: remove the test name from token to avoid long names and use name for vol to avoid collisions
* func: unify the uploads folder
* func: make the uploads file one per cluster
* func: Add outputs with all data necessary to connect to the cluster
* fix: make nomad token a sensitive output
* Update bootstrap-nomad.sh
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
In anticipation of having quotas for dynamic host volumes, we want the user
experience of the storage limits to feel integrated with the other resource
limits. This is currently prevented by reusing the `Resources` type instead of
having a specific type for `QuotaResources`.
Update the quota limit/usage types to use a `QuotaResources` type that includes
a new storage resources quota block. The wire format for the two types is
compatible, so we can migrate the existing variables limit in the FSM.
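
As a sketch, the new quota shape might look roughly like this; the struct and
field names are assumptions, not the Enterprise implementation:

```go
package main

import "fmt"

// QuotaStorageResources and QuotaResources are sketches; the real
// Enterprise field names may differ.
type QuotaStorageResources struct {
	VariablesMB   int // existing variables limit, migrated into the new block
	HostVolumesMB int // new limit for dynamic host volume capacity
}

type QuotaResources struct {
	CPU      int
	MemoryMB int
	Storage  *QuotaStorageResources
}

func main() {
	limit := QuotaResources{
		CPU:      1000,
		MemoryMB: 2048,
		Storage:  &QuotaStorageResources{VariablesMB: 10, HostVolumesMB: 1000},
	}
	fmt.Printf("%+v %+v\n", limit, *limit.Storage)
}
```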
Also fixes improper parallelism in the quota init test, where we change the
working directory to avoid file write conflicts; this breaks when multiple
tests are executed in the same process.
Ref: https://github.com/hashicorp/nomad-enterprise/pull/2096
The output of `GetDynamicHostVolumes` is a slice, but that slice is constructed
by iterating over a map and isn't sorted. Sort the output in the test to
eliminate a test flake.
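
For illustration, the usual way to make such a comparison deterministic (the
stub type and ordering field are assumptions):

```go
package main

import (
	"fmt"
	"sort"
)

type hostVolumeStub struct{ ID string }

func main() {
	// Map iteration order is randomized in Go, so a slice built from a map
	// must be sorted before a test can compare it deterministically.
	got := []hostVolumeStub{{ID: "vol-b"}, {ID: "vol-a"}}
	sort.Slice(got, func(i, j int) bool { return got[i].ID < got[j].ID })
	fmt.Println(got[0].ID, got[1].ID) // vol-a vol-b
}
```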
Some comment cleanups as we're wrapping up dynamic host volumes work:
* We're not going to implement mount_options for host volumes, as dynamic
host volumes don't have the equivalent of the stage/publish phase that CSI
volumes do. Users who want that sort of thing can pass mount options in the
`parameter` field during volume create/register.
* The scheduler feasibility check prevents a dynamic host volume from being
claimed by a job in the wrong namespace, but the comment incorrectly identifies
that code path as only being about the race between fingerprint and delete.
Update the comment to make the intent clear so that we don't accidentally
remove this behavior in the future.
* Update who-uses-nomad.mdx
Our new contract with Roblox states that we can't mention anywhere on our sites that they use us.
* Update who-uses-nomad.mdx
Edited the sentence above the companies list to more accurately reflect them.
Also added Target to the list with a link to their case study.
Initial end-to-end tests for dynamic host volumes. This includes tests for two
workflows:
* One where a dynamic host volume is created by a plugin and then mounted by a job.
* Another where a dynamic host volume is created out-of-band and registered by a
job, then mounted by another job.
This changeset also moves the existing `volumes` E2E test package to the
better-named `volume_mounts`.
Ref: https://hashicorp.atlassian.net/browse/NET-11551
When we register a volume without a plugin, we need to send a client RPC so that
the node fingerprint can be updated. The registered volume also needs to be
written to client state so that we can restore the fingerprint after a restart.
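
A toy sketch of that flow, with hypothetical types standing in for the
client's real state store and fingerprint plumbing:

```go
package main

import "fmt"

// registeredVolume and clientState are toy stand-ins; the real client
// persists volumes in its state database alongside allocations.
type registeredVolume struct {
	ID       string
	Name     string
	HostPath string
}

type clientState struct {
	volumes map[string]registeredVolume
}

// register sketches the client RPC handler: persist the volume, then
// update the node fingerprint so the servers can see it.
func (s *clientState) register(v registeredVolume, fingerprint func(registeredVolume)) {
	s.volumes[v.ID] = v
	fingerprint(v)
}

// restore sketches a client restart: volumes are read back from state
// and re-added to the fingerprint without any plugin involvement.
func (s *clientState) restore(fingerprint func(registeredVolume)) {
	for _, v := range s.volumes {
		fingerprint(v)
	}
}

func main() {
	s := &clientState{volumes: map[string]registeredVolume{}}
	report := func(v registeredVolume) { fmt.Println("fingerprinted", v.Name) }
	s.register(registeredVolume{ID: "v1", Name: "data", HostPath: "/srv/data"}, report)
	s.restore(report) // fingerprinted again after a "restart"
}
```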
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
tl;dr - runtime code is fine but tests should match reality
The Nomad Client Agent is the only consumer of the
`Node.Derive{SI,Vault}Token` RPCs, therefore tests of the RPCs should
match Nomad Client behavior.
- DeriveVaultToken code: a9ee66a6ef/client/client.go (L2904-L2917)
- DeriveSIToken code: a9ee66a6ef/client/client.go (L2988-L2997)
Both of those client code paths include the Node SecretID in both the
request's SecretID field and the embedded `QueryOptions.AuthToken` field.
This patch updates server tests to match that behavior. The tests pass
either way.
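
A sketch of the request shape the tests now mirror; the structs here are
illustrative stand-ins for the real request and query-options types:

```go
package main

import "fmt"

// queryOptions and deriveTokenRequest are illustrative stand-ins for the
// real QueryOptions and Derive{Vault,SI}TokenRequest types.
type queryOptions struct {
	AuthToken string
}

type deriveTokenRequest struct {
	NodeID   string
	SecretID string
	AllocID  string
	queryOptions
}

func main() {
	nodeSecret := "node-secret-uuid"
	// mirror the client: the node secret goes in the request body and in
	// the embedded query options' AuthToken
	req := deriveTokenRequest{
		NodeID:       "node-1",
		SecretID:     nodeSecret,
		AllocID:      "alloc-1",
		queryOptions: queryOptions{AuthToken: nodeSecret},
	}
	fmt.Println(req.SecretID == req.AuthToken) // true
}
```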
When the Nomad client restarts and restores allocations, the network namespace
for an allocation may exist but no longer be correctly configured. For example,
if the host is rebooted and the task was a Docker task using a pause container,
the network namespace may be recreated by the docker daemon.
When we restore an allocation, use the CNI "check" command to verify that any
existing network namespace matches the expected configuration. This requires CNI
plugins of at least version 1.2.0 to avoid a bug in older plugin versions that
would cause the check to fail.
If the check fails, destroy the network namespace and try to recreate it from
scratch once. If that fails in the second pass, fail the restore so that the
allocation can be recreated (rather than silently having networking fail).
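
A sketch of that restore flow under an assumed `networkManager` interface
standing in for the real CNI wiring; the names and the final re-check are
illustrative, not the actual client code:

```go
package main

import (
	"errors"
	"fmt"
)

// networkManager is a hypothetical interface standing in for the CNI
// wiring; the real code runs the CNI plugins' check/del/add commands
// through the configured CNI library.
type networkManager interface {
	Check(allocID string) error
	Destroy(allocID string) error
	Create(allocID string) error
}

// restoreNetwork sketches the flow: verify the existing namespace, and if
// the check fails, tear it down and rebuild it exactly once before failing
// the restore.
func restoreNetwork(nm networkManager, allocID string) error {
	if err := nm.Check(allocID); err == nil {
		return nil // existing namespace still matches the expected config
	}
	if err := nm.Destroy(allocID); err != nil {
		return fmt.Errorf("failed to destroy stale network namespace: %w", err)
	}
	if err := nm.Create(allocID); err != nil {
		return fmt.Errorf("failed to recreate network namespace: %w", err)
	}
	if err := nm.Check(allocID); err != nil {
		// fail the restore so the allocation is recreated rather than
		// silently running with broken networking
		return fmt.Errorf("network namespace failed check after recreate: %w", err)
	}
	return nil
}

// fakeNM is a trivial in-memory implementation for the example below.
type fakeNM struct{ checkErr error }

func (f *fakeNM) Check(string) error   { return f.checkErr }
func (f *fakeNM) Destroy(string) error { return nil }
func (f *fakeNM) Create(string) error  { f.checkErr = nil; return nil }

func main() {
	nm := &fakeNM{checkErr: errors.New("netns missing expected bridge")}
	fmt.Println(restoreNetwork(nm, "alloc-1")) // <nil> after one rebuild
}
```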
This should fix the gap left by #24650 for Docker task drivers and any other
drivers with the `MustInitiateNetwork` capability.
Fixes: https://github.com/hashicorp/nomad/issues/24292
Ref: https://github.com/hashicorp/nomad/pull/24650
Adds an additional check in the Keyring.Delete RPC to make sure we're not
trying to delete a key that's been used to encrypt a variable. It also adds a
-force flag for the CLI/API to sidestep that check.
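
A sketch of the guard, with the variable iteration standing in for the real
state-store query and the types as illustrative stand-ins:

```go
package main

import "fmt"

// variableMeta is a stand-in for stored variable metadata; only the key ID
// used to encrypt it matters for this sketch.
type variableMeta struct {
	Path  string
	KeyID string
}

// checkKeyUnused sketches the new guard in the Keyring.Delete RPC: refuse
// to delete a key that still encrypts a variable, unless force was set.
func checkKeyUnused(keyID string, vars []variableMeta, force bool) error {
	if force {
		return nil
	}
	for _, v := range vars {
		if v.KeyID == keyID {
			return fmt.Errorf("key %q is in use by variable %q; use -force to delete anyway",
				keyID, v.Path)
		}
	}
	return nil
}

func main() {
	vars := []variableMeta{{Path: "nomad/jobs/web", KeyID: "key-1"}}
	fmt.Println(checkKeyUnused("key-1", vars, false)) // error: key in use
	fmt.Println(checkKeyUnused("key-1", vars, true))  // <nil>
}
```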
The recent change to collection via a "one-shot" Docker API call
did not update the stream boolean argument. This results in the
PreCPUStats values being zero and therefore breaking the CPU
calculations which rely on this data. The base fix is to update
the passed boolean parameter to match the desired non-streaming
behaviour. The non-streaming API call correctly returns the
PreCPUStats data which can be seen in the added unit test.
The most recent change also modified the behaviour of the
collectStats goroutine, so that any error encountered results
in the routine exiting. In the event this was a transient
error, the container will continue to run, but no stats will
be collected until the task is stopped and replaced. This PR
reverts that behaviour, so that an error encountered during a
stats collection run is logged and the collection process
continues, using a backoff between attempts.
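
A sketch of the restored collection loop; the function signature, backoff
values, and collection interval are illustrative, not the driver's actual
code:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// collectStats sketches the restored behaviour: a collection error is
// logged and retried with a backoff instead of terminating the goroutine.
// The collect function stands in for the one-shot (non-streaming) Docker
// stats API call.
func collectStats(ctx context.Context, collect func() error, logf func(string, ...any)) {
	backoff := time.Second
	const maxBackoff = 30 * time.Second

	for {
		select {
		case <-ctx.Done():
			return
		default:
		}

		if err := collect(); err != nil {
			logf("failed to collect stats, retrying: %v", err)
			time.Sleep(backoff)
			if backoff < maxBackoff {
				backoff *= 2
			}
			continue
		}
		backoff = time.Second // reset after a successful collection
		time.Sleep(time.Second)
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()

	calls := 0
	collect := func() error {
		calls++
		if calls == 1 {
			return errors.New("transient daemon error")
		}
		return nil
	}
	collectStats(ctx, collect, func(format string, args ...any) {
		fmt.Printf(format+"\n", args...)
	})
}
```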
a node can have only one volume with a given name.
the scheduler prevents duplicates, but can only
do so after the server knows about the volume.
this change prevents duplicates even when multiple
concurrent creates are issued faster than the
fingerprint/heartbeat interval.
users may still modify an existing volume, but only
if they set the `id` in the volume spec and
re-issue `nomad volume create`.
if a *static* vol is added to config with a name
already in use by a dynamic volume, the
dynamic volume takes precedence, but a warning is logged.
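
a rough sketch of the server-side name check, with stand-in types rather
than the real structs:

```go
package main

import "fmt"

// hostVolume is a minimal stand-in for the server's volume struct.
type hostVolume struct {
	ID     string
	Name   string
	NodeID string
}

// checkDuplicateName sketches the guard: a create for a volume name that
// already exists on the target node is rejected, unless the request
// carries the existing volume's id (i.e. it's an update).
func checkDuplicateName(req hostVolume, existingOnNode []hostVolume) error {
	for _, v := range existingOnNode {
		if v.Name == req.Name && v.ID != req.ID {
			return fmt.Errorf("volume %q already exists on node %s as %s",
				req.Name, v.NodeID, v.ID)
		}
	}
	return nil
}

func main() {
	existing := []hostVolume{{ID: "1111", Name: "data", NodeID: "node-1"}}
	fmt.Println(checkDuplicateName(hostVolume{Name: "data", NodeID: "node-1"}, existing))             // error
	fmt.Println(checkDuplicateName(hostVolume{ID: "1111", Name: "data", NodeID: "node-1"}, existing)) // <nil>
}
```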