When a client restarts but can't restore a volume (e.g., the plugin is now
missing), the volume is removed from the node fingerprint. That prevents future
scheduling of the volume, but we were not updating the volume's state field to
report the reason to operators. Make debugging easier and the state field more
meaningful by setting the value to "unavailable".
Also, remove the unused "deleted" field. We did not implement soft deletes and
aren't planning on it for Nomad 1.10.0.
Ref: https://hashicorp.atlassian.net/browse/NET-11551
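A minimal sketch of the intent, with type and constant names that are illustrative rather than Nomad's actual ones: when the plugin for a restored volume is gone, record that in the volume's state instead of leaving the previous value in place.

```go
// Sketch only: flag volumes whose plugin did not come back after a client
// restart, so operators can see why they are no longer schedulable.
package sketch

type HostVolume struct {
	ID    string
	State string
}

const VolumeStateUnavailable = "unavailable"

func markUnavailableVolumes(vols []*HostVolume, restored map[string]bool) {
	for _, vol := range vols {
		if !restored[vol.ID] {
			vol.State = VolumeStateUnavailable
		}
	}
}
```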
When we implemented CSI, the types of the fields for access mode and attachment
mode on volume requests were defined with a prefix "CSI". This gets confusing
now that we have dynamic host volumes using the same fields. Fortunately the
original was a typedef on string, and the Go API in the `api` package just uses
strings directly, so we can change the name of the type without breaking
backwards compatibility for the msgpack wire format.
Update the names to `VolumeAccessMode` and `VolumeAttachmentMode`. Keep the CSI
and DHV specific value constant names for these fields (they aren't currently
1:1), so that we can easily differentiate in a given bit of code which values
are valid.
Ref: https://github.com/hashicorp/nomad/pull/24881#discussion_r1920702890
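A rough sketch of the renamed types, assuming the structs package layout; the constants shown are illustrative examples rather than the full sets:

```go
// Sketch only: the real definitions live in Nomad's structs package.
package structs

// VolumeAccessMode was previously named CSIVolumeAccessMode. It is a typedef
// on string, so the rename does not change the msgpack wire format.
type VolumeAccessMode string

// VolumeAttachmentMode was previously named CSIVolumeAttachmentMode.
type VolumeAttachmentMode string

const (
	// CSI values keep their CSI-prefixed constant names...
	CSIVolumeAccessModeSingleNodeWriter VolumeAccessMode = "single-node-writer"

	// ...and dynamic host volumes keep their own constants, since the two
	// sets of valid values are not 1:1.
	HostVolumeAccessModeSingleNodeSingleWriter VolumeAccessMode = "single-node-single-writer"
)
```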
Nomad Enterprise will utilise new reporting metrics, and the changes here
enable that work.
The server-specific GetClientNodesCount function has been removed from CE as it
is only called within enterprise code. A new heartbeater function allows us to
get the number of active timers, which can be used by the heartbeater metrics
and any other callers that want this data.
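A hypothetical sketch of what such a helper could look like; the type and method names here are assumptions, not the actual Nomad Enterprise API:

```go
// Sketch only: illustrates exposing the count of active heartbeat timers.
package sketch

import (
	"sync"
	"time"
)

type heartbeater struct {
	mu     sync.Mutex
	timers map[string]*time.Timer // one entry per node with a live heartbeat
}

// ActiveTimers returns the number of heartbeat timers currently tracked,
// for use by heartbeat metrics and any other interested callers.
func (h *heartbeater) ActiveTimers() int {
	h.mu.Lock()
	defer h.mu.Unlock()
	return len(h.timers)
}
```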
When a volume is updated, we merge the new definition into the old. But the
volume's context comes from the plugin and is likely not present in the user's
volume specification. This means that if the user re-submits the volume
specification to make an adjustment, we wipe out the context field, which might
be required for subsequent operations by the CSI plugin. This was discovered to
be a problem with the Terraform provider and fixed there, but it's also a
problem for users of the `volume create` and `volume register` commands.
Update the merge so that we only overwrite the value of the context if it's been
explicitly set by the user. We still need to support user-driven updates to
context for the `volume register` workflow.
Ref: https://github.com/hashicorp/terraform-provider-nomad/pull/503
Fixes: https://github.com/democratic-csi/democratic-csi/issues/438
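A simplified sketch of the merge fix, with the types trimmed down to the relevant field; the real merge happens in Nomad's volume update path:

```go
// Sketch only: preserve the plugin-provided context on re-submitted specs.
package sketch

type CSIVolume struct {
	Context map[string]string
}

// mergeContext overwrites the stored context only when the user explicitly
// supplied one (for example via `volume register`); otherwise the context
// the plugin returned at creation time is preserved.
func mergeContext(existing, update *CSIVolume) {
	if len(update.Context) > 0 {
		existing.Context = update.Context
	}
}
```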
The `volume init` command creates example volume specifications, but one of the
example values for `capability.access_mode` is invalid. Correct the example to
match the validation logic.
The default node configuration in the client should always set an empty
HostVolumes map. Otherwise callers can panic, for example:
```
goroutine 179 [running]:
github.com/hashicorp/nomad/client/hostvolumemanager.UpdateVolumeMap({0x36042b0, 0xc000c62a80}, 0x0, {0xc000a802a0, 0xd}, 0xc000691940)
github.com/hashicorp/nomad/client/hostvolumemanager/volume_fingerprint.go:43 +0x1b2
github.com/hashicorp/nomad/client.(*Client).batchFirstFingerprints.func1({0xc000a802a0, 0xd}, 0xc000691940)
github.com/hashicorp/nomad/client/node_updater.go:54 +0xd7
github.com/hashicorp/nomad/client.(*batchNodeUpdates).batchHostVolumeUpdates(0xc000912608?, 0xc0009f2f88)
github.com/hashicorp/nomad/client/node_updater.go:417 +0x152
github.com/hashicorp/nomad/client.(*Client).batchFirstFingerprints(0xc000c2d188)
github.com/hashicorp/nomad/client/node_updater.go:53 +0x1c5
created by github.com/hashicorp/nomad/client.NewClient in goroutine 1
github.com/hashicorp/nomad/client/client.go:557 +0x2069
```
This is a panic in the host volume manager when restarting a client that
doesn't have any static host volumes, but does have a dynamic host volume.
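A minimal sketch of the defaulting fix, with an illustrative config type standing in for the client's node configuration:

```go
// Sketch only: never leave the HostVolumes map nil, so fingerprint code can
// range over it and assign into it without panicking.
package sketch

type ClientHostVolumeConfig struct {
	Name string
	Path string
}

type NodeConfig struct {
	HostVolumes map[string]*ClientHostVolumeConfig
}

func defaultNodeConfig() *NodeConfig {
	return &NodeConfig{
		// always non-nil, even when no static host volumes are configured,
		// so dynamic host volume updates have a map to write into
		HostVolumes: map[string]*ClientHostVolumeConfig{},
	}
}
```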
* func: make windows arch dependent
* func: unify keys and make them cluster grouped
* Update README.md
* Update e2e/terraform/provision-infra/provision-nomad/variables.tf
Co-authored-by: Tim Gross <tgross@hashicorp.com>
* Update .gitignore
* style: add an output with the cluster identifier
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Some dynamic host volumes are claimed by allocations with the capability we
borrowed from CSI called `single-node-single-writer`, which says only one
allocation can use the volume, and it can use it in read/write mode. We enforce
this in the scheduler, but if evaluations for different jobs were to be
processed concurrently by the scheduler, it's possible to get plans that would
fail to enforce this requirement. Add a check in the plan applier to ensure that
non-terminal allocations have exclusive access when requested.
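A hedged sketch of the plan-applier check with heavily simplified types; the real implementation works against the plan and state snapshot, so treat this only as an illustration of the idea:

```go
// Sketch only: reject plans that leave more than one live allocation on a
// volume that requires exclusive single-writer access.
package sketch

import "fmt"

type Alloc struct {
	ID       string
	Terminal bool
}

func checkExclusiveWriter(volID string, claims []*Alloc) error {
	live := 0
	for _, alloc := range claims {
		if !alloc.Terminal {
			live++
		}
	}
	if live > 1 {
		return fmt.Errorf("volume %q requires exclusive access but %d allocations would claim it", volID, live)
	}
	return nil
}
```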
Let only one create/register/delete operation run at a time per volume ID (see the sketch after this list):
* plugins can assume that Nomad will not run concurrent operations for the same volume
* we avoid interleaving client RPCs with raft writes
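An illustrative sketch of per-volume serialization (not the actual Nomad implementation), using a lock keyed by volume ID:

```go
// Package sketch illustrates serializing volume operations per volume ID.
package sketch

import "sync"

type volumeLocker struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

// lockFor lazily creates and returns the lock for a volume ID.
func (l *volumeLocker) lockFor(volID string) *sync.Mutex {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.locks == nil {
		l.locks = map[string]*sync.Mutex{}
	}
	if _, ok := l.locks[volID]; !ok {
		l.locks[volID] = &sync.Mutex{}
	}
	return l.locks[volID]
}

// withVolumeLock runs fn while holding the per-volume lock, so that plugin
// calls and raft writes for the same volume never interleave.
func (l *volumeLocker) withVolumeLock(volID string, fn func() error) error {
	lock := l.lockFor(volID)
	lock.Lock()
	defer lock.Unlock()
	return fn()
}
```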
The volume_mounts test is flaky due to slow starts from the exec-driver and some
incorrect wait code. Refactor the volume_mounts test to use the `e2e/v3` package
helpers, and use these to give it enough time to start the exec tasks.
The Nomad agent used a log filter to ensure logs were written at the expected
level. Since the move to hclog this is no longer required, as hclog acts as
the gatekeeper and filter for logging. All log writers accept messages from
hclog, which has already done the filtering.
The agent's syslog write handler was unable to handle JSON log lines
correctly, meaning all syslog entries showed at NOTICE level when using the
JSON log format.
This change adds a new handler to the Nomad agent which can parse JSON log
lines and correctly determine the log level of each entry.
The change also removes the use of a filter from the default log
format handler. This is not needed as the logs are fed into the
syslog handler via hclog, which is responsible for level
filtering.
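A sketch of the JSON-aware handling, assuming hclog's JSON output carries the level in an "@level" field; the mapping to syslog priorities shown here is illustrative:

```go
// Sketch only: parse the level out of a JSON log line before handing the
// entry to syslog, falling back to NOTICE when parsing fails.
package sketch

import (
	"encoding/json"
	"log/syslog"
)

type jsonLogLine struct {
	Level string `json:"@level"`
}

func priorityFor(line []byte) syslog.Priority {
	var entry jsonLogLine
	if err := json.Unmarshal(line, &entry); err != nil {
		return syslog.LOG_NOTICE
	}
	switch entry.Level {
	case "trace", "debug":
		return syslog.LOG_DEBUG
	case "info":
		return syslog.LOG_INFO
	case "warn":
		return syslog.LOG_WARNING
	case "error":
		return syslog.LOG_ERR
	default:
		return syslog.LOG_NOTICE
	}
}
```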
We can reduce the amount of volume specification configuration many users will
need by setting a default capability on a dynamic host volume if none is
set. The default capability will allow using the volume in read/write mode on
its node, with no further restrictions except those that might be set in the
jobspec.
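A sketch of the defaulting behaviour; the field names and the exact default values are assumptions standing in for the real ones:

```go
// Sketch only: give a dynamic host volume a default capability when the
// user's specification doesn't set one.
package sketch

type HostVolumeCapability struct {
	AttachmentMode string
	AccessMode     string
}

type HostVolume struct {
	RequestedCapabilities []*HostVolumeCapability
}

func defaultCapabilities(vol *HostVolume) {
	if len(vol.RequestedCapabilities) == 0 {
		vol.RequestedCapabilities = []*HostVolumeCapability{{
			// assumed values: read/write access on the volume's own node
			AttachmentMode: "file-system",
			AccessMode:     "single-node-writer",
		}}
	}
}
```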
Enterprise governance checks happen after dynamic host volumes are placed, so if
node pool governance is active and you don't set a node pool or node ID for a
volume, it's possible to get a placement that fails node pool governance even
though there might be other nodes in the cluster that would be valid placements.
Move the node pool governance for host volumes into the placement path, so that
we're checking a specific node pool when node pool or node ID are set, but
otherwise filtering out candidate nodes by node pool.
This changeset is the CE version of ENT/2200.
Ref: https://hashicorp.atlassian.net/browse/NET-11549
Ref: https://github.com/hashicorp/nomad-enterprise/pull/2200
Update dynamic host volume validation and update logic to allow for changes to
the node pool and plugin ID. If the client's node pool changes we'll sync up the
correct node pool for the volumes already placed on that client. We'll also
allow the plugin ID to be changed to allow for new versions of plugins
supporting the same volume over time.
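An illustrative sketch of the node pool sync, with simplified types standing in for the real volume state:

```go
// Sketch only: when a client's node pool changes, bring the node pool
// recorded on its existing dynamic host volumes back in line.
package sketch

type HostVolume struct {
	NodePool string
	PluginID string // may now change across updates as plugin versions evolve
}

func syncNodePool(clientPool string, vols []*HostVolume) {
	for _, vol := range vols {
		if vol.NodePool != clientPool {
			vol.NodePool = clientPool
		}
	}
}
```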
The nightly runs for E2E have been failing the recently added dynamic host
volumes tests for a number of reasons:
* Adding timing logs to the tests shows that it can take over 5s (the original
test timeout) for the client fingerprint to show up on the client. This seems
like a lot but appears to be host-dependent, because it's much faster locally.
Extend the timeout and leave in the timing logs so that we can keep an eye on
this problem in the future.
* The register test doesn't wait for the dispatched job to complete, and the
dispatched job was actually broken when TLS was in use because we weren't using
the Task API socket. Fix the jobspec for the dispatched job and add waiting
for the dispatched allocation to be marked complete before checking for the
volume on the server.
I've also changed both mounter jobs to batch workloads, so that we don't have
to wait 10s for the deployment to complete.
In #24694 we did a major refactoring of the E2E Terraform configuration. After
deploying a cluster this morning, I noticed a few moved/removed files were not
reflected in the .gitignore files. This changeset updates the .gitignore to have
no unstaged files after applying.
When using the register workflow, `capacity_max` is ignored, so it is likely
unset. If the volume is then updated later, the check we had for valid updates
assumes that the value was previously set. Only perform this check if the value
is set.
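A sketch of the guarded check with assumed field names; the point is simply that the comparison is skipped when no maximum was previously set:

```go
// Sketch only: skip the capacity_max comparison when the existing volume
// never had a maximum recorded (as with the register workflow).
package sketch

import "fmt"

type HostVolume struct {
	CapacityBytes             int64
	RequestedCapacityMaxBytes int64
}

func validateCapacityUpdate(existing, updated *HostVolume) error {
	if existing.RequestedCapacityMaxBytes == 0 {
		// capacity_max was never set; nothing to compare against
		return nil
	}
	if updated.RequestedCapacityMaxBytes != 0 &&
		updated.RequestedCapacityMaxBytes < existing.CapacityBytes {
		return fmt.Errorf("capacity_max cannot be less than the volume's existing capacity")
	}
	return nil
}
```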
We changed the list of access modes available for dynamic host volumes in #24705
but neglected to change them in the API package. Update the API package to
match.
Ref: https://github.com/hashicorp/nomad/pull/24705