When we add a Sentinel scope for dynamic host volumes, having a default `-scope`
value for `sentinel apply` risks accidentally adding policies for volumes to the
job scope. This would immediately prevent any job from being submitted. Forcing
the administrator to pass a `-scope` will prevent accidental misuse.
Ref: https://github.com/hashicorp/nomad-enterprise/pull/2087
Ref: https://github.com/hashicorp/nomad/pull/24479
The create/register volume RPCs support a policy override flag for
soft-mandatory Sentinel policies, but the CLI and Go API were missing support
for it.
Also add support for Sentinel warnings to the Go API and CLI.
Ref: https://github.com/hashicorp/nomad/pull/24479
In #24528 we added monitoring to the CLI for dynamic host volume creation. But
when the volume's namespace is set by the volume specification instead of the
`-namespace` flag, the API client doesn't have the right namespace and gets a
404 when setting up the monitoring. The specification always overrides the
`-namespace` flag, so use the specification's namespace, when available, for
all subsequent API calls.
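A minimal sketch of that precedence, assuming the spec's namespace has already
been parsed out (parameter names are illustrative, not the exact CLI code):

```go
package volume

import "github.com/hashicorp/nomad/api"

// monitorOpts prefers the namespace parsed from the volume specification over
// the -namespace flag when building the query options used by the follow-up
// monitoring calls.
func monitorOpts(flagNamespace, specNamespace string) *api.QueryOptions {
	opts := &api.QueryOptions{Namespace: flagNamespace}
	if specNamespace != "" {
		opts.Namespace = specNamespace // the specification always wins
	}
	return opts
}
```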
Ref: https://github.com/hashicorp/nomad/pull/24479
Adds dynamic host volumes to argument autocomplete for the `volume status` and
`volume delete` commands. Adds flag autocompletion for those commands plus
`volume create`.
Ref: https://github.com/hashicorp/nomad/pull/24479
Most Nomad upsert RPCs accept a single object, with the notable exception of
CSI, which accepts a slice of volumes. But even in CSI we don't actually expose
this to users except through the Go API, and it deeply complicates how we
present errors to users, especially once Sentinel policy enforcement enters the
mix.
Refactor the `HostVolume.Create` and `HostVolume.Register` RPCs to take a single
volume instead of a slice of volumes.
Add a stub function for Enterprise policy enforcement. This requires splitting
out placement from the `createVolume` function so that we can ensure we've
completed placement before trying to enforce policy.
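As a rough illustration of the shape change (these are not the exact Nomad
structs, only a sketch of the before/after):

```go
package structs

// HostVolume stands in for the real volume struct in nomad/structs.
type HostVolume struct {
	ID   string
	Name string
}

// Before: the request carried a slice of volumes, CSI-style.
type hostVolumeCreateRequestOld struct {
	Volumes []*HostVolume
}

// After: a single volume per request, like most other Nomad upsert RPCs.
// Errors (and Sentinel policy results) now map to exactly one volume.
type hostVolumeCreateRequest struct {
	Volume *HostVolume
}
```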
Ref: https://github.com/hashicorp/nomad/pull/24479
Add support for dynamic host volumes to the search endpoint. As with many other
objects with UUID identifiers, we don't support fuzzy search here, only prefix
search on the fuzzy search endpoint.
Because the search endpoint only returns IDs, we need to separate CSI volumes
and host volumes for it to be useful. The new context is called `"host_volumes"`
to disambiguate it from `"volumes"`. In future versions of Nomad we should
consider deprecating the `"volumes"` context in favor of a `"csi_volumes"`
context.
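From the Go API side, a prefix search against the new context might look like
the sketch below; the context value is built as a raw string here rather than
assuming a named constant exists in the `contexts` package:

```go
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/nomad/api"
	"github.com/hashicorp/nomad/api/contexts"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// The new "host_volumes" search context described above.
	hostVolumes := contexts.Context("host_volumes")

	// Prefix search only: dynamic host volume IDs are UUIDs, so fuzzy
	// matching isn't supported for them.
	resp, _, err := client.Search().PrefixSearch("6f0f19a5", hostVolumes, nil)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(resp.Matches[hostVolumes])
}
```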
Ref: https://github.com/hashicorp/nomad/pull/24479
Also ensure that the volume ID is UUID-shaped, so that user-provided input
like `id = "../../../"`, which is used as part of the target directory,
cannot find its way very far into the volume submission process.
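A minimal sketch of that guard, using a plain regex check (the helper name is
illustrative, not the exact Nomad code):

```go
package main

import (
	"fmt"
	"regexp"
)

// uuidRe matches the canonical 8-4-4-4-12 hex UUID layout.
var uuidRe = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// validateVolumeID rejects anything that isn't UUID-shaped, so path-like
// input such as "../../../" can't reach the code that builds the target
// directory.
func validateVolumeID(id string) error {
	if !uuidRe.MatchString(id) {
		return fmt.Errorf("invalid volume ID %q: not a UUID", id)
	}
	return nil
}

func main() {
	fmt.Println(validateVolumeID("../../../"))                            // error
	fmt.Println(validateVolumeID("c8f92f8f-3f55-5b67-9a2e-1f6c6a0f2f3d")) // nil
}
```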
When dynamic host volumes are created, they're written to the state store in a
"pending" state. Once the client fingerprints the volume it's eligible for
scheduling, so we mark the state as ready at that point.
Because the fingerprint could potentially be returned before the RPC handler has
a chance to write to the state store, this changeset adds test coverage to
verify that upserts of pending volumes check the node for a
previously-fingerprinted volume as well.
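A simplified sketch of the upsert-side check, with stand-in types for the real
state store objects:

```go
package main

import "fmt"

// Volume and Node are simplified stand-ins for the real state store types.
type Volume struct {
	ID    string
	Name  string
	State string // "pending" or "ready"
}

type Node struct {
	// HostVolumes holds fingerprinted volumes keyed by name.
	HostVolumes map[string]struct{}
}

// upsertVolume marks a pending volume ready if the node has already
// fingerprinted it, covering the race where the fingerprint arrives before
// the RPC handler writes the volume to the state store.
func upsertVolume(vol *Volume, node *Node) {
	if vol.State == "pending" {
		if _, ok := node.HostVolumes[vol.Name]; ok {
			vol.State = "ready"
		}
	}
}

func main() {
	node := &Node{HostVolumes: map[string]struct{}{"shared-data": {}}}
	vol := &Volume{ID: "abc123", Name: "shared-data", State: "pending"}
	upsertVolume(vol, node)
	fmt.Println(vol.State) // "ready"
}
```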
Ref: https://github.com/hashicorp/nomad/pull/24479
When making a request to create a dynamic host volume, users can pass a node
pool and constraints instead of a specific node ID.
This changeset implements the node scheduling logic by instantiating a node
pool filter and a constraint checker borrowed from the scheduler package.
Because host volumes with the same name can't land on the same host, we don't
need to support `distinct_hosts`/`distinct_property`; that would be challenging
anyway without building out a much larger node iteration mechanism to keep
track of usage across multiple hosts.
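A simplified sketch of that placement flow, with stand-in types for the
scheduler's node pool filter and constraint checker:

```go
package main

import "fmt"

// Node and Constraint are simplified stand-ins for the scheduler types.
type Node struct {
	ID    string
	Pool  string
	Attrs map[string]string
}

type Constraint struct {
	Attr  string
	Value string
}

// filterNodes mirrors the placement flow described above: filter by node
// pool, then run each candidate through the constraint checker. The real
// implementation reuses the scheduler package's checkers.
func filterNodes(nodes []*Node, pool string, constraints []Constraint) []*Node {
	var out []*Node
	for _, n := range nodes {
		if pool != "" && n.Pool != pool {
			continue
		}
		ok := true
		for _, c := range constraints {
			if n.Attrs[c.Attr] != c.Value {
				ok = false
				break
			}
		}
		if ok {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	nodes := []*Node{
		{ID: "n1", Pool: "default", Attrs: map[string]string{"os": "linux"}},
		{ID: "n2", Pool: "gpu", Attrs: map[string]string{"os": "linux"}},
	}
	candidates := filterNodes(nodes, "default", []Constraint{{Attr: "os", Value: "linux"}})
	fmt.Println(len(candidates)) // 1
}
```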
Ref: https://github.com/hashicorp/nomad/pull/24479
* mkdir: HostVolumePluginMkdir: just creates a directory
* example-host-volume: HostVolumePluginExternal: a plugin script that runs
  mkfs and mounts a loopback device
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Add several validation steps in the create/register RPCs for dynamic host
volumes. We first check that submitted volumes are self-consistent (e.g. max
capacity is greater than min capacity), then that any updates we've made are
valid. We also validate against state: claimed volumes can't be updated, and
placement requests can't target nodes that don't exist.
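A minimal sketch of the first (self-consistency) step, with a stand-in request
struct:

```go
package main

import (
	"errors"
	"fmt"
)

// volumeRequest is a simplified stand-in for the real request struct.
type volumeRequest struct {
	CapacityMinBytes int64
	CapacityMaxBytes int64
	NodeID           string
}

// validateSelfConsistency sketches the first validation step: the submitted
// volume must make sense on its own before we compare it against existing
// state or attempt placement.
func validateSelfConsistency(req *volumeRequest) error {
	if req.CapacityMaxBytes != 0 && req.CapacityMaxBytes < req.CapacityMinBytes {
		return errors.New("capacity_max must be greater than or equal to capacity_min")
	}
	return nil
}

func main() {
	err := validateSelfConsistency(&volumeRequest{
		CapacityMinBytes: 100 << 20, // 100 MiB
		CapacityMaxBytes: 50 << 20,  // 50 MiB
	})
	fmt.Println(err)
}
```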
Ref: https://github.com/hashicorp/nomad/issues/15489
The `HostVolumeByID` state store method didn't add a watch channel to the
watchset, which meant that it would never unblock. The tests missed this because
they were racy, so move the updates for unblocking tests into a `time.After`
call to ensure the queries are blocked before the update happens.
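The fix follows the usual go-memdb blocking-query pattern; this sketch uses
illustrative table and index names:

```go
package state

import (
	memdb "github.com/hashicorp/go-memdb"
)

// hostVolumeByID sketches the fixed lookup: use FirstWatch instead of First
// and add the returned channel to the watchset, so blocking queries actually
// unblock when the volume changes.
func hostVolumeByID(txn *memdb.Txn, ws memdb.WatchSet, id string) (interface{}, error) {
	watchCh, obj, err := txn.FirstWatch("host_volumes", "id", id)
	if err != nil {
		return nil, err
	}
	ws.Add(watchCh) // this was the missing piece
	return obj, nil
}
```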
This changeset implements the HTTP API endpoints for Dynamic Host Volumes.
The `GET /v1/volumes` endpoint is shared between CSI and DHV with a query
parameter for the type. In the interest of getting some working handlers
available for use in development (and minimizing the size of the diff to
review), this changeset doesn't do any sort of refactoring of how the existing
List Volumes CSI endpoint works. That will come in a later PR, as will the
corresponding `api` package updates we need to support the CLI.
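For development, the shared list endpoint can be exercised with a plain HTTP
client; the value of the `type` query parameter shown here is an assumption:

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	// List dynamic host volumes via the shared volumes endpoint. The exact
	// value of the type parameter ("host" here) is an assumption for
	// illustration.
	resp, err := http.Get("http://127.0.0.1:4646/v1/volumes?type=host")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(body))
}
```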
Ref: https://hashicorp.atlassian.net/browse/NET-11549
This changeset implements the RPC handlers for Dynamic Host Volumes, including
the plumbing needed to forward requests to clients. The client-side
implementation is stubbed and will be done under a separate PR.
Ref: https://hashicorp.atlassian.net/browse/NET-11549
This changeset implements the ACLs required for dynamic host volumes RPCs:
* `host-volume-write` is a coarse-grained policy that implies all operations.
* `host-volume-register` is the highest fine-grained privilege because it
potentially bypasses quotas.
* `host-volume-create` is implicitly granted by `host-volume-register`.
* `host-volume-delete` is implicitly granted only by `host-volume-write`.
* `host-volume-read` is implicitly granted by `policy = "read"`.
These are namespaced operations, so the testing here is predominantly around
parsing and granting of implicit capabilities rather than the well-tested
`AllowNamespaceOperation` method.
This changeset does not include any changes to the `host_volumes` policy which
we'll need for claiming volumes on job submit. That'll be covered in a later PR.
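The implication rules above can be summarized roughly as the following map;
the real expansion logic in the `acl` package is more involved, so treat this
as a sketch:

```go
package main

import "fmt"

// impliedCapabilities encodes only the rules listed above.
var impliedCapabilities = map[string][]string{
	// host-volume-write implies all operations.
	"host-volume-write": {
		"host-volume-register",
		"host-volume-create",
		"host-volume-delete",
		"host-volume-read",
	},
	// host-volume-register implies create (but not delete).
	"host-volume-register": {"host-volume-create"},
	// host-volume-read is also granted by the coarse namespace
	// policy = "read", which isn't modeled in this sketch.
}

// expand returns the transitive closure of a set of capabilities.
func expand(caps []string) map[string]bool {
	out := map[string]bool{}
	var walk func(c string)
	walk = func(c string) {
		if out[c] {
			return
		}
		out[c] = true
		for _, implied := range impliedCapabilities[c] {
			walk(implied)
		}
	}
	for _, c := range caps {
		walk(c)
	}
	return out
}

func main() {
	fmt.Println(expand([]string{"host-volume-register"}))
}
```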
Ref: https://hashicorp.atlassian.net/browse/NET-11549
The parameters used for the reusable action have been incorrect since the
5.0.1 update. The permissions were also incorrect, as the workflow needs to
write to issues and PRs.
This change creates a reusable workflow for notifying Slack on CI
failures. The message will include useful links and information
about the failure, so product engineers can investigate and fix
any problems.
The new workflow is used by selected workflows which trigger on
merges to main or release/* branches. The notification is only
sent on failure and when the event was a push (PR merge), meaning the number
of notifications should be minimal.
The aim is to help identify and draw attention to failures across our release
branches, in particular those that happen during automated processes.
Nomad sets a default port when resolving server addresses that don't have
one. When we get a "bare" IPv6 address without a port, we end up with an
unexpected error "too many colons in address" when we try to split the address
into host and port, because the standard library function expects IPv6
addresses to be wrapped in brackets as recommended by RFC 5952.
User-configured addresses avoid
this problem by accepting IP address and port as separate configuration values,
but go-discover emits "bare" IPv6 addresses without a port in IPv6 environments.
Fix this by adding brackets to IPv6 addresses when we get the "too many colons"
error from the stdlib. This will still give erroneous results if the address
includes the port but is missing brackets, but there's no way to unambiguously
parse that address.
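A minimal sketch of the approach, using only the standard library (the default
port and error matching are illustrative):

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// normalizeAddr sketches the fix: when SplitHostPort fails on a bare IPv6
// address with "too many colons", wrap the address in brackets and append the
// default port.
func normalizeAddr(addr, defaultPort string) (string, error) {
	_, _, err := net.SplitHostPort(addr)
	if err == nil {
		return addr, nil
	}
	if strings.Contains(err.Error(), "too many colons") {
		// Bare IPv6 address without a port, e.g. as emitted by go-discover.
		// JoinHostPort adds the brackets for us.
		return net.JoinHostPort(addr, defaultPort), nil
	}
	if strings.Contains(err.Error(), "missing port") {
		return net.JoinHostPort(addr, defaultPort), nil
	}
	return "", err
}

func main() {
	fmt.Println(normalizeAddr("2001:db8::1", "4647")) // "[2001:db8::1]:4647"
	fmt.Println(normalizeAddr("10.0.0.1", "4647"))    // "10.0.0.1:4647"
}
```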
Ref: https://www.rfc-editor.org/rfc/rfc5952
Fixes: https://github.com/hashicorp/nomad/issues/24608
* Experimenting with a generic meta job-part component
* Taskstate.task gets me every time
* continue-on-error false test
* continue-on-error back in, but explicit success check after exam
* Testfixes for new meta structure on tasks and groups
* Clean up test and dev code
* Fixes an issue where system jobs' status was set to Scaled Down when their allocs were garbage collected
* Added to aggregateAllocStatus acceptance test and changelog
In #16872 we added support for unix domain sockets, but this required mutating
the `Config` when parsing the address so as to remove the port number. In #23785
we fixed a bug where if the configuration was used across multiple clients that
mutation would happen multiple times and the address would be incorrectly
parsed.
When making `alloc log`, `alloc fs`, or `alloc exec` calls where we have
line-of-sight to the client, we attempt to make an HTTP API call directly to the
client node. So we create a new API client from the same configuration and then
set the address. But in this case we copy the private `url` field and that
causes the URL parsing to be skipped for the new client.
This results in the region always being set to the string literal
`"global"` (because of mTLS handling code introduced all the way back in
4d3b75d867), unless the user has set the region specifically. This fails with
an error "no path to region" when the cluster isn't non-global and requests are
sent to a non-leader.
Arguably the "right" way of fixing this would be for `ClientConfig` not to
change the API client's region to `"global"` in the first place, but as this is
a public API and extremely longstanding behavior, it could potentially be a
breaking change for some downstream consumers. Instead, we'll avoid copying the
private `url` field so that the new address is re-parsed.
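The gist of the fix, with a simplified stand-in for the API client config (the
real `api.Config` is more involved):

```go
package main

import (
	"fmt"
	"net/url"
)

// clientConfig is a simplified stand-in for the api package's Config: it
// caches a parsed URL alongside the address string, which is the private
// field this bug was about.
type clientConfig struct {
	Address string
	url     *url.URL // parsed lazily; copying it skips re-parsing
}

// copyForNode builds a config for a direct-to-client call. The fix is to copy
// everything *except* the cached url, so setting the new address forces a
// fresh parse (and with it, fresh region handling).
func (c *clientConfig) copyForNode(nodeAddr string) (*clientConfig, error) {
	nc := &clientConfig{Address: nodeAddr} // note: c.url deliberately not copied
	u, err := url.Parse(nodeAddr)
	if err != nil {
		return nil, err
	}
	nc.url = u
	return nc, nil
}

func main() {
	base := &clientConfig{Address: "https://nomad.example.com:4646"}
	nc, _ := base.copyForNode("https://10.0.0.5:4646")
	fmt.Println(nc.Address)
}
```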
Fixes: https://github.com/hashicorp/nomad/issues/24635
Fixes: https://github.com/hashicorp/nomad/issues/24609
Ref: https://github.com/hashicorp/nomad/pull/16872
Ref: https://github.com/hashicorp/nomad/pull/23785
Ref: 4d3b75d867
When a Nomad host reboots, the network namespace files in the tmpfs in
`/var/run` are wiped out. So when we restore allocations after a host reboot, we
need to be able to restore both the network namespace and the network
configuration. But because the netns is newly created and we need to run the CNI
plugins again, this creates potential conflicts with the IPAM plugin, which has
written state to persistent disk at `/var/lib/cni`. These IPs aren't the ones
advertised to Consul, so there's no particular reason to keep them around after
a host reboot because all virtual interfaces need to be recreated too.
Update the CNI bridge configuration to use `/var/run/cni` as its state
directory. We already expect this location to be created by CNI because the
netns files are hard-coded to be created there too in `libcni`.
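A sketch of the relevant `ipam` section of the bridge configuration, built as
a Go map to keep this in one language; `dataDir` is a documented `host-local`
option (see the host-local Ref below), while the other values are illustrative:

```go
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// Sketch of the ipam section of Nomad's bridge CNI config. Pointing
	// dataDir at /var/run/cni keeps IPAM state on the same tmpfs as the
	// netns files, so both disappear together on reboot.
	ipam := map[string]any{
		"type":    "host-local",
		"dataDir": "/var/run/cni",
		"ranges": [][]map[string]string{
			{{"subnet": "172.26.64.0/20"}},
		},
	}
	out, _ := json.MarshalIndent(ipam, "", "  ")
	fmt.Println(string(out))
}
```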
Note this does not fix the problem described for Docker in #24292 because that
appears to be related to the netns itself being restored unexpectedly from
Docker's state.
Ref: https://github.com/hashicorp/nomad/issues/24292#issuecomment-2537078584
Ref: https://www.cni.dev/plugins/current/ipam/host-local/#files
* func: make paths relative
* func: make paths relative to the module inside the e2e terraform folder
* fix: add license files to gitignore
* func: move /etc and update all paths
* Uncomment forgotten code
* fix: update the path to the tls certificates to be local to the instance
The Nomad client can now optionally emit telemetry data from the
prerun and prestart hooks. This allows operators to monitor and
alert on failures and the time taken to complete them.
The new datapoints are:
- nomad.client.alloc_hook.prerun.success (counter)
- nomad.client.alloc_hook.prerun.failed (counter)
- nomad.client.alloc_hook.prerun.elapsed (sample)
- nomad.client.task_hook.prestart.success (counter)
- nomad.client.task_hook.prestart.failed (counter)
- nomad.client.task_hook.prestart.elapsed (sample)
The hook execution time is useful to Nomad engineering and will help us
optimize code where possible and understand how job specifications impact
hook performance.
Currently only the PreRun and PreStart hooks have telemetry enabled, to limit
the number of new metrics being produced.
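A sketch of how these datapoints map onto go-metrics calls; the real hook
wrappers in the client differ, and the hook function here is a placeholder:

```go
package main

import (
	"time"

	metrics "github.com/hashicorp/go-metrics"
)

// runPrerunHook wraps a hook and emits the counters and sample described
// above. The "nomad" prefix on the documented metric names comes from the
// agent's telemetry configuration, not from these keys.
func runPrerunHook(hook func() error) error {
	start := time.Now()
	err := hook()

	// nomad.client.alloc_hook.prerun.elapsed (sample)
	metrics.MeasureSince([]string{"client", "alloc_hook", "prerun", "elapsed"}, start)

	if err != nil {
		// nomad.client.alloc_hook.prerun.failed (counter)
		metrics.IncrCounter([]string{"client", "alloc_hook", "prerun", "failed"}, 1)
		return err
	}
	// nomad.client.alloc_hook.prerun.success (counter)
	metrics.IncrCounter([]string{"client", "alloc_hook", "prerun", "success"}, 1)
	return nil
}

func main() {
	_ = runPrerunHook(func() error { return nil })
}
```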