5126 Commits

Author SHA1 Message Date
Tim Gross
7add04eb0f refactor: volume request modes to be generic between DHV/CSI (#24896)
When we implemented CSI, the types of the fields for access mode and attachment
mode on volume requests were defined with a prefix "CSI". This gets confusing
now that we have dynamic host volumes using the same fields. Fortunately the
original was a typedef on string, and the Go API in the `api` package just uses
strings directly, so we can change the name of the type without breaking
backwards compatibility for the msgpack wire format.

Update the names to `VolumeAccessMode` and `VolumeAttachmentMode`. Keep the CSI
and DHV specific value constant names for these fields (they aren't currently
1:1), so that we can easily differentiate in a given bit of code which values
are valid.
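
A minimal sketch of why the rename is wire-compatible, assuming a string typedef as described above. The constant names and the `request` struct are illustrative only, and `encoding/json` stands in for msgpack to keep the example stdlib-only; the point is that the Go type name never appears in the encoded output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// VolumeAccessMode is the generic name from the commit message; the
// underlying type is still string.
type VolumeAccessMode string

// Illustrative constants only; the real names and values live in Nomad.
const (
	CSIVolumeAccessModeSingleNodeWriter  VolumeAccessMode = "single-node-writer"
	HostVolumeAccessModeSingleNodeWriter VolumeAccessMode = "single-node-writer"
)

// request is a hypothetical stand-in for a volume request.
type request struct {
	AccessMode VolumeAccessMode
}

// encode serializes a request; only the string value is written out.
func encode(r request) string {
	b, err := json.Marshal(r)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(encode(request{AccessMode: CSIVolumeAccessModeSingleNodeWriter}))
}
```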

Ref: https://github.com/hashicorp/nomad/pull/24881#discussion_r1920702890
2025-01-24 10:37:48 -05:00
Piotr Kazmierczak
3d7e4fd634 client: always initialize node.HostVolumes map (#24910)
The default node configuration in the client should always set an empty
HostVolumes map. Otherwise callers can panic, e.g.:

goroutine 179 [running]:
github.com/hashicorp/nomad/client/hostvolumemanager.UpdateVolumeMap({0x36042b0, 0xc000c62a80}, 0x0, {0xc000a802a0, 0xd}, 0xc000691940)
	github.com/hashicorp/nomad/client/hostvolumemanager/volume_fingerprint.go:43 +0x1b2
github.com/hashicorp/nomad/client.(*Client).batchFirstFingerprints.func1({0xc000a802a0, 0xd}, 0xc000691940)
	github.com/hashicorp/nomad/client/node_updater.go:54 +0xd7
github.com/hashicorp/nomad/client.(*batchNodeUpdates).batchHostVolumeUpdates(0xc000912608?, 0xc0009f2f88)
	github.com/hashicorp/nomad/client/node_updater.go:417 +0x152
github.com/hashicorp/nomad/client.(*Client).batchFirstFingerprints(0xc000c2d188)
	github.com/hashicorp/nomad/client/node_updater.go:53 +0x1c5
created by github.com/hashicorp/nomad/client.NewClient in goroutine 1
	github.com/hashicorp/nomad/client/client.go:557 +0x2069

is a panic of the HVM when restarting a client that doesn't have any static
host volumes, but does have a dynamic host volume.
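
The failure mode is the usual Go nil-map write. A small sketch, using a hypothetical `nodeConfig` type rather than Nomad's real node structure, shows the panic and how always initializing the map avoids it:

```go
package main

import "fmt"

// nodeConfig is a hypothetical stand-in for the client's node structure.
type nodeConfig struct {
	HostVolumes map[string]string
}

// newNodeConfig always returns a config with a non-nil HostVolumes map.
func newNodeConfig() *nodeConfig {
	return &nodeConfig{HostVolumes: map[string]string{}}
}

// addVolume writes into the map and reports whether the write panicked.
func addVolume(n *nodeConfig, name, path string) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	n.HostVolumes[name] = path // panics when HostVolumes is nil
	return false
}

func main() {
	fmt.Println(addVolume(&nodeConfig{}, "data", "/srv/data"))   // nil map
	fmt.Println(addVolume(newNodeConfig(), "data", "/srv/data")) // initialized map
}
```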
2025-01-21 20:45:04 +01:00
James Rasell
689f935e0a services: Support TLS Skip Verify within Nomad service checks. (#24781)
Checks within a service using the Nomad provider can now utilise
the `tls_skip_verify` parameter.
2025-01-15 07:39:39 +00:00
Daniel Bennett
985eb53c65 dynamic host volumes: plugin spec tweaks (#24848)
* prefix plugin env vars with DHV_
* add env: DHV_VOLUME_ID, DHV_PLUGIN_DIR
* 5s timeout on fingerprint calls
2025-01-13 14:18:10 -06:00
Tim Gross
cca9a5320d testing: fix test flake in dynamic host volume client tests (#24836)
The output of `GetDynamicHostVolumes` is a slice but that slice is constructed
from iterating over a map and isn't sorted. Sort the output in the test to
eliminate a test flake.
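
The pattern behind the flake can be sketched as follows. The function names are hypothetical; the only assumption is that the slice is built by ranging over a map, whose iteration order Go does not specify.

```go
package main

import (
	"fmt"
	"sort"
)

// volumeIDs collects map keys; the resulting order varies between runs.
func volumeIDs(vols map[string]struct{}) []string {
	ids := make([]string, 0, len(vols))
	for id := range vols {
		ids = append(ids, id)
	}
	return ids
}

// sortedVolumeIDs gives tests a deterministic order to assert against.
func sortedVolumeIDs(vols map[string]struct{}) []string {
	ids := volumeIDs(vols)
	sort.Strings(ids)
	return ids
}

func main() {
	vols := map[string]struct{}{"c": {}, "a": {}, "b": {}}
	fmt.Println(sortedVolumeIDs(vols))
}
```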
2025-01-10 14:48:05 -05:00
Michael Smithhisler
606ce9dd90 deps: upgrade aws-sdk-go from v1 to v2 (#24720) 2025-01-09 17:27:19 -05:00
Tim Gross
4a65b21aab dynamic host volumes: send register to client for fingerprint (#24802)
When we register a volume without a plugin, we need to send a client RPC so that
the node fingerprint can be updated. The registered volume also needs to be
written to client state so that we can restore the fingerprint after a restart.

Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
2025-01-08 16:58:58 -05:00
Piotr Kazmierczak
7726ae68c6 client: move 'waiting for previous alloc to terminate' log messages to info (#24804) 2025-01-08 15:44:35 +01:00
Michael Smithhisler
34a34e7233 plugins: validate logmon process during reattach (#24798) 2025-01-08 08:50:33 -05:00
Tim Gross
08a6f870ad cni: use check command when restoring from restart (#24658)
When the Nomad client restarts and restores allocations, the network namespace
for an allocation may exist but no longer be correctly configured. For example,
if the host is rebooted and the task was a Docker task using a pause container,
the network namespace may be recreated by the docker daemon.

When we restore an allocation, use the CNI "check" command to verify that any
existing network namespace matches the expected configuration. This requires CNI
plugins of at least version 1.2.0 to avoid a bug in older plugin versions that
would cause the check to fail.

If the check fails, destroy the network namespace and try to recreate it from
scratch once. If that fails in the second pass, fail the restore so that the
allocation can be recreated (rather than silently having networking fail).
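
The control flow described above can be sketched roughly as below. The `netnsOps` hooks are hypothetical placeholders, not Nomad's actual interfaces, and the sketch only captures the "check, then rebuild at most once" shape.

```go
package main

import (
	"errors"
	"fmt"
)

// netnsOps groups the operations the restore path needs. All three are
// hypothetical hooks standing in for the real CNI and netns calls.
type netnsOps struct {
	check   func() error // e.g. run the CNI "check" command
	destroy func() error
	create  func() error
}

func succeed() error { return nil }
func fail() error    { return errors.New("operation failed") }

// restoreNetwork verifies an existing namespace and rebuilds it at most once.
func restoreNetwork(ops netnsOps) error {
	if err := ops.check(); err == nil {
		return nil // existing namespace matches the expected configuration
	}
	if err := ops.destroy(); err != nil {
		return fmt.Errorf("destroying stale network namespace: %w", err)
	}
	if err := ops.create(); err != nil {
		// Second pass failed: surface the error so the alloc is recreated.
		return fmt.Errorf("recreating network namespace: %w", err)
	}
	return nil
}

func main() {
	fmt.Println(restoreNetwork(netnsOps{check: succeed, destroy: succeed, create: succeed}))
	fmt.Println(restoreNetwork(netnsOps{check: fail, destroy: succeed, create: fail}))
}
```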

This should fix the gap left by #24650 for Docker task drivers and any other
drivers with the `MustInitiateNetwork` capability.

Fixes: https://github.com/hashicorp/nomad/issues/24292
Ref: https://github.com/hashicorp/nomad/pull/24650
2025-01-07 09:38:39 -05:00
Daniel Bennett
a9ee66a6ef dynamic host volumes: unique volume name per node (#24748)
a node can have only one volume with a given name.

the scheduler prevents duplicates, but can only
do so after the server knows about the volume.
this prevents multiple concurrent creates being
called faster than the fingerprint/heartbeat interval.

users may still modify an existing volume only
if they set the `id` in the volume spec and
re-issue `nomad volume create`

if a *static* vol is added to config with a name
already being used by a dynamic volume, the
dynamic takes precedence, but log a warning.
2025-01-06 15:37:20 -06:00
Daniel Bennett
459453917e dynamic host volumes: client-side tests, comments, tidying (#24747) 2025-01-06 13:20:07 -06:00
Charles Z.
f7b12dc54e add noswap to secretdir tmpfs (#24645) 2025-01-06 09:44:43 -05:00
Daniel Bennett
af967184a6 dynamic host volumes: tweak plugin fingerprint (#24711)
Instead of a plugin `version` subcommand that responds with a string
(established in #24497), respond to a `fingerprint` command with a data
structure that we may extend in the future (such as plugin capabilities,
like size constraint support?). In the immediate term, it's still just the
version: `{"version": "0.0.1"}`

In addition to leaving the door open for future expansion, I think it will
also avoid false positives detecting executables that just happen to
respond to a `version` command.

This also reverses the ordering of the fingerprint string parts
from `plugins.host_volume.version.mkdir` (which aligned with CNI)
to `plugins.host_volume.mkdir.version` (makes more sense to me)
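
A sketch of how a caller might consume the structured response, assuming only the `{"version": "..."}` shape and the attribute ordering given above. The type and function names are hypothetical. Requiring valid JSON also shows why an arbitrary executable that prints a bare version string would no longer be mistaken for a plugin.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// fingerprint mirrors the response shape from the commit message; more
// fields could be added later without breaking older callers.
type fingerprint struct {
	Version string `json:"version"`
}

// parseFingerprint rejects output that is not a JSON object with a version.
func parseFingerprint(out []byte) (string, error) {
	var fp fingerprint
	if err := json.Unmarshal(out, &fp); err != nil {
		return "", fmt.Errorf("plugin output is not a fingerprint: %w", err)
	}
	if fp.Version == "" {
		return "", errors.New("fingerprint is missing a version")
	}
	return fp.Version, nil
}

// attributeKey builds the node attribute name in the new ordering.
func attributeKey(pluginID string) string {
	return "plugins.host_volume." + pluginID + ".version"
}

func main() {
	v, err := parseFingerprint([]byte(`{"version": "0.0.1"}`))
	fmt.Println(attributeKey("mkdir"), v, err)
}
```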
2024-12-19 09:25:55 -05:00
Daniel Bennett
e76f5e0b4c dynamic host volumes: volume fingerprinting (#24613)
and expand the demo a bit
2024-12-19 09:25:54 -05:00
Daniel Bennett
05f1cda594 dynamic host volumes: client state (#24595)
store dynamic host volume creations in client state,
so they can be "restored" on agent restart. restore works
by repeating the same Create operation as initial creation,
and expecting the plugin to be idempotent.

this is (potentially) especially important after host restarts,
which may have dropped mount points or such.
2024-12-19 09:25:54 -05:00
Daniel Bennett
46a39560bb dynamic host volumes: fingerprint client plugins (#24589) 2024-12-19 09:25:54 -05:00
Daniel Bennett
2b04d47ac2 dynamic host volumes: test client RPC and plugins (#24535)
also ensure that volume ID is uuid-shaped so user-provided input
like `id = "../../../"` which is used as part of the target directory
can not find its way very far into the volume submission process
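
One way to express a "uuid-shaped" check is a strict pattern match. This is a sketch under that assumption; the regular expression and function name are illustrative and not taken from the Nomad source.

```go
package main

import (
	"fmt"
	"regexp"
)

// uuidShape accepts only the lowercase 8-4-4-4-12 hex layout, so path
// separators and dots can never appear in an accepted ID.
var uuidShape = regexp.MustCompile(
	`^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$`)

// isUUIDShaped reports whether id is safe to use as a directory name.
func isUUIDShaped(id string) bool {
	return uuidShape.MatchString(id)
}

func main() {
	fmt.Println(isUUIDShaped("123e4567-e89b-12d3-a456-426614174000"))
	fmt.Println(isUUIDShaped("../../../"))
}
```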
2024-12-19 09:25:54 -05:00
Daniel Bennett
c2dd97dee7 HostVolumePlugin interface and two implementations (#24497)
* mkdir: HostVolumePluginMkdir: just creates a directory
* example-host-volume: HostVolumePluginExternal:
  plugin script that does mkfs and mount loopback

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2024-12-19 09:25:54 -05:00
Tim Gross
6a3803c31e dynamic host volumes: RPC handlers (#24373)
This changeset implements the RPC handlers for Dynamic Host Volumes, including
the plumbing needed to forward requests to clients. The client-side
implementation is stubbed and will be done under a separate PR.

Ref: https://hashicorp.atlassian.net/browse/NET-11549
2024-12-19 09:25:54 -05:00
Tim Gross
30e57c39b0 discovery: correctly handle IPv6 addresses from go-discover (#24649)
Nomad sets a default port when resolving server addresses that don't have
one. When we get a "bare" IPv6 address without a port, we end up with an
unexpected error "too many colons in address" when we try to split the host
and port, because the standard library function expects IPv6 addresses to be
wrapped in brackets as recommended by RFC5952. User-configured addresses avoid
this problem by accepting IP address and port as separate configuration values,
but go-discover emits "bare" IPv6 addresses without a port in IPv6 environments.

Fix this by adding brackets to IPv6 addresses when we get the "too many colons"
error from the stdlib. This will still give erroneous results if the address
includes the port but is missing brackets, but there's no way to unambiguously
parse that address.
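
The behavior can be sketched with the standard library alone. The helper name and the default port used here are assumptions for illustration, and the sketch is not necessarily how Nomad structures the fix; it relies on `net.JoinHostPort` adding brackets whenever the host contains a colon.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

// withDefaultPort returns addr unchanged if it already has a port, and
// otherwise appends port, bracketing bare IPv6 addresses.
func withDefaultPort(addr string, port int) (string, error) {
	_, _, err := net.SplitHostPort(addr)
	if err == nil {
		return addr, nil // already host:port
	}
	msg := err.Error()
	if !strings.Contains(msg, "missing port") &&
		!strings.Contains(msg, "too many colons") {
		return "", err
	}
	// Strip any existing brackets so JoinHostPort does not double them.
	host := strings.TrimSuffix(strings.TrimPrefix(addr, "["), "]")
	return net.JoinHostPort(host, strconv.Itoa(port)), nil
}

func main() {
	for _, a := range []string{"10.0.0.1", "fe80::1", "[fe80::1]:4000"} {
		out, err := withDefaultPort(a, 4647)
		fmt.Println(out, err)
	}
}
```

As the commit message notes, an unbracketed IPv6 address that already includes a port is ambiguous; this sketch would treat the port as part of the address.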

Ref: https://www.rfc-editor.org/rfc/rfc5952
Fixes: https://github.com/hashicorp/nomad/issues/24608
2024-12-17 15:49:40 -05:00
Tim Gross
24fa7439df cni: use tmpfs location for ipam plugin (#24650)
When a Nomad host reboots, the network namespace files in the tmpfs in
`/var/run` are wiped out. So when we restore allocations after a host reboot, we
need to be able to restore both the network namespace and the network
configuration. But because the netns is newly created and we need to run the CNI
plugins again, this creates potential conflicts with the IPAM plugin which has
written state to persistent disk at `/var/lib/cni`. These IPs aren't the ones
advertised to Consul, so there's no particular reason to keep them around after
a host reboot because all virtual interfaces need to be recreated too.

Reconfigure the CNI bridge configuration to use `/var/run/cni` as its state
directory. We already expect this location to be created by CNI because the
netns files are hard-coded to be created there too in `libcni`.

Note this does not fix the problem described for Docker in #24292 because that
appears to be related to the netns itself being restored unexpectedly from
Docker's state.

Ref: https://github.com/hashicorp/nomad/issues/24292#issuecomment-2537078584
Ref: https://www.cni.dev/plugins/current/ipam/host-local/#files
2024-12-16 09:36:35 -05:00
James Rasell
7d48aa2667 client: emit optional telemetry from prerun and prestart hooks. (#24556)
The Nomad client can now optionally emit telemetry data from the
prerun and prestart hooks. This allows operators to monitor and
alert on failures and time taken to complete.

The new datapoints are:
  - nomad.client.alloc_hook.prerun.success (counter)
  - nomad.client.alloc_hook.prerun.failed (counter)
  - nomad.client.alloc_hook.prerun.elapsed (sample)

  - nomad.client.task_hook.prestart.success (counter)
  - nomad.client.task_hook.prestart.failed (counter)
  - nomad.client.task_hook.prestart.elapsed (sample)

The hook execution time is useful to Nomad engineering and will
help optimize code where possible and understand job specification
impacts on hook performance.

Currently only the PreRun and PreStart hooks have telemetry
enabled, so we limit the number of new metrics being produced.
2024-12-12 14:43:14 +00:00
Piotr Kazmierczak
3a18f22c18 goflags: go:build linux for tests that won't compile on other platforms (#24559)
I'm a heavy LSP user and I frequently goto:next_error. This confuses my
editor on macOS.
2024-11-28 15:05:00 +01:00
Piotr Kazmierczak
f7a4ded2c0 security: add CT executeTemplate to default function_denylist (#24541)
This PR adds Consul Template's executeTemplate function to the denylist by
default, in order to prevent accidental or malicious infinitely recursive
execution.

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2024-11-22 19:33:56 +01:00
Martijn Vegter
997da25cdb scheduler: take all assigned cpu cores into account instead of only those part of the largest lifecycle (#24304)
Fixes a bug in the AllocatedResources.Comparable method, where the scheduler
would only take into account the cpusets of the tasks in the largest lifecycle.
This could result in overlapping cgroup cpusets. Now we make the distinction
between reserved and fungible resources throughout the lifespan of the alloc.
In addition, this adds logging to catch future regressions without requiring
manual inspection of cgroup files.
2024-11-21 13:21:48 -05:00
Martijn Vegter
bfb714144e client: fixed a bug where AMD CPUs were not correctly fingerprinting base speed (#24415)
Relates to: #19468
2024-11-21 09:08:47 -06:00
James Rasell
beb4097e81 client: mark the remote_task hook as deprecated. (#24505) 2024-11-20 15:32:50 +00:00
Florian Apolloner
0a343798b6 Add NOMAD_* variables to CNI args. Fixes #23830 (#24319)
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
2024-11-19 12:48:48 -08:00
Tim Gross
a420732424 consul: allow non-root Nomad to rewrite token (#24410)
When a task restarts, the Nomad client may need to rewrite the Consul token, but
it's created with permissions that prevent a non-root agent from writing to
it. While Nomad clients should be run as root (currently), it's harmless to
allow whatever user the Nomad agent is running as to be able to write to it, and
that's one less barrier to rootless Nomad.

Ref: https://github.com/hashicorp/nomad/issues/23859#issuecomment-2465757392
2024-11-19 10:21:14 -05:00
Gabi
89c3d69d79 nsutil: wrap error that comes from the syscall so caller can do errors.As (#24480)
Users of the `nsutil` library should be able to do the following and have it
work:

```
var errno syscall.Errno
if errors.As(err, &errno) {
    if errno == unix.EBUSY { ... }
}
```

This commit fixes that issue.
2024-11-19 10:24:49 +01:00
Tim Gross
6be9a50626 vault: catch expired lease as fatal error (#24409)
When a Vault lease expires, it's revoked on the server and cannot be removed, so
this error should be treated as fatal.

The errors we get aren't wrapped by the Vault SDK, so unfortunately we have to
read the error messages and can't easily enumerate non-fatal error
messages (which might be bubbling up from the stdlib). I've audited the errors
currently used and have documented their source.

Ref 52ba156d47/vault/expiration.go (L1327)
Fixes: https://github.com/hashicorp/nomad/issues/23859
2024-11-18 09:12:35 -05:00
Michael Smithhisler
0714353324 fix: handle template re-renders on client restart (#24399)
When multiple templates with api functions are included in a task, it's
possible for consul-template to re-render templates as it creates
watchers, overwriting render event data. This change uses event fields
that do not get overwritten, and only executes the change mode for
templates that were actually written to disk.

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2024-11-08 12:49:38 -05:00
Seth Hoenig
4ef4bebd1f connect: handle grpc_address as gosockaddr/template string (#24280)
* connect: handle grpc_address as gosockaddr/template string

This PR fixes a bug where the consul.grpc_address could not be set using
a go-sockaddr/template string. This was inconsistent with how we do accept
such strings for consul.address values.

* add changelog
2024-11-07 09:04:58 -06:00
James Rasell
c44f933aeb test: ensure RPC only test client sets enterprise specific config. (#24376) 2024-11-06 13:43:25 +00:00
Tim Gross
a8b84a6eed testing: RPC-only test client helper (#24371)
In #10193 we introduced a testing helper that spins up a client RPC server
without the rest of the client operations so that we can make server-side client
RPC tests lighter. But this wasn't actually ever wired up to the intended
target. While working on Dynamic Host Volumes I noticed that this would be
useful for RPC tests.

This changeset fixes some bugs in the helper that arose from client code drift,
and makes it used by the client RPC tests for CSI. This will also get used for
the DHV RPC tests.

Ref: https://github.com/hashicorp/nomad/pull/10193
2024-11-05 14:59:53 -05:00
Juanadelacuesta
d0b015ec01 func: move the user and group type declarations 2024-10-31 10:34:26 +01:00
Juanadelacuesta
0cd1b5ff13 func: move the validation to a dependency and use id sets 2024-10-28 18:59:51 +01:00
Rodrigo Lourenço
cdebf96b0e fingerprint gce: collect preemptibility 2024-10-23 15:19:20 +02:00
Seth Hoenig
f1ce127524 jobspec: add a chown option to artifact block (#24157)
* jobspec: add a chown option to artifact block

This PR adds a boolean 'chown' field to the artifact block.

It indicates whether the Nomad client should chown the downloaded files
and directories to be owned by the task.user. This is useful for drivers
like raw_exec and exec2 which are subject to the host filesystem user
permissions structure. Before, these drivers might not be able to use or
manage the downloaded artifacts since they would be owned by the root
user on a typical Nomad client configuration.

* api: no need for pointer of chown field
2024-10-11 11:30:27 -05:00
Tim Gross
b7595c646d alloc fs: use case-insensitive check for reads of secret/private dir (#24125)
When using the Client FS APIs, we check to ensure that reads don't traverse into
the allocation's secret dir and private dir. But this check can be bypassed on
case-insensitive file systems (ex. Windows, macOS, and Linux with obscure ext4
options enabled). This allows a user with `read-fs` permissions but not
`alloc-exec` permissions to read from the secrets dir.

This changeset updates the check so that it's case-insensitive. This risks false
positives for escape (see linked Go issue), but only if a task without
filesystem isolation deliberately writes into the task working directory to do
so, which is a fail-safe failure mode.
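
A rough sketch of a case-insensitive containment check, assuming slash-separated relative paths. The function name and directory layout are illustrative, not Nomad's implementation. Lowercasing both sides means `SECRETS` and `secrets` are treated as the same directory, which can over-match on case-sensitive file systems but never under-matches.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// withinDir reports whether path is dir or lies beneath it, ignoring case.
func withinDir(path, dir string) bool {
	p := strings.ToLower(filepath.ToSlash(filepath.Clean(path)))
	d := strings.ToLower(filepath.ToSlash(filepath.Clean(dir)))
	return p == d || strings.HasPrefix(p, d+"/")
}

func main() {
	fmt.Println(withinDir("task/SECRETS/token", "task/secrets"))
	fmt.Println(withinDir("task/local/../Secrets/token", "task/secrets"))
	fmt.Println(withinDir("task/secretsfoo/file", "task/secrets"))
}
```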

Ref: https://github.com/golang/go/issues/18358

Co-authored-by: dduzgun-security <deniz.duzgun@hashicorp.com>
2024-10-03 14:20:24 -04:00
Martijn Vegter
3ecf0d21e2 metrics: introduce client config to include alloc metadata as part of the base labels (#23964) 2024-10-02 10:55:44 -04:00
Juliano Martinez
4a74fda8ce Allow client template config block to be parsed when using json config (#24007)
- Adds tests
- Adds sample test data for parsing hcl and json
- Adds changelog
2024-10-01 15:44:36 -04:00
Piotr Kazmierczak
981ca36049 docker: use official client instead of fsouza/go-dockerclient (#23966)
This PR replaces fsouza/go-dockerclient 3rd party docker client library with
docker's official SDK.

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Seth Hoenig <shoenig@duck.com>
2024-09-26 18:41:44 +02:00
Tim Gross
cc9227b858 template: fix panic in change_mode=script on client restart (#24057)
When we introduced change_mode=script to templates, we passed the driver handle
down into the template manager so we could call its `Exec` method directly. But
the lifecycle of the driver handle is managed by the taskrunner and isn't
available when the template manager is first created. This has led to a series
of patches trying to fixup the behavior (#15915, #15192, #23663, #23917). Part
of the challenge in getting this right is using an interface to avoid the
circular import of the driver handle.

But the taskrunner already has a way to deal with this problem using a "lazy
handle". The other template change modes already use this indirectly through the
`Lifecycle` interface. Change the driver handle `Exec` call in the template
manager to a new `Lifecycle.Exec` call that reuses the existing behavior. This
eliminates the need for the template manager to know anything at all about the
handle state.

Fixes: https://github.com/hashicorp/nomad/issues/24051
2024-09-25 08:59:01 -04:00
Michael Smithhisler
338487c159 fix: add node pool attribute to interpretable values in task env (#24052) 2024-09-24 13:23:16 -04:00
Michael Smithhisler
6b6aa7cc26 identity: adds ability to specify custom filepath for saving workload identities (#24038) 2024-09-23 10:27:00 -04:00
Tim Gross
b7f1800657 fingerprint: update landlock test to accept v4+ APIs (#23979)
The landlock fingerprint test assumes there's no version of the landlock API
>3. Update the test assertion to allow for the current v4 and any future
versions.
2024-09-17 15:07:44 -04:00
Seth Hoenig
51215bf102 deps: update to go-set/v3 and refactor to use custom iterators (#23971)
* deps: update to go-set/v3

* deps: use custom set iterators for looping
2024-09-16 13:40:10 -05:00
Daniel Bennett
5e1fae2856 networking: set alloc NetworkStatus.AddressIPv6 (#23959)
when a CNI result includes an IPv6 address,
set it on the alloc's NetworkStatus for reference.

e.g.:

$ nomad alloc status -json 3dca | jq '.NetworkStatus'
{
  "Address": "172.26.64.14",
  "AddressIPv6": "fd00:a110:c8::b",
  "DNS": null,
  "InterfaceName": "eth0"
}
2024-09-16 10:21:52 -05:00