nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-02 16:35:44 +03:00

Author	SHA1	Message	Date
Tim Gross	26004c5407	vault: set renew increment to lease duration (#26041 ) When we renew Vault tokens, we use the lease duration to determine how often to renew. But we also set an `increment` value which is never updated from the initial 30s. For periodic tokens this is not a problem because the `increment` field is ignored on renewal. But for non-periodic tokens this prevents the token TTL from being properly incremented. This behavior has been in place since the initial Vault client implementation in #1606 but before the switch to workload identity most (all?) tokens being created were periodic tokens so this was never detected. Fix this bug by updating the request's `increment` field to the lease duration on each renewal. Also switch out a `time.After` call in backoff of the derive token caller with a safe timer so that we don't have to spawn a new goroutine per loop, and have tighter control over when that's GC'd. Ref: https://github.com/hashicorp/nomad/pull/1606 Ref: https://github.com/hashicorp/nomad/issues/25812	2025-06-13 13:50:54 -04:00
Deniz Onur Duzgun	abd0efdd76	sec: remove non-hermetic sprig template functions (#25998 ) * sec:add sprig template functions in denylists * remove explicit set which is no longer needed * go mod tidy * add changelog * better changelog and filtered denylist * go mod tidy with 1.24.4 * edit changelog and remove htpasswd and derive * fix tests * Update client/allocrunner/taskrunner/template/template_test.go Co-authored-by: Tim Gross <tgross@hashicorp.com> * edit changelog --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2025-06-09 13:00:47 -04:00
Michael Smithhisler	4c8257d0c7	client: add once mode to template block (#25922 )	2025-05-28 11:45:11 -04:00
Tim Gross	77c8acb422	telemetry: fix excessive CPU consumption in executor (#25870 ) Collecting metrics from processes is expensive, especially on platforms like Windows. The executor code has a 5s cache of stats to ensure that we don't thrash syscalls on nodes running many allocations. But the timestamp used to calculate TTL of this cache was never being set, so we were always treating it as expired. This causes excess CPU utilization on client nodes. Ensure that when we fill the cache, we set the timestamp. In testing on Windows, this reduces executor CPU overhead by roughly 75%. This changeset includes two other related items: * The `telemetry.publish_allocation_metrics` field correctly prevents a node from publishing metrics, but the stats hook on the taskrunner still collects the metrics, which can be expensive. Thread the configuration value into the stats hook so that we don't collect if `telemetry.publish_allocation_metrics = false`. * The `linuxProcStats` type in the executor's `procstats` package is misnamed as a result of a couple rounds of refactoring. It's used by all task executors, not just Linux. Rename this and move a comment about how Windows processes are listed so that the comment is closer to where the logic is implemented. Fixes: https://github.com/hashicorp/nomad/issues/23323 Fixes: https://hashicorp.atlassian.net/browse/NMD-455	2025-05-19 09:24:13 -04:00
James Rasell	4c4cb2c6ad	agent: Fix misaligned contextual k/v logging arguments. (#25629 ) Arguments passed to hclog log lines should always have an even number to provide the expected k/v output.	2025-04-10 14:40:21 +01:00
Tim Gross	e168548341	provide allocrunner hooks with prebuilt taskenv and fix mutation bugs (#25373 ) Some of our allocrunner hooks require a task environment for interpolating values based on the node or allocation. But several of the hooks accept an already-built environment or builder and then keep that in memory. Both of these retain a copy of all the node attributes and allocation metadata, which balloons memory usage until the allocation is GC'd. While we'd like to look into ways to avoid keeping the allocrunner around entirely (see #25372), for now we can significantly reduce memory usage by creating the task environment on-demand when calling allocrunner methods, rather than persisting it in the allocrunner hooks. In doing so, we uncover two other bugs: * The WID manager, the group service hook, and the checks hook have to interpolate services for specific tasks. They mutated a taskenv builder to do so, but each time they mutate the builder, they write to the same environment map. When a group has multiple tasks, it's possible for one task to set an environment variable that would then be interpolated in the service definition for another task if that task did not have that environment variable. Only the service definition interpolation is impacted. This does not leak env vars across running tasks, as each taskrunner has its own builder. To fix this, we move the `UpdateTask` method off the builder and onto the taskenv as the `WithTask` method. This makes a shallow copy of the taskenv with a deep clone of the environment map used for interpolation, and then overwrites the environment from the task. * The checks hook interpolates Nomad native service checks only on `Prerun` and not on `Update`. This could cause unexpected deregistration and registration of checks during in-place updates. To fix this, we make sure we interpolate in the `Update` method. I also bumped into an incorrectly implemented interface in the CSI hook. I've pulled that and some better guardrails out to https://github.com/hashicorp/nomad/pull/25472. Fixes: https://github.com/hashicorp/nomad/issues/25269 Fixes: https://hashicorp.atlassian.net/browse/NET-12310 Ref: https://github.com/hashicorp/nomad/issues/25372	2025-03-24 12:05:04 -04:00
Daniel Bennett	04db81951f	test: fix go 1.24 test complaints (#25346 ) e.g. Error: nomad/leader_test.go:382:12: non-constant format string in call to (*testing.common).Fatalf	2025-03-11 11:01:39 -05:00
Tim Gross	f3d53e3e2b	CSI: restart task on failing initial probe, instead of killing it (#25307 ) When a CSI plugin is launched, we probe it until the csi_plugin.health_timeout expires (by default 30s). But if the plugin never becomes healthy, we're not restarting the task as documented. Update the plugin supervisor to trigger a restart instead. We still exit the supervisor loop at that point to avoid having the supervisor send probes to a task that isn't running yet. This requires reworking the poststart hook to allow the supervisor loop to be restarted when the task restarts. In doing so, I identified that we weren't respecting the task kill context from the post start hook, which would leave the supervisor running in the window between when a task is killed because it failed and its stop hooks were triggered. Combine the two contexts to make sure we stop the supervisor whichever context gets closed first. Fixes: https://github.com/hashicorp/nomad/issues/25293 Ref: https://hashicorp.atlassian.net/browse/NET-12264	2025-03-07 10:04:59 -05:00
James Rasell	c0eccda4f7	template: Set any Consul token generated by workload identity. (#25309 )	2025-03-07 14:32:02 +00:00
Michael Smithhisler	5c4d0e923d	consul: Remove legacy token based authentication workflow (#25217 )	2025-03-05 15:38:11 -05:00
James Rasell	7268053174	vault: Remove legacy token based authentication workflow. (#25155 ) The legacy workflow for Vault whereby servers were configured using a token to provide authentication to the Vault API has now been removed. This change also removes the workflow where servers were responsible for deriving Vault tokens for Nomad clients. The deprecated Vault config options used by the Nomad agent have all been removed except for "token" which is still in use by the Vault Transit keyring implementation. Job specification authors can no longer use the "vault.policies" parameter and should instead use "vault.role" when not using the default workload identity. --------- Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>	2025-02-28 07:40:02 +00:00
Tim Gross	7b89c0ee28	template: fix client's default retry configuration (#25113 ) In #20165 we fixed a bug where a partially configured `client.template` retry block would set any unset fields to nil instead of their default values. But this patch introduced a regression in the default values, so we were now defaulting to unlimited retries if the retry block was unset. Restore the correct behavior and add better test coverage at both the config parsing and template configuration code. Ref: https://github.com/hashicorp/nomad/pull/20165 Ref: https://github.com/hashicorp/nomad/issues/23305#issuecomment-2643731565	2025-02-14 09:25:41 -05:00
Matt Keeler	833e240597	Upgrade to using hashicorp/go-metrics@v0.5.4 (#24856 ) * Upgrade to using hashicorp/go-metrics@v0.5.4 This also requires bumping the dependencies for: * memberlist * serf * raft * raft-boltdb * (and indirectly hashicorp/mdns due to the memberlist or serf update) Unlike some other HashiCorp products, Nomad's root module is currently expected to be consumed by others. This means that it needs to be treated more like our libraries and upgrade to hashicorp/go-metrics by utilizing its compat packages. This allows those importing the root module to control the metrics module used via build tags.	2025-01-31 15:22:00 -05:00
Michael Smithhisler	47c14ddf28	remove remote task execution code (#24909 )	2025-01-29 08:08:34 -05:00
Gabi	e107d84c78	taskrunner: fix panic when a task that has a dynamic user is recovered (#24739 )	2025-01-27 13:05:55 -05:00
James Rasell	7d48aa2667	client: emit optional telemetry from prerun and prestart hooks. (#24556 ) The Nomad client can now optionally emit telemetry data from the prerun and prestart hooks. This allows operators to monitor and alert on failures and time taken to complete. The new datapoints are: - nomad.client.alloc_hook.prerun.success (counter) - nomad.client.alloc_hook.prerun.failed (counter) - nomad.client.alloc_hook.prerun.elapsed (sample) - nomad.client.task_hook.prestart.success (counter) - nomad.client.task_hook.prestart.failed (counter) - nomad.client.task_hook.prestart.elapsed (sample) The hook execution time is useful to Nomad engineering and will help optimize code where possible and understand job specification impacts on hook performance. Currently only the PreRun and PreStart hooks have telemetry enabled, so we limit the number of new metrics being produced.	2024-12-12 14:43:14 +00:00
Piotr Kazmierczak	3a18f22c18	goflags: go:build linux for tests that won't compile on other platforms (#24559 ) I'm a heavy LSP user and I frequently goto:next_error. This confuses my editor on macOS.	2024-11-28 15:05:00 +01:00
James Rasell	beb4097e81	client: mark the remote_task hook as deprecated. (#24505 )	2024-11-20 15:32:50 +00:00
Tim Gross	a420732424	consul: allow non-root Nomad to rewrite token (#24410 ) When a task restarts, the Nomad client may need to rewrite the Consul token, but it's created with permissions that prevent a non-root agent from writing to it. While Nomad clients should be run as root (currently), it's harmless to allow whatever user the Nomad agent is running as to be able to write to it, and that's one less barrier to rootless Nomad. Ref: https://github.com/hashicorp/nomad/issues/23859#issuecomment-2465757392	2024-11-19 10:21:14 -05:00
Michael Smithhisler	0714353324	fix: handle template re-renders on client restart (#24399 ) When multiple templates with api functions are included in a task, it's possible for consul-template to re-render templates as it creates watchers, overwriting render event data. This change uses event fields that do not get overwritten, and only executes the change mode for templates that were actually written to disk. --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2024-11-08 12:49:38 -05:00
Seth Hoenig	f1ce127524	jobspec: add a chown option to artifact block (#24157 ) * jobspec: add a chown option to artifact block This PR adds a boolean 'chown' field to the artifact block. It indicates whether the Nomad client should chown the downloaded files and directories to be owned by the task.user. This is useful for drivers like raw_exec and exec2 which are subject to the host filesystem user permissions structure. Before, these drivers might not be able to use or manage the downloaded artifacts since they would be owned by the root user on a typical Nomad client configuration. * api: no need for pointer of chown field	2024-10-11 11:30:27 -05:00
Martijn Vegter	3ecf0d21e2	metrics: introduce client config to include alloc metadata as part of the base labels (#23964 )	2024-10-02 10:55:44 -04:00
Tim Gross	cc9227b858	template: fix panic in change_mode=script on client restart (#24057 ) When we introduced change_mode=script to templates, we passed the driver handle down into the template manager so we could call its `Exec` method directly. But the lifecycle of the driver handle is managed by the taskrunner and isn't available when the template manager is first created. This has led to a series of patches trying to fixup the behavior (#15915, #15192, #23663, #23917). Part of the challenge in getting this right is using an interface to avoid the circular import of the driver handle. But the taskrunner already has a way to deal with this problem using a "lazy handle". The other template change modes already use this indirectly through the `Lifecycle` interface. Change the driver handle `Exec` call in the template manager to a new `Lifecycle.Exec` call that reuses the existing behavior. This eliminates the need for the template manager to know anything at all about the handle state. Fixes: https://github.com/hashicorp/nomad/issues/24051	2024-09-25 08:59:01 -04:00
Michael Smithhisler	6b6aa7cc26	identity: adds ability to specify custom filepath for saving workload identities (#24038 )	2024-09-23 10:27:00 -04:00
Tim Gross	07aca67108	template: lock task handle before trying script check (#23917 ) In #23663 we fixed the template hook so that `change_mode="script"` didn't lose track of the task handle during restores. But this revealed a second bug which is that access to the handle is not locked while in use, which can allow it to be removed concurrently. Fixes: https://github.com/hashicorp/nomad/issues/23875	2024-09-12 08:41:06 -04:00
Tim Gross	b25f1b66ce	resources: allow job authors to configure size of secrets tmpfs (#23696 ) On supported platforms, the secrets directory is a 1MiB tmpfs. But some tasks need larger space for downloading large secrets. This is especially the case for tasks using `templates`, which need extra room to write a temporary file to the secrets directory that gets renamed to the old file atomically. This changeset allows increasing the size of the tmpfs in the `resources` block. Because this is a memory resource, we need to include it in the memory we allocate for scheduling purposes. The task is already prevented from using more memory in the tmpfs than the `resources.memory` field allows, but can bypass that limit by writing to the tmpfs via `template` or `artifact` blocks. Therefore, we need to account for the size of the tmpfs in the allocation resources. Simply adding it to the memory needed when we create the allocation allows it to be accounted for in all downstream consumers, and then we'll subtract that amount from the memory resources just before configuring the task driver. For backwards compatibility, the default value of 1MiB is "free" and ignored by the scheduler. Otherwise we'd be increasing the allocated resources for every existing alloc, which could cause problems across upgrades. If a user explicitly sets `resources.secrets = 1` it will no longer be free. Fixes: https://github.com/hashicorp/nomad/issues/2481 Ref: https://hashicorp.atlassian.net/browse/NET-10070	2024-08-05 16:06:58 -04:00
Tim Gross	c280891703	template: allow change_mode script to run after client restart (#23663 ) For templates with `change_mode = "script"`, we set a driver handle in the poststart method, so the template runner can execute the script inside the task. But when the client is restarted and the template contents change during that window, we trigger a change_mode in the prestart method. In that case, the hook will not have the handle and so returns an error trying to run the change mode. We restore the driver handle before we call any prestart hooks, so we can pass that handle in the constructor whenever it's available. In the normal task start case the handle will be empty but also won't be called. The error messages are also misleading, as there's no capabilities check happening here. Update the error messages to match. Fixes: https://github.com/hashicorp/nomad/issues/15851 Ref: https://hashicorp.atlassian.net/browse/NET-9338	2024-07-24 08:29:39 -04:00
Piotr Kazmierczak	356ea87e00	template: disable sandboxed rendering on Windows (#23432 ) Following #23443, we no longer need to sandbox template rendering on Windows.	2024-06-28 17:16:27 +02:00
Tim Gross	df67e74615	Consul: add preflight checks for Envoy bootstrap (#23381 ) Nomad creates Consul ACL tokens and service registrations to support Consul service mesh workloads, before bootstrapping the Envoy proxy. Nomad always talks to the local Consul agent and never directly to the Consul servers. But the local Consul agent talks to the Consul servers in stale consistency mode to reduce load on the servers. This can result in the Nomad client making the Envoy bootstrap request with tokens or services that have not yet replicated to the follower that the local client is connected to. This request gets a 404 on the ACL token and that negative entry gets cached, preventing any retries from succeeding. To work around this, we'll use a method described by our friends over on `consul-k8s` where after creating the objects in Consul we try to read them from the local agent in stale consistency mode (which prevents a failed read from being cached). This cannot completely eliminate this source of error because it's possible that Consul cluster replication is unhealthy at the time we need it, but this should make Envoy bootstrap significantly more robust. This changeset adds preflight checks for the objects we create in Consul: * We add a preflight check for ACL tokens after we login via Workload Identity and in the function we use to derive tokens in the legacy workflow. We do this check early because we also want to use this token for registering group services in the allocrunner hooks. * We add a preflight check for services right before we bootstrap Envoy in the taskrunner hook, so that we have time for our service client to batch updates to the local Consul agent in addition to the local agent sync. We've added the timeouts to be configurable via node metadata rather than the usual static configuration because for most cases, users should not need to touch or even know these values are configurable; the configuration is mostly available for testing. Fixes: https://github.com/hashicorp/nomad/issues/9307 Fixes: https://github.com/hashicorp/nomad/issues/10451 Fixes: https://github.com/hashicorp/nomad/issues/20516 Ref: https://github.com/hashicorp/consul-k8s/pull/887 Ref: https://hashicorp.atlassian.net/browse/NET-10051 Ref: https://hashicorp.atlassian.net/browse/NET-9273 Follow-up: https://hashicorp.atlassian.net/browse/NET-10138	2024-06-27 10:15:37 -04:00
Daniel Bennett	4415fabe7d	jobspec: time based task execution (#22201 ) this is the CE side of an Enterprise-only feature. a job trying to use this in CE will fail to validate. to enable daily-scheduled execution entirely client-side, a job may now contain: task "name" { schedule { cron { start = "0 12 * * * *" # may not include "," or "/" end = "0 16" # partial cron, with only {minute} {hour} timezone = "EST" # anything in your tzdb } } ... and everything about the allocation will be placed as usual, but if outside the specified schedule, the taskrunner will block on the client, waiting on the schedule start, before proceeding with the task driver execution, etc. this includes a taskrunner hook, which watches for the end of the schedule, at which point it will kill the task. then, restarts-allowing, a new task will start and again block waiting for start, and so on. this also includes all the plumbing required to pipe API calls through from command->api->agent->server->client, so that tasks can be force-run, force-paused, or resume the schedule on demand.	2024-05-22 15:40:25 -05:00
James Rasell	04ba358266	client: expose network namespace CNI config as task env vars. (#11810 ) This change exposes CNI configuration details of a network namespace as environment variables. This allows a task to use these value to configure itself; a potential use case is to run a Raft application binding to IP and Port details configured using the bridge network mode.	2024-05-14 09:02:06 +01:00
Seth Hoenig	14a022cbc0	drivers/raw_exec: enable setting cgroup override values (#20481 ) * drivers/raw_exec: enable setting cgroup override values This PR enables configuration of cgroup override values on the `raw_exec` task driver. WARNING: setting cgroup override values eliminates any guarantee Nomad can make about resource availability for any task on the client node. For cgroup v2 systems, set a single unified cgroup path using `cgroup_v2_override`. The path may be either absolute or relative to the cgroup root. config { cgroup_v2_override = "custom.slice/app.scope" } or config { cgroup_v2_override = "/sys/fs/cgroup/custom.slice/app.scope" } For cgroup v1 systems, set a per-controller path for each controller using `cgroup_v1_override`. The path(s) may be either absolute or relative to the controller root. config { cgroup_v1_override = { "pids": "custom/app", "cpuset": "custom/app", } } or config { cgroup_v1_override = { "pids": "/sys/fs/cgroup/pids/custom/app", "cpuset": "/sys/fs/cgroup/cpuset/custom/app", } } * drivers/rawexec: ensure only one of v1/v2 cgroup override is set * drivers/raw_exec: executor should error if setting cgroup does not work * drivers/raw_exec: create cgroups in raw_exec tests * drivers/raw_exec: ensure we fail to start if custom cgroup set and non-root * move custom cgroup func into shared file --------- Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2024-05-07 16:46:27 -07:00
Tim Gross	f41bc468eb	consul: provide `CONSUL_HTTP_TOKEN` env var to tasks (#20519 ) When available, we provide an environment variable `CONSUL_TOKEN` to tasks, but this isn't the environment variable expected by the Consul CLI. Job specifications like deploying an API Gateway become noticeably nicer if we can instead provide the expected env var.	2024-05-03 11:30:33 -04:00
Seth Hoenig	5f64e42d73	client: fixup how alloc mounts directory are setup (#20463 )	2024-04-26 07:29:52 -05:00
Tim Gross	6d58acd897	WI: ensure tasks within same alloc get different Consul tokens (#20411 ) The `consul_hook` in the allocrunner gets a separate Consul token for each task, even if the tasks' identities have the same name, but used the identity name as the key to the alloc hook resources map. This means the last task in the group overwrites the Consul tokens of all other tasks. Fix this by adding the task name to the key in the allocrunner's `consul_hook`. And update the taskrunner's `consul_hook` to expect the task name in the key. Fixes: https://github.com/hashicorp/nomad/issues/20374 Fixes: https://hashicorp.atlassian.net/browse/NOMAD-614	2024-04-17 11:29:58 -04:00
Seth Hoenig	ae6c4c8e3f	deps: purge use of old x/exp packages (#20373 )	2024-04-12 08:29:00 -05:00
astudentofblake	7b7ed12326	func: Allow custom paths to be added to the getter landlock (#20349 ) * func: Allow custom paths to be added to the getter landlock Fixes: 20315 * fix: slices imports fix: more meaningful examples fix: improve documentation fix: quote error output	2024-04-11 15:17:33 -05:00
Tim Gross	d56e8ad1aa	WI: ensure Consul hook and WID manager interpolate services (#20344 ) Services can have some of their string fields interpolated. The new Workload Identity flow doesn't interpolate the services before requesting signed identities or using those identities to get Consul tokens. Add support for interpolation to the WID manager and the Consul tokens hook by providing both with a taskenv builder. Add an "interpolate workload" field to the WI handle to allow passing the original workload name to the server so the server can find the correct service to sign. This changeset also makes two related test improvements: * Remove the mock WID manager, which was only used in the Consul hook tests and isn't necessary so long as we provide the real WID manager with the mock signer and never call `Run` on it. It wasn't feasible to exercise the correct behavior without this refactor, as the mocks were bypassing the new code. * Fixed swapped expect-vs-actual assertions on the `consul_hook` tests. Fixes: https://github.com/hashicorp/nomad/issues/20025	2024-04-11 15:40:28 -04:00
Tim Gross	15162917c1	cni: fix regression in falling back to DNS owned by `dockerd` (#20189 ) In #20007 we fixed a bug where the DNS configuration set by CNI plugins was not threaded through to the task configuration. This resulted in a regression where a DNS override set by `dockerd` was not respected for `bridge` mode networking. Our existing handling of CNI DNS incorrectly assumed that the DNS field would be empty, when in fact it contains a single empty DNS struct. Handle this case correctly by checking whether the DNS struct we get back from CNI has any nameservers, and ignore it if it doesn't. Expand test coverage of this case. Fixes: https://github.com/hashicorp/nomad/issues/20174	2024-03-22 10:54:16 -04:00
Tim Gross	13617eee4b	template: improve internal documentation around shutdown (#20134 ) While investigating a report around possible consul-template shutdown issues, which didn't bear fruit, I found that some of the logic around template runner shutdown is unintuitive. * Add some doc strings to the places where someone might think we should be obviously stopping the runner or returning early. * Mark context argument for `Poststart`, `Stop`, and `Update` hooks as unused. No functional code changes.	2024-03-14 15:33:32 -04:00
Amir Abbas	40b8f17717	Support insecure flag on artifact (#20126 )	2024-03-14 10:59:20 -05:00
Seth Hoenig	bb54d16e4a	exec2: setup RPC plumbing for dynamic workload users (#20129 ) And pass the dynamic users pool from the client into the hook.	2024-03-13 14:06:52 -05:00
Seth Hoenig	05937ab75b	exec2: add client support for unveil filesystem isolation mode (#20115 ) * exec2: add client support for unveil filesystem isolation mode This PR adds support for a new filesystem isolation mode, "Unveil". The mode introduces an "alloc_mounts" directory where tasks have a user-owned directory structure made of bind mounts into the real alloc directory structure. This enables a task driver to use landlock (and maybe the real unveil on openbsd one day) to isolate a task to the task owned directory structure, providing sandboxing. * actually create alloc-mounts-dir directory * fix doc strings about alloc mount dir paths	2024-03-13 08:24:17 -05:00
Seth Hoenig	67554b8f91	exec2: implement dynamic workload users taskrunner hook (#20069 ) * exec2: implement dynamic workload users taskrunner hook This PR implements a TR hook for allocating dynamic workload users from a pool managed by the Nomad client. This adds a new task driver Capability, DynamicWorkloadUsers - which a task driver must indicate in order to make use of this feature. The client config plumbing is coming in a followup PR - in the RFC we realized having a client.users block would be nice to have, with some additional unrelated options being moved from the deprecated client.options config. * learn to spell	2024-03-06 09:34:27 -06:00
Tim Gross	45b2c34532	cni: add DNS set by CNI plugins to task configuration (#20007 ) CNI plugins may set DNS configuration, but this isn't threaded through to the task configuration so that we can write it to the `/etc/resolv.conf` file as needed. Add the `AllocNetworkStatus` to the alloc hook resources so they're accessible from the taskrunner. Any DNS entries provided by the user will override these values. Fixes: https://github.com/hashicorp/nomad/issues/11102	2024-02-20 10:17:27 -05:00
Tim Gross	df86503349	template: sandbox template rendering The Nomad client renders templates in the same privileged process used for most other client operations. During internal testing, we discovered that a malicious task can create a symlink that can cause template rendering to read and write to arbitrary files outside the allocation sandbox. Because the Nomad agent can be restarted without restarting tasks, we can't simply check that the path is safe at the time we write without encountering a time-of-check/time-of-use race. To protect Nomad client hosts from this attack, we'll now read and write templates in a subprocess: * On Linux/Unix, this subprocess is sandboxed via chroot to the allocation directory. This requires that Nomad is running as a privileged process. A non-root Nomad agent will warn that it cannot sandbox the template renderer. * On Windows, this process is sandboxed via a Windows AppContainer which has been granted access only to the allocation directory. This does not require special privileges on Windows. (Creating symlinks in the first place can be prevented by running workloads as non-Administrator or non-ContainerAdministrator users.) Both sandboxes cause encountered symlinks to be evaluated in the context of the sandbox, which will result in a "file not found" or "access denied" error, depending on the platform. This change will also require an update to Consul-Template to allow callers to inject a custom `ReaderFunc` and `RenderFunc`. This design is intended as a workaround to allow us to fix this bug without creating backwards compatibility issues for running tasks. A future version of Nomad may introduce a read-only mount specifically for templates and artifacts so that tasks cannot write into the same location that the Nomad agent is. Fixes: https://github.com/hashicorp/nomad/issues/19888 Fixes: CVE-2024-1329	2024-02-08 10:40:24 -05:00
Juana De La Cuesta	120c3ca3c9	Add granular control of SELinux labels for host mounts (#19839 ) Add new configuration option on task's volume_mounts, to give a fine grained control over SELinux "z" label * Update website/content/docs/job-specification/volume_mount.mdx Co-authored-by: Luiz Aoqui <luiz@hashicorp.com> * fix: typo * func: make volume mount verification happen even on mounts with no volume --------- Co-authored-by: Luiz Aoqui <luiz@hashicorp.com> Co-authored-by: Tim Gross <tgross@hashicorp.com>	2024-02-05 10:05:33 +01:00
Tim Gross	334c383eb6	template: run template tests on Windows where possible (#19856 ) We don't run the whole suite of unit tests on all platforms to keep CI times reasonable, so the only things we've been running on Windows are platform-specific. I'm working on some platform-specific `template` related work and having these tests run on Windows will reduce the risk of regressions. Our Windows CI box doesn't have Consul or Vault, so I've skipped those tests for the time being, and can follow up with that later. There's also a test with assertions looking for specific paths, and the results are different on Windows. I've skipped those for the moment as well and will follow up under a separate PR. Also swap `testify` for `shoenig/test`	2024-02-02 09:22:03 -05:00
Michael Schurter	8f564182ef	connect: rewrite envoy bootstrap on every restart (#19787 ) Fixes #19781 Do not mark the envoy bootstrap hook as done after successfully running once. Since the bootstrap file is written to /secrets, which is a tmpfs on supported platforms, it is not persisted across reboots. This causes the task and allocation to fail on reboot (see #19781). This fixes it by always rewriting the envoy bootstrap file every time the Nomad agent starts. This does mean we may write a new bootstrap file to an already running Envoy task, but in my testing that doesn't have any impact. This commit doesn't necessarily fix every use of Done by hooks, but hopefully improves the situation. The comment on Done has been expanded to hopefully avoid misuse in the future. Done assertions were removed from tests as they add more noise than value. Alternative 1: Use a regular file An alternative approach would be to write the bootstrap file somewhere other than the tmpfs, but this is unsafe as when Consul ACLs are enabled the file will contain a secret token: https://developer.hashicorp.com/consul/commands/connect/envoy#bootstrap Alternative 2: Detect if file is already written An alternative approach would be to detect if the bootstrap file exists, and only write it if it doesn't. This is just a more complicated form of the current fix. I think in general in the absence of other factors task hooks should be idempotent and therefore able to rerun on any agent startup. This simplifies the code and our ability to reason about task restarts vs agent restarts vs node reboots by making them all take the same code path.	2024-01-24 11:26:31 -08:00
Tim Gross	0935f443dc	vault: support allowing tokens to expire without refresh (#19691 ) Some users with batch workloads or short-lived prestart tasks want to derive a Vault token, use it, and then allow it to expire without requiring a constant refresh. Add the `vault.allow_token_expiration` field, which works only with the Workload Identity workflow and not the legacy workflow. When set to true, this disables the client's renewal loop in the `vault_hook`. When Vault revokes the token lease, the token will no longer be valid. The client will also now automatically detect if the Vault auth configuration does not allow renewals and will disable the renewal loop automatically. Note this should only be used when a secret is requested from Vault once at the start of a task or in a short-lived prestart task. Long-running tasks should never set `allow_token_expiration=true` if they obtain Vault secrets via `template` blocks, as the Vault token will expire and the template runner will continue to make failing requests to Vault until the `vault_retry` attempts are exhausted. Fixes: https://github.com/hashicorp/nomad/issues/8690	2024-01-10 14:49:02 -05:00

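
Note on commit e168548341 (#25373): the message describes replacing a mutating `UpdateTask` on a shared builder with a `WithTask` method that makes a shallow copy of the task environment plus a clone of the environment map. The following Go sketch shows why that prevents one task's variables from appearing in another task's interpolation. The `TaskEnv` struct and its fields here are simplified assumptions, not the actual Nomad `taskenv` types.

```go
package main

import (
	"fmt"
	"maps"
)

// TaskEnv is a simplified, hypothetical stand-in for a task environment.
type TaskEnv struct {
	NodeAttrs map[string]string // shared and treated as read-only
	EnvMap    map[string]string // used for interpolation
}

// WithTask returns a copy of the environment with the task's variables
// applied. The struct copy is shallow, but EnvMap is cloned so that writes
// for one task never reach the base environment or a sibling task.
func (e *TaskEnv) WithTask(taskVars map[string]string) *TaskEnv {
	out := *e
	out.EnvMap = maps.Clone(e.EnvMap)
	if out.EnvMap == nil {
		out.EnvMap = make(map[string]string, len(taskVars))
	}
	for k, v := range taskVars {
		out.EnvMap[k] = v
	}
	return &out
}

func main() {
	base := &TaskEnv{EnvMap: map[string]string{"NOMAD_GROUP_NAME": "api"}}

	web := base.WithTask(map[string]string{"PORT": "8080"})
	db := base.WithTask(map[string]string{"DB_NAME": "orders"})

	_, leaked := db.EnvMap["PORT"]
	fmt.Println(web.EnvMap["PORT"], leaked, len(base.EnvMap)) // prints "8080 false 1"
}
```

With a shared mutable map, the `db` environment would have inherited `PORT` from the earlier `web` call, which is the cross-task interpolation problem the commit message describes.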

617 Commits