nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-05 01:45:44 +03:00

Author	SHA1	Message	Date
Michael Smithhisler	4c8257d0c7	client: add once mode to template block (#25922)	2025-05-28 11:45:11 -04:00
Tim Gross	cc9227b858	template: fix panic in change_mode=script on client restart (#24057) When we introduced change_mode=script to templates, we passed the driver handle down into the template manager so we could call its `Exec` method directly. But the lifecycle of the driver handle is managed by the taskrunner and isn't available when the template manager is first created. This has led to a series of patches trying to fix up the behavior (#15915, #15192, #23663, #23917). Part of the challenge in getting this right is using an interface to avoid the circular import of the driver handle. But the taskrunner already has a way to deal with this problem using a "lazy handle". The other template change modes already use this indirectly through the `Lifecycle` interface. Change the driver handle `Exec` call in the template manager to a new `Lifecycle.Exec` call that reuses the existing behavior. This eliminates the need for the template manager to know anything at all about the handle state. Fixes: https://github.com/hashicorp/nomad/issues/24051	2024-09-25 08:59:01 -04:00
Tim Gross	c280891703	template: allow change_mode script to run after client restart (#23663) For templates with `change_mode = "script"`, we set a driver handle in the poststart method, so the template runner can execute the script inside the task. But when the client is restarted and the template contents change during that window, we trigger a change_mode in the prestart method. In that case, the hook will not have the handle and so returns an error trying to run the change mode. We restore the driver handle before we call any prestart hooks, so we can pass that handle in the constructor whenever it's available. In the normal task start case the handle will be empty but also won't be called. The error messages are also misleading, as there's no capabilities check happening here. Update the error messages to match. Fixes: https://github.com/hashicorp/nomad/issues/15851 Ref: https://hashicorp.atlassian.net/browse/NET-9338	2024-07-24 08:29:39 -04:00
Tim Gross	6d58acd897	WI: ensure tasks within same alloc get different Consul tokens (#20411) The `consul_hook` in the allocrunner gets a separate Consul token for each task, even if the tasks' identities have the same name, but used the identity name as the key to the alloc hook resources map. This means the last task in the group overwrites the Consul tokens of all other tasks. Fix this by adding the task name to the key in the allocrunner's `consul_hook`. And update the taskrunner's `consul_hook` to expect the task name in the key. Fixes: https://github.com/hashicorp/nomad/issues/20374 Fixes: https://hashicorp.atlassian.net/browse/NOMAD-614	2024-04-17 11:29:58 -04:00
Tim Gross	df86503349	template: sandbox template rendering The Nomad client renders templates in the same privileged process used for most other client operations. During internal testing, we discovered that a malicious task can create a symlink that can cause template rendering to read and write to arbitrary files outside the allocation sandbox. Because the Nomad agent can be restarted without restarting tasks, we can't simply check that the path is safe at the time we write without encountering a time-of-check/time-of-use race. To protect Nomad client hosts from this attack, we'll now read and write templates in a subprocess: * On Linux/Unix, this subprocess is sandboxed via chroot to the allocation directory. This requires that Nomad is running as a privileged process. A non-root Nomad agent will warn that it cannot sandbox the template renderer. * On Windows, this process is sandboxed via a Windows AppContainer which has been granted access only to the allocation directory. This does not require special privileges on Windows. (Creating symlinks in the first place can be prevented by running workloads as non-Administrator or non-ContainerAdministrator users.) Both sandboxes cause encountered symlinks to be evaluated in the context of the sandbox, which will result in a "file not found" or "access denied" error, depending on the platform. This change will also require an update to Consul-Template to allow callers to inject a custom `ReaderFunc` and `RenderFunc`. This design is intended as a workaround to allow us to fix this bug without creating backwards compatibility issues for running tasks. A future version of Nomad may introduce a read-only mount specifically for templates and artifacts so that tasks cannot write into the same location that the Nomad agent is. Fixes: https://github.com/hashicorp/nomad/issues/19888 Fixes: CVE-2024-1329	2024-02-08 10:40:24 -05:00
Luiz Aoqui	0bc822db40	vault: load default config for tasks without vault (#19439) It is often expected that a task that needs access to Vault defines a `vault` block to specify the Vault policy to use to derive a token. But in some scenarios, like when the Nomad client is connected to a local Vault agent that is responsible for authn/authz, the task is not required to define a `vault` block. In these situations, the `default` Vault cluster should be used to render the template.	2023-12-12 14:06:55 -05:00
Luiz Aoqui	f0acf72ae7	client: fix Consul token retrieval for templates (#19058) The template hook must use the Consul token for the cluster defined in the task-level `consul` block or, if `nil`, in the group-level `consul` block. The Consul tokens are generated by the allocrunner consul hook, but during the transition period we must fall back to the Nomad agent token if workload identities are not being used. So an empty token returned from `GetConsulTokens()` is not enough to determine if we should use the legacy flow (either this is an old task or the cluster is not configured for Consul WI), or if there is a misconfiguration (the task or group `consul` block is using a cluster that doesn't have an `identity` set). In order to distinguish between the two scenarios we must iterate over the task identities looking for one suitable for the Consul cluster being used.	2023-11-10 13:42:30 -05:00
Tim Gross	c7c3b3ae33	revoke Consul tokens obtained via WI when alloc stops (#19034) Add a `Postrun` and `Destroy` hook to the allocrunner's `consul_hook` to ensure that Consul tokens we've created via WI get revoked via the logout API when we're done with them. Also add the logout to the `Prerun` hook if we've hit an error.	2023-11-09 10:08:09 -05:00
Tim Gross	483e78615d	template: fix test assertion to be compatible between CE/ENT (#18957) The template hook emits an error when the task has a Consul block that requires WI but there's no WI. The exact error message we get depends on whether we're running in CE or ENT. Update the test assertion so that we can tolerate this difference without building ENT-specific test files.	2023-11-01 13:26:45 -04:00
Tim Gross	dd62e8a319	consul/vault: use accessor method to get cluster name in client (#18955) When looking up the Consul or Vault cluster from a client hook, we should always use an accessor function rather than trying to look up the `Cluster` field, which may be empty for jobs registered before Nomad 1.7.	2023-11-01 10:59:59 -04:00
Luiz Aoqui	349c032369	vault: update task runner vault hook to support workload identity (#18534)	2023-10-16 19:37:57 -04:00
Piotr Kazmierczak	299f3bf74b	client: use WI-issued consul tokens in the template_hook (#18752) ref https://github.com/hashicorp/team-nomad/issues/404	2023-10-16 09:39:20 +02:00

12 Commits
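
The key collision described in commit 6d58acd897 (tokens stored under the identity name alone, so the last task in the group overwrites the others) can be illustrated with a small sketch. This is not Nomad's code: the `task` type, the `collectTokens` helper, the two key functions, and the `identity/task` key format are all hypothetical names chosen for illustration; only the idea of adding the task name to the map key comes from the commit message.

```go
package main

import "fmt"

// task is a hypothetical, simplified stand-in for a task that holds
// one Consul identity and the token derived for it.
type task struct {
	Name         string
	IdentityName string
	Token        string
}

// keyByIdentity mimics the buggy behavior: the map key is the identity
// name only, so tasks sharing an identity name collide.
func keyByIdentity(t task) string { return t.IdentityName }

// keyByIdentityAndTask mimics the fix: the task name is part of the key.
// The "identity/task" format is an assumption, not Nomad's actual format.
func keyByIdentityAndTask(t task) string { return t.IdentityName + "/" + t.Name }

// collectTokens stores each task's token under the key produced by key.
func collectTokens(tasks []task, key func(task) string) map[string]string {
	out := make(map[string]string, len(tasks))
	for _, t := range tasks {
		out[key(t)] = t.Token
	}
	return out
}

func main() {
	tasks := []task{
		{Name: "web", IdentityName: "consul_default", Token: "token-web"},
		{Name: "db", IdentityName: "consul_default", Token: "token-db"},
	}

	// Identity-only key: the second task overwrites the first.
	fmt.Println(len(collectTokens(tasks, keyByIdentity))) // prints 1

	// Identity plus task name: each task keeps its own token.
	fmt.Println(len(collectTokens(tasks, keyByIdentityAndTask))) // prints 2
}
```

With the identity-only key, both tasks end up sharing whichever token was written last; including the task name keeps one entry per task.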