nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-03 08:55:43 +03:00

Author	SHA1	Message	Date
tehut	27b1d470a8	modify rawexec TaskConfig and Config to accept envvar denylist (#25511 ) * modify rawexec TaskConfig and Config to accept envvar denylist * update rawexec driver docs to include deniedEnvars options Co-authored-by: Daniel Bennett <dbennett@hashicorp.com> --------- Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>	2025-04-02 12:25:28 -07:00
Piotr Kazmierczak	cb8f4ea452	drivers: set -1 exit code in case executor gets killed (#25453 ) Nomad driver handles incorrectly set exit code 0 in case of executor failure. This corrects that behavior. --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2025-03-20 15:06:39 +01:00
Jorge Marey	25426f0777	fingerprint: add config option to disable dmidecode (#25108 )	2025-02-13 11:20:48 -05:00
Michael Smithhisler	0f97574eae	test: fix rawexec driver unix test imports (#24352 )	2024-11-01 12:10:03 -04:00
Michael Smithhisler	658c429d75	Drivers: add work_dir config to exec/raw_exec/java drivers (#24249 ) --------- Co-authored-by: wurosh <uros.m.perisic@gmail.com> Co-authored-by: Michael Schurter <mschurter@hashicorp.com> Co-authored-by: Tim Gross <tgross@hashicorp.com>	2024-11-01 11:04:40 -04:00
Juanadelacuesta	8752bb0a65	func: move the user lookup into the validation, it's used everywhere the function is called	2024-10-31 10:34:26 +01:00
Juanadelacuesta	3f884bb3fa	fix: remove the setConfig and modify the test driver to include idValidator to avoid panics	2024-10-30 17:38:54 +01:00
Juanadelacuesta	a86e951f03	style: rename DeniedHostGidsStr to reflect refactor	2024-10-30 15:22:50 +01:00
Juanadelacuesta	a90eda628d	func: implement mock validator to avoid changes on the rawexec tests	2024-10-30 15:07:47 +01:00
Juanadelacuesta	088417163b	fix: add set config to populate idValidator on tests	2024-10-30 13:40:19 +01:00
Juanadelacuesta	f707a02f4d	fix: update test to force recreation of idvalidator	2024-10-30 12:28:59 +01:00
Juanadelacuesta	bba0407250	style: remove unused code and duplicated test	2024-10-30 11:43:04 +01:00
Juanadelacuesta	e1a0c7cb43	fix: move exclusive unix test back from driver tests	2024-10-30 11:22:41 +01:00
Juanadelacuesta	1751b618e4	func: Add conditional to validation init, to allow for easy testing	2024-10-29 16:45:33 +01:00
Juanadelacuesta	a9a452341c	license: update headers to BUSL	2024-10-29 15:54:09 +01:00
Juanadelacuesta	0227788e22	fix: update tests configuration	2024-10-29 15:24:12 +01:00
Juanadelacuesta	0cd1b5ff13	func: move the validation to a dependency and use id sets	2024-10-28 18:59:51 +01:00
Juanadelacuesta	65be613be9	fix: rename test to avoid conflict	2024-10-28 12:17:57 +01:00
Juanadelacuesta	d77dc7dfa4	style: format	2024-10-28 11:46:51 +01:00
Juanadelacuesta	ed04b1bf64	style: remove print	2024-10-28 11:35:03 +01:00
Mike Nomitch	ff5ab3776c	Tweaking user lookup code	2024-10-28 11:15:41 +01:00
Mike Nomitch	e1c226e633	Restructuring IDRange	2024-10-28 11:15:41 +01:00
Mike Nomitch	0fbf592131	moving user out of validators	2024-10-28 11:15:41 +01:00
Mike Nomitch	916af5a948	Moving idrange struct location	2024-10-28 11:15:41 +01:00
Mike Nomitch	9565dde138	Only parsing id ranges once	2024-10-28 11:15:41 +01:00
Mike Nomitch	6b6a1b5bc4	Fixed windows build error	2024-10-28 11:15:41 +01:00
Mike Nomitch	9cc3992ca6	Adds ability to restrict uid and gids in exec and raw_exec	2024-10-28 11:15:37 +01:00
Tim Gross	6b8ddff1fa	windows: set job object for executor and children (#24214 ) On Windows, if the `raw_exec` driver's executor exits, the child processes are not also killed. Create a Windows "job object" (not to be confused with a Nomad job) and add the executor to it. Child processes of the executor will inherit the job automatically. When the handle to the job object is freed (on executor exit), the job itself is destroyed and this causes all processes in that job to exit. Fixes: https://github.com/hashicorp/nomad/issues/23668 Ref: https://learn.microsoft.com/en-us/windows/win32/procthread/job-objects	2024-10-16 09:20:26 -04:00
Piotr Kazmierczak	85430be6dd	raw_exec: oom_score_adj support (#23308 )	2024-06-14 11:36:27 +02:00
Seth Hoenig	14a022cbc0	drivers/raw_exec: enable setting cgroup override values (#20481) * drivers/raw_exec: enable setting cgroup override values This PR enables configuration of cgroup override values on the `raw_exec` task driver. WARNING: setting cgroup override values eliminates any guarantee Nomad can make about resource availability for any task on the client node. For cgroup v2 systems, set a single unified cgroup path using `cgroup_v2_override`. The path may be either absolute or relative to the cgroup root. config { cgroup_v2_override = "custom.slice/app.scope" } or config { cgroup_v2_override = "/sys/fs/cgroup/custom.slice/app.scope" } For cgroup v1 systems, set a per-controller path for each controller using `cgroup_v1_override`. The path(s) may be either absolute or relative to the controller root. config { cgroup_v1_override = { "pids": "custom/app", "cpuset": "custom/app", } } or config { cgroup_v1_override = { "pids": "/sys/fs/cgroup/pids/custom/app", "cpuset": "/sys/fs/cgroup/cpuset/custom/app", } } * drivers/rawexec: ensure only one of v1/v2 cgroup override is set * drivers/raw_exec: executor should error if setting cgroup does not work * drivers/raw_exec: create cgroups in raw_exec tests * drivers/raw_exec: ensure we fail to start if custom cgroup set and non-root * move custom cgroup func into shared file --------- Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2024-05-07 16:46:27 -07:00
Seth Hoenig	05937ab75b	exec2: add client support for unveil filesystem isolation mode (#20115) * exec2: add client support for unveil filesystem isolation mode This PR adds support for a new filesystem isolation mode, "Unveil". The mode introduces an "alloc_mounts" directory where tasks have a user-owned directory structure whose entries are bind mounts into the real alloc directory structure. This enables a task driver to use landlock (and maybe the real unveil on openbsd one day) to isolate a task to the task owned directory structure, providing sandboxing. * actually create alloc-mounts-dir directory * fix doc strings about alloc mount dir paths	2024-03-13 08:24:17 -05:00
James Rasell	10324566ae	driver/rawexec: populate OOM killed exit result. (#19829 )	2024-01-29 08:54:52 +00:00
Seth Hoenig	9410c519ff	drivers/raw_exec: remove plumbing for ineffective no_cgroups configuration (#19599 ) * drivers/raw_exec: remove plumbing for ineffective no_cgroups configuration * fix tests	2024-01-11 08:20:15 -06:00
Seth Hoenig	591394fb62	drivers: plumb hardware topology via grpc into drivers (#18504) * drivers: plumb hardware topology via grpc into drivers This PR swaps out the temporary use of detecting system hardware manually in each driver for using the Client's detected topology by plumbing the data over gRPC. This ensures that Client configuration is taken into account consistently in all references to system topology. * cr: use enum instead of bool for core grade * cr: fix test slit tables to be possible	2023-09-18 08:58:07 -05:00
Seth Hoenig	2e1974a574	client: refactor cpuset partitioning (#18371) * client: refactor cpuset partitioning This PR updates the way Nomad client manages the split between tasks that make use of resources.cpus vs. resources.cores. Previously, each task was explicitly assigned which CPU cores they were able to run on. Every time a task was started or destroyed, all other tasks' cpusets would need to be updated. This was inefficient and would crush the Linux kernel when a client would try to run ~400 or so tasks. Now, we make use of cgroup hierarchy and cpuset inheritance to efficiently manage cpusets. * cr: tweaks for feedback	2023-09-12 09:11:11 -05:00
hashicorp-copywrite[bot]	2d35e32ec9	Update copyright file headers to BUSL-1.1	2023-08-10 17:27:15 -05:00
Seth Hoenig	a4cc76bd3e	numa: enable numa topology detection (#18146 ) * client: refactor cgroups management in client * client: fingerprint numa topology * client: plumb numa and cgroups changes to drivers * client: cleanup task resource accounting * client: numa client and config plumbing * lib: add a stack implementation * tools: remove ec2info tool * plugins: fixup testing for cgroups / numa changes * build: update makefile and package tests and cl	2023-08-10 17:05:30 -05:00
hashicorp-copywrite[bot]	f005448366	[COMPLIANCE] Add Copyright and License Headers	2023-04-10 15:36:59 +00:00
Lance Haig	3160c76209	deps: Update ioutil library references to os and io respectively for drivers package (#16331 ) * Update ioutil library references to os and io respectively for drivers package No user facing changes so I assume no change log is required * Fix failing tests	2023-03-08 10:31:09 -06:00
James Rasell	25e7c2ffa4	chore: remove use of "err" as a log line context key for errors. (#14433) Log lines which include an error should use the full term "error" as the context key. This provides consistency across the codebase and avoids a Go style which operators might not be aware of.	2022-09-01 15:06:10 +02:00
Tim Gross	3671ea6a8f	remove pre-0.9 driver code and related E2E test (#12791 ) This test exercises upgrades between 0.8 and Nomad versions greater than 0.9. We have not supported 0.8.x in a very long time and in any case the test has been marked to skip because the downloader doesn't work.	2022-04-27 09:53:37 -04:00
Seth Hoenig	be7ec8de3e	raw_exec: make raw exec driver work with cgroups v2 This PR adds support for the raw_exec driver on systems with only cgroups v2. The raw exec driver is able to use cgroups to manage processes. This happens only on Linux, when exec_driver is enabled, and the no_cgroups option is not set. The driver uses the freezer controller to freeze processes of a task, issue a sigkill, then unfreeze. Previously the implementation assumed cgroups v1, and now it also supports cgroups v2. There is a bit of refactoring in this PR, but the fundamental design remains the same. Closes #12351 #12348	2022-04-04 16:11:38 -05:00
Seth Hoenig	5da1a31e94	client: enable support for cgroups v2 This PR introduces support for using Nomad on systems with cgroups v2 [1] enabled as the cgroups controller mounted on /sys/fs/cgroups. Newer Linux distros like Ubuntu 21.10 are shipping with cgroups v2 only, causing problems for Nomad users. Nomad mostly "just works" with cgroups v2 due to the indirection via libcontainer, but not so for managing cpuset cgroups. Before, Nomad has been making use of a feature in v1 where a PID could be a member of more than one cgroup. In v2 this is no longer possible, and so the logic around computing cpuset values must be modified. When Nomad detects v2, it manages cpuset values in-process, rather than making use of cgroup hierarchy inheritance via shared/reserved parents. Nomad will only activate the v2 logic when it detects cgroups2 is mounted at /sys/fs/cgroups. This means on systems running in hybrid mode with cgroups2 mounted at /sys/fs/cgroups/unified (as is typical) Nomad will continue to use the v1 logic, and should operate as before. Systems that do not support cgroups v2 are also not affected. When v2 is activated, Nomad will create a parent called nomad.slice (unless otherwise configured in Client config), and create cgroups for tasks using naming convention <allocID>-<task>.scope. These follow the naming convention set by systemd and also used by Docker when cgroups v2 is detected. Client nodes now export a new fingerprint attribute, unique.cgroups.version which will be set to 'v1' or 'v2' to indicate the cgroups regime in use by Nomad. The new cpuset management strategy fixes #11705, where docker tasks that spawned processes on startup would "leak". In cgroups v2, the PIDs are started in the cgroup they will always live in, and thus the cause of the leak is eliminated. [1] https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html Closes #11289 Fixes #11705 #11773 #11933	2022-03-23 11:35:27 -05:00
Seth Hoenig	b242957990	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
Mahmood Ali	6c414cd5f9	gofmt all the files mostly to handle build directives in 1.17.	2021-10-01 10:14:28 -04:00
Mahmood Ali	edaa16589b	honor task user when execing into raw_exec task (#9439) Fix #9210. This updates the executor so it honors the User when using nomad alloc exec. The bug was that the exec task didn't honor the init command when execing.	2020-11-25 09:34:10 -05:00
Mahmood Ali	bd745fa3e5	raw_exec: don't use cgroups when no_cgroup is set (#9328 ) When raw_exec is configured with [`no_cgroups`](https://www.nomadproject.io/docs/drivers/raw_exec#no_cgroups), raw_exec shouldn't attempt to create a cgroup. Prior to this change, we accidentally always required freezer cgroup to do stats PID tracking. We already have the proper fallback in place for metrics, so only need to ensure that we don't create a cgroup for the task. Fixes https://github.com/hashicorp/nomad/issues/8565	2020-11-11 16:20:34 -05:00
Mahmood Ali	d6c75e301e	cleanup driver eventor goroutines This fixes a few cases where driver eventor goroutines are leaked during normal operations, but especially so in tests. This change makes a few modifications: First, it switches drivers to use `Context`s to manage shutdown events. Previously, it relied on callers invoking a `.Shutdown()` function that is specific to internal drivers only and requires casting. Using `Context`s provides a consistent idiomatic way to manage lifecycle for both internal and external drivers. Also, I discovered a few places where we don't clean up a temporary driver instance in the plugin catalog code, where we dispense a driver to inspect and validate the schema config without properly cleaning it up.	2020-05-26 11:04:04 -04:00
Tim Gross	8860b72bc3	volumes: return better error messages for unsupported task drivers (#8030 ) When an allocation runs for a task driver that can't support volume mounts, the mounting will fail in a way that can be hard to understand. With host volumes this usually means failing silently, whereas with CSI the operator gets inscrutable internals exposed in the `nomad alloc status`. This changeset adds a MountConfig field to the task driver Capabilities response. We validate this when the `csi_hook` or `volume_hook` fires and return a user-friendly error. Note that we don't currently have a way to get driver capabilities up to the server, except through attributes. Validating this when the user initially submits the jobspec would be even better than what we're doing here (and could be useful for all our other capabilities), but that's out of scope for this changeset. Also note that the MountConfig enum starts with "supports all" in order to support community plugins in a backwards compatible way, rather than cutting them off from volume mounting unexpectedly.	2020-05-21 09:18:02 -04:00
Mahmood Ali	bdef161e20	changelog and comment	2019-11-19 15:51:08 -05:00


124 Commits