nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-05 09:55:44 +03:00

Author	SHA1	Message	Date
Juanadelacuesta	445b19ce3e	docs: update func docs	2024-10-30 12:35:06 +01:00
Juanadelacuesta	f707a02f4d	fix: update test to force recreation of idvalidator	2024-10-30 12:28:59 +01:00
Juanadelacuesta	bba0407250	style: remove unused code and duplicated test	2024-10-30 11:43:04 +01:00
Juanadelacuesta	3fa2717195	style: remove unused code	2024-10-30 11:36:25 +01:00
Juanadelacuesta	a491ceff5f	fix: put back MSL license header	2024-10-30 11:25:27 +01:00
Juanadelacuesta	e1a0c7cb43	fix: move exclusive unix test back from driver tests	2024-10-30 11:22:41 +01:00
Juanadelacuesta	9a6d2648c8	style: improve debug logging	2024-10-30 11:21:51 +01:00
Juanadelacuesta	2b9bb7a289	license: change missing file to BUSL	2024-10-30 10:24:35 +01:00
Juanadelacuesta	1751b618e4	func: Add conditional to validation init, to allow for easy testing	2024-10-29 16:45:33 +01:00
Juanadelacuesta	a9a452341c	license: update headers to BUSL	2024-10-29 15:54:09 +01:00
Juanadelacuesta	0227788e22	fix: update tests configuration	2024-10-29 15:24:12 +01:00
Juanadelacuesta	0cd1b5ff13	func: move the validation to a dependency and use id sets	2024-10-28 18:59:51 +01:00
Juanadelacuesta	65be613be9	fix: rename test to avoid conflict	2024-10-28 12:17:57 +01:00
Juanadelacuesta	d77dc7dfa4	style: format	2024-10-28 11:46:51 +01:00
Juanadelacuesta	ed04b1bf64	style: remove print	2024-10-28 11:35:03 +01:00
Mike Nomitch	fd7e81dbce	Fixing accidental move of helper fn to unix only validators file	2024-10-28 11:15:41 +01:00
Mike Nomitch	c4f2a41da6	Splitting validators unix functions into own file	2024-10-28 11:15:41 +01:00
Mike Nomitch	ff5ab3776c	Tweaking user lookup code	2024-10-28 11:15:41 +01:00
Mike Nomitch	e1c226e633	Restructuring IDRange	2024-10-28 11:15:41 +01:00
Mike Nomitch	0fbf592131	moving user out of validators	2024-10-28 11:15:41 +01:00
Mike Nomitch	916af5a948	Moving idrange struct location	2024-10-28 11:15:41 +01:00
Mike Nomitch	9565dde138	Only parsing id ranges once	2024-10-28 11:15:41 +01:00
Mike Nomitch	d0049b1e63	Fixed error in denied_uids spec	2024-10-28 11:15:41 +01:00
Mike Nomitch	6b6a1b5bc4	Fixed windows build error	2024-10-28 11:15:41 +01:00
Mike Nomitch	cf36509474	Removing unnecessary int conversion	2024-10-28 11:15:40 +01:00
Mike Nomitch	9cc3992ca6	Adds ability to restrict uid and gids in exec and raw_exec	2024-10-28 11:15:37 +01:00
Seth Hoenig	b539b54c9e	docker: close hijacked write connection when exec ends (#24244)	2024-10-17 11:41:29 -05:00
Seth Hoenig	b18851617f	docker: close response connection once stdin is exhausted (#24202)	2024-10-17 11:07:23 -05:00
Piotr Kazmierczak	1ac14f4869	docker: always use API version negotiation when initializing clients (#24237) During a refactoring of the docker driver in #23966 we introduced a bug: the API version negotiation option was not passed to every new client call.	2024-10-17 15:23:14 +02:00
Tim Gross	d12128c380	docker: use streaming stats collection to correct CPU stats (#24229) In #23966 we switched to the official Docker SDK for the `docker` driver. In the process we refactored code around stats collection to use the "one shot" version of stats. Unfortunately this "one shot" stats collection does not include the `PreCPU` stats, which are the stats from the previous read. This breaks the calculation we use to determine CPU ticks, because now we're subtracting 0 from the current value to get the delta. Switch back to using the streaming stats collection. Add a test that fully exercises the `TaskStats` API. Fixes: https://github.com/hashicorp/nomad/issues/24224 Ref: https://hashicorp.atlassian.net/browse/NET-11348	2024-10-17 08:25:59 -04:00
Piotr Kazmierczak	f9cbaaf6c7	docker: fix a bug where auth for private registries wasn't parsed correctly (#24215) In #23966 we introduced an official Docker client and did not notice that, in contrast to our previous 3rd party client, the official SDK PullOptions object expects a base64 encoded JSON with username and password, instead of a username/password pair.	2024-10-16 22:04:54 +02:00
Tim Gross	6b8ddff1fa	windows: set job object for executor and children (#24214) On Windows, if the `raw_exec` driver's executor exits, the child processes are not also killed. Create a Windows "job object" (not to be confused with a Nomad job) and add the executor to it. Child processes of the executor will inherit the job automatically. When the handle to the job object is freed (on executor exit), the job itself is destroyed and this causes all processes in that job to exit. Fixes: https://github.com/hashicorp/nomad/issues/23668 Ref: https://learn.microsoft.com/en-us/windows/win32/procthread/job-objects	2024-10-16 09:20:26 -04:00
Tim Gross	fec91d1dc8	windows: trade heap for stack to build process tree for stats in linear space (#24182) In #20619 we overhauled how we were gathering stats for Windows processes. Unlike in Linux where we can ask for processes in a cgroup, on Windows we have to make a single expensive syscall to get all the processes and then build the tree ourselves. Our algorithm to do so is recursive and quadratic in both steps and space with the number of processes on the host. For busy hosts this hits the stack limit and panics the Nomad client. We already build a map of parent PID to PID, so modify this to be a map of parent PID to slice of children and then traverse that tree only from the root we care about (the executor PID). This moves the allocations to the heap but makes the stats gathering linear in steps and space required. This changeset also moves as much of this code as possible into an area not conditionally-compiled by OS, as the tagged test file was not being run in CI. Fixes: https://github.com/hashicorp/nomad/issues/23984	2024-10-14 11:26:38 -04:00
Tim Gross	e9ba630639	docker: fix script check execution (#24098) In #24095 we made a fix for non-streaming exec into Docker tasks for script checks and `change_mode = "script"`, but didn't complete E2E testing. We need to use `ContainerExecAttach` in the new API in order to get stdout/stderr from tasklets, but the previous `ContainerExecStart` call will prevent this from running successfully with an error that the exec has already run. * Ref: [NET-11202 (comment)](https://hashicorp.atlassian.net/browse/NET-11202?focusedCommentId=551618) * This has shipped in Nomad 1.9.0-beta.1 but not production yet. * This should fix the remaining issues in nightly E2E for Docker.	2024-10-01 16:41:38 -04:00
Tim Gross	7a88d5d626	docker: fix non-streaming exec attachment (#24095) In #23966 when we switched to using the official Docker SDK client, this included new API calls for attaching to the "exec objects" created for running processes inside a running Docker task. When we updated the API for the non-streaming cases (script health checks, and `change_mode = "script"`), we used the container ID and not the exec object ID. These IDs aren't identical because you can have multiple exec objects for a given container. This results in errors like "unable to upgrade to tcp, received 404" because the Docker API can't find the exec object with the container ID. * Ref: [NET-11202 (comment)](https://hashicorp.atlassian.net/browse/NET-11202?focusedCommentId=551618) * This has shipped in Nomad 1.9.0-beta.1 but not production yet.	2024-10-01 11:27:13 -04:00
Tim Gross	bf0a65f2d6	docker: reset timer after collecting stats (#24092) In #23966 when we switched to using the official Docker SDK client, we had to rework the stats collection loop for the new client. But we missed resetting the timer on the collection loop, which meant that we'd only collect stats once and then never again. * Ref: [NET-11202 (comment)](https://hashicorp.atlassian.net/browse/NET-11202?focusedCommentId=550814) * This has shipped in Nomad 1.9.0-beta.1 but not production yet.	2024-10-01 08:31:03 -04:00
Tim Gross	154aeb77af	docker: fix bug in waiting for container to exit (#24081) In #23966 when we switched to using the official Docker SDK client, we had more contexts to add because most of the library methods take one. But for some APIs like waiting for a container to exit after we've started it, we never want to close this context, because the operation can outlive the Nomad agent itself.	2024-09-30 08:50:07 -04:00
Piotr Kazmierczak	ec42aa2a1b	docker: use docker errdefs instead of string comparisons when checking errors (#24075)	2024-09-27 15:32:29 +02:00
Piotr Kazmierczak	981ca36049	docker: use official client instead of fsouza/go-dockerclient (#23966) This PR replaces the fsouza/go-dockerclient 3rd party docker client library with docker's official SDK. --------- Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Seth Hoenig <shoenig@duck.com>	2024-09-26 18:41:44 +02:00
Seth Hoenig	51215bf102	deps: update to go-set/v3 and refactor to use custom iterators (#23971) * deps: update to go-set/v3 * deps: use custom set iterators for looping	2024-09-16 13:40:10 -05:00
Tim Gross	192d70cee7	docker: update infra_image to new registry (#23927) The gcr.io container registry is shutting down in March. Update the default `infra_image` for Docker's "pause" containers to point to the new location hosted by the k8s project. Fixes: https://github.com/hashicorp/nomad/issues/23911 Ref: https://hashicorp.atlassian.net/browse/NET-10942	2024-09-06 14:34:03 -04:00
Tim Gross	6aa503f2bb	docker: disable cpuset management for non-root clients (#23804) Nomad clients manage a cpuset cgroup for each task to reserve or share CPU cores. But Docker owns its own cgroups, and attempting to set a parent cgroup that Nomad manages runs into conflicts with how runc manages cgroups via systemd. Therefore Nomad must run as root in order for cpuset management to ever be compatible with Docker. However, some users running in unsupported configurations felt that the changes we made in Nomad 1.7.0 to ensure Nomad was running correctly represented a regression. This changeset disables cpuset management for non-root Nomad clients. When running Nomad as non-root, the driver will no longer reconcile cpusets with Nomad and `resources.cores` will behave incorrectly (but the driver will still run). Although this is one small step along the way to supporting a rootless Nomad client, running Nomad as non-root is still unsupported. This PR is insufficient by itself to have a secure and properly-working rootless Nomad client. Ref: https://github.com/hashicorp/nomad/issues/18211 Ref: https://github.com/hashicorp/nomad/issues/13669 Ref: https://hashicorp.atlassian.net/browse/NET-10652 Ref: https://github.com/opencontainers/runc/blob/main/docs/systemd.md	2024-08-14 16:44:13 -04:00
Tim Gross	9543e740af	docker: fix delimiter for selinux label for read-only volumes (#23750) The Docker driver's `volume` field to specify bind-mounts takes a list of strings that consist of three `:`-delimited fields: source, destination, and options. We append the SELinux label from the plugin configuration as the third field. But when the user has already specified the volume is read-only with `:ro`, we're incorrectly appending the SELinux label with another `:` instead of the required `,`. Combine the options into a single field value before appending them to the bind mounts configuration. Updated the tests to split out Windows behavior (which doesn't accept options) and to ensure the test task has the expected environment for bind mounts. Fixes: https://github.com/hashicorp/nomad/issues/23690	2024-08-08 09:08:01 -04:00
Tim Gross	b25f1b66ce	resources: allow job authors to configure size of secrets tmpfs (#23696) On supported platforms, the secrets directory is a 1MiB tmpfs. But some tasks need larger space for downloading large secrets. This is especially the case for tasks using `templates`, which need extra room to write a temporary file to the secrets directory that gets renamed to the old file atomically. This changeset allows increasing the size of the tmpfs in the `resources` block. Because this is a memory resource, we need to include it in the memory we allocate for scheduling purposes. The task is already prevented from using more memory in the tmpfs than the `resources.memory` field allows, but can bypass that limit by writing to the tmpfs via `template` or `artifact` blocks. Therefore, we need to account for the size of the tmpfs in the allocation resources. Simply adding it to the memory needed when we create the allocation allows it to be accounted for in all downstream consumers, and then we'll subtract that amount from the memory resources just before configuring the task driver. For backwards compatibility, the default value of 1MiB is "free" and ignored by the scheduler. Otherwise we'd be increasing the allocated resources for every existing alloc, which could cause problems across upgrades. If a user explicitly sets `resources.secrets = 1` it will no longer be free. Fixes: https://github.com/hashicorp/nomad/issues/2481 Ref: https://hashicorp.atlassian.net/browse/NET-10070	2024-08-05 16:06:58 -04:00
Piotr Kazmierczak	f22ce921cd	docker: adjust capabilities on Windows (#23599) Adjusts Docker capabilities per OS, and checks for runtime on Windows. --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2024-07-17 09:01:45 +02:00
Tim Gross	eedbd36fef	qemu: pass task resources into driver for cgroup setup (#23466) As part of the work for 1.7.0 we moved portions of the task cgroup setup down into the executor. This requires that the executor constructor get the `TaskConfig.Resources` struct, and this was missing from the `qemu` driver. We fixed a panic caused by this change in #19089 before we shipped, but this fix was effectively undone after we added plumbing for custom cgroups for `raw_exec` in 1.8.0. As a result, running `qemu` tasks always fails on Linux. This was undetected in testing because our CI environment doesn't have QEMU installed. I've got all the unit tests running locally again and have added QEMU installation when we're running the drivers tests. Fixes: https://github.com/hashicorp/nomad/issues/23250	2024-07-01 11:41:10 -04:00
Piotr Kazmierczak	d5e1515e80	docker: default to hyper-v isolation on Windows (#23452)	2024-07-01 08:56:43 +02:00
Piotr Kazmierczak	0ece7b5c16	docker: validate that containers do not run as ContainerAdmin on Windows (#23443) This enables checks for ContainerAdmin user on docker images on Windows. It's only checked if users run docker with process isolation and not hyper-v, because hyper-v provides its own, proper sandboxing. --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2024-06-27 16:22:24 +02:00
Piotr Kazmierczak	85430be6dd	raw_exec: oom_score_adj support (#23308)	2024-06-14 11:36:27 +02:00
Luke Palmer	75874136ac	fix cgroup setup for non-default devices (#22518)	2024-06-13 09:27:19 -04:00
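
The approach described in fec91d1dc8 (index the process snapshot as a map of parent PID to a slice of children, then walk only from the executor PID) can be sketched roughly as follows. This is an illustration, not the code from that commit: the `proc` type, the `descendants` function, and the explicit stack in place of recursion are all assumptions made for the example.

```go
package main

import "fmt"

// proc is a hypothetical stand-in for one entry of a host-wide process snapshot.
type proc struct {
	PID  int
	PPID int
}

// descendants returns root plus every process below it. Each snapshot entry
// is indexed once and each reachable process is visited once, so both time
// and space are linear in the number of processes.
func descendants(snapshot []proc, root int) []int {
	// Build the index: parent PID -> PIDs of its direct children.
	children := make(map[int][]int, len(snapshot))
	for _, p := range snapshot {
		if p.PID == p.PPID {
			continue // ignore self-parented entries so they cannot loop
		}
		children[p.PPID] = append(children[p.PPID], p.PID)
	}

	// Walk from root using a heap-allocated stack instead of recursion
	// (an assumption of this sketch), guarding against revisits.
	seen := map[int]bool{root: true}
	stack := []int{root}
	var out []int
	for len(stack) > 0 {
		pid := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		out = append(out, pid)
		for _, c := range children[pid] {
			if !seen[c] {
				seen[c] = true
				stack = append(stack, c)
			}
		}
	}
	return out
}

func main() {
	snapshot := []proc{{1, 0}, {10, 1}, {11, 10}, {12, 10}, {20, 1}, {21, 20}}
	// Only PID 10 and its children are reported; PIDs 1, 20 and 21 are skipped.
	fmt.Println(descendants(snapshot, 10))
}
```

Starting the walk at the executor PID means unrelated processes on a busy host are never traversed, only indexed.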


893 Commits
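
The delimiter rule from 9543e740af (options in the third `:`-delimited field are joined with `,`, so an SELinux label is combined with an existing `ro` instead of being added as a fourth field) might look like the minimal sketch below. The `bindWithLabel` helper is hypothetical and assumes Unix-style paths without `:` in them; it is not the driver's actual function.

```go
package main

import (
	"fmt"
	"strings"
)

// bindWithLabel is a hypothetical helper: it takes a "source:destination[:options]"
// volume string and appends an SELinux label to the options field.
func bindWithLabel(volume, selinuxLabel string) string {
	parts := strings.SplitN(volume, ":", 3)
	if len(parts) < 2 {
		return volume // not a source:destination pair; leave untouched
	}

	// Collect user-supplied options and the label into one list.
	var opts []string
	if len(parts) == 3 && parts[2] != "" {
		opts = append(opts, parts[2])
	}
	if selinuxLabel != "" {
		opts = append(opts, selinuxLabel)
	}

	bind := parts[0] + ":" + parts[1]
	if len(opts) > 0 {
		// Options share a single field and are comma-separated.
		bind += ":" + strings.Join(opts, ",")
	}
	return bind
}

func main() {
	fmt.Println(bindWithLabel("/srv/data:/data:ro", "z"))
	fmt.Println(bindWithLabel("/srv/data:/data", "z"))
}
```

Building the options list first and joining it once avoids special-casing whether the user already supplied an option.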