nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-03 00:45:43 +03:00

Author	SHA1	Message	Date
dependabot[bot]	6a35c1b8ea	chore(deps): bump github.com/docker/docker from 28.1.1+incompatible to 28.2.2+incompatible (#25954 ) * chore(deps): bump github.com/docker/docker Bumps [github.com/docker/docker](https://github.com/docker/docker) from 28.1.1+incompatible to 28.2.2+incompatible. - [Release notes](https://github.com/docker/docker/releases) - [Commits](https://github.com/docker/docker/compare/v28.1.1...v28.2.2) --- updated-dependencies: - dependency-name: github.com/docker/docker dependency-version: 28.2.2+incompatible dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * deps: containerd/errdefs instead of docker/errdefs moby's errdefs are deprecated as of `f1bb44aeee` and now merely point to containerd's --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>	2025-06-05 10:26:18 -04:00
Tim Gross	374e987b9b	metrics: emit cache and rss stats on cgroup v2 (#25751 ) In cgroups v2, a different map of memory stats is available from the kernel than in v1. The Docker API reflects this change. But there are equivalent values in the map for RSS (anonymously mapped memory) and cache (filesystem cache and tmpfs), which the Docker driver is not currently emitting. Fallback to these alternate values when the cgroups v1 values are not available. Include the anonymous mapping in the "measured" allocation stats as "RSS" so that they both show up in allocation metrics. We can do this on both the `docker` driver and the Linux executor for `exec` and `java` drivers. Fixes: https://github.com/hashicorp/nomad/issues/19185 Ref: https://hashicorp.atlassian.net/browse/NMD-437 Ref: https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#memory-interface-files Ref: https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt	2025-04-24 12:48:18 -04:00
Tim Gross	c7cb49f205	testing: fix a panic in docker stats collection test (#25747 ) When the context closes, the stats emitter closes its channel. It's possible for the channel to be closed in the stats emitter goroutine before the `select` in the test sees that the context has closed, which can result in a panic in the test when we try to read the empty value off the channel.	2025-04-24 10:41:03 -04:00
Tim Gross	4d7ed88a8d	testing: use Docker Hub registry mirror for additional tests (#25733) This image was missed in https://github.com/hashicorp/nomad/pull/25703 and is resulting in rate limiting in tests.	2025-04-24 08:50:32 -04:00
Tim Gross	88dc842729	testing: use Docker Hub registry mirror for CI (#25703 ) As of April 1, Docker Hub rate limits tightened. With only 10 pulls/hr/IP, we're likely to encounter test failures. Switch all Docker images getting pulled from this repository to use the HashiCorp managed registry mirror. Note that most of our tests in `drivers/docker` don't pull from the remote registry but load a local image, while others will need to pull from the remote and fetch different images depending on OS/arch. Refactor the definition of test task configuration to make it clear which is which, and de-factor some false sharing of setup functions. Updates the E2E tests to use that registry by configuring the Docker daemon. This required changing out a few container images that we don't have in the registry, but these new images are all smaller. There are a couple of tests that still use explicitly-tagged `docker.io` images or other third-party registries, which have been left in place. Ref: https://hashicorp.atlassian.net/browse/NET-12233 update E2E images to those in the registry mirror fix windows and docklog test build fix stopsignal test mop-up more mop-up	2025-04-18 14:21:49 -04:00
James Rasell	c85c723336	ci: Run core tests groups workflow on amd64 and arm64 runners. (#25695 )	2025-04-17 15:16:29 +01:00
Piotr Kazmierczak	e249a6197f	docker: TestDockerDriver_OOMKilled should now run on cgroups v2 (#25443 ) Docker driver's TestDockerDriver_OOMKilled should run on cgroups v2 now, since we're running docker v27 client library and our runners run docker v26 that contain containerd fix containerd/containerd#6323.	2025-03-19 16:53:37 +01:00
Juana De La Cuesta	5605f9630d	Fix the docker image parser to account for private repos (#24926) * fix: fix the docker image parser to account for private repos * style: change the local regex for docker image identifiers and use docker package instead * func: return early when no repo found on the image name * func: return error if no path found in image * Update drivers/docker/utils.go Co-authored-by: Tim Gross <tgross@hashicorp.com> * Update coordinator.go * Update driver.go * Update network.go --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2025-03-04 16:53:20 +01:00
Jorge Marey	25426f0777	fingerprint: add config option to disable dmidecode (#25108 )	2025-02-13 11:20:48 -05:00
Vincent Ducamps	6469b59a0a	docker: Fix a bug where images with port number and no tags weren't parsed correctly	2025-01-03 11:38:43 +01:00
Tim Gross	d12128c380	docker: use streaming stats collection to correct CPU stats (#24229 ) In #23966 we switched to the official Docker SDK for the `docker` driver. In the process we refactored code around stats collection to use the "one shot" version of stats. Unfortunately this "one shot" stats collection does not include the `PreCPU` stats, which are the stats from the previous read. This breaks the calculation we use to determine CPU ticks, because now we're subtracting 0 from the current value to get the delta. Switch back to using the streaming stats collection. Add a test that fully exercises the `TaskStats` API. Fixes: https://github.com/hashicorp/nomad/issues/24224 Ref: https://hashicorp.atlassian.net/browse/NET-11348	2024-10-17 08:25:59 -04:00
Piotr Kazmierczak	f9cbaaf6c7	docker: fix a bug where auth for private registries wasn't parsed correctly (#24215 ) In #23966 we introduced an official Docker client and did not notice that in contrast to our previous 3rd party client, the official SDK PullOptions object expects a base64 encoded JSON with username and password, instead of username/ password pair.	2024-10-16 22:04:54 +02:00
Piotr Kazmierczak	ec42aa2a1b	docker: use docker errdefs instead of string comparisons when checking errors (#24075 )	2024-09-27 15:32:29 +02:00
Piotr Kazmierczak	981ca36049	docker: use official client instead of fsouza/go-dockerclient (#23966 ) This PR replaces fsouza/go-dockerclient 3rd party docker client library with docker's official SDK. --------- Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Seth Hoenig <shoenig@duck.com>	2024-09-26 18:41:44 +02:00
Tim Gross	9543e740af	docker: fix delimiter for selinux label for read-only volumes (#23750 ) The Docker driver's `volume` field to specify bind-mounts takes a list of strings that consist of three `:`-delimited fields: source, destination, and options. We append the SELinux label from the plugin configuration as the third field. But when the user has already specified the volume is read-only with `:ro`, we're incorrectly appending the SELinux label with another `:` instead of the required `,`. Combine the options into a single field value before appending them to the bind mounts configuration. Updated the tests to split out Windows behavior (which doesn't accept options) and to ensure the test task has the expected environment for bind mounts. Fixes: https://github.com/hashicorp/nomad/issues/23690	2024-08-08 09:08:01 -04:00
Piotr Kazmierczak	f22ce921cd	docker: adjust capabilities on Windows (#23599 ) Adjusts Docker capabilities per OS, and checks for runtime on Windows. --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2024-07-17 09:01:45 +02:00
Piotr Kazmierczak	0ece7b5c16	docker: validate that containers do not run as ContainerAdmin on Windows (#23443 ) This enables checks for ContainerAdmin user on docker images on Windows. It's only checked if users run docker with process isolation and not hyper-v, because hyper-v provides its own, proper sandboxing. --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2024-06-27 16:22:24 +02:00
Piotr Kazmierczak	bf11e39ac8	docker: add a unit test for "container already exists" error when creating containers (#22238 )	2024-05-30 11:24:28 +02:00
Seth Hoenig	591394fb62	drivers: plumb hardware topology via grpc into drivers (#18504 ) * drivers: plumb hardware topology via grpc into drivers This PR swaps out the temporary use of detecting system hardware manually in each driver for using the Client's detected topology by plumbing the data over gRPC. This ensures that Client configuration is taken to account consistently in all references to system topology. * cr: use enum instead of bool for core grade * cr: fix test slit tables to be possible	2023-09-18 08:58:07 -05:00
Seth Hoenig	2e1974a574	client: refactor cpuset partitioning (#18371) * client: refactor cpuset partitioning This PR updates the way Nomad client manages the split between tasks that make use of resources.cpus vs. resources.cores. Previously, each task was explicitly assigned which CPU cores they were able to run on. Every time a task was started or destroyed, all other tasks' cpusets would need to be updated. This was inefficient and would crush the Linux kernel when a client would try to run ~400 or so tasks. Now, we make use of cgroup hierarchy and cpuset inheritance to efficiently manage cpusets. * cr: tweaks for feedback	2023-09-12 09:11:11 -05:00
James Rasell	a9d5beb141	test: use correct parallel test setup func (#18326 )	2023-08-25 13:51:36 +01:00
hashicorp-copywrite[bot]	2d35e32ec9	Update copyright file headers to BUSL-1.1	2023-08-10 17:27:15 -05:00
Seth Hoenig	a4cc76bd3e	numa: enable numa topology detection (#18146 ) * client: refactor cgroups management in client * client: fingerprint numa topology * client: plumb numa and cgroups changes to drivers * client: cleanup task resource accounting * client: numa client and config plumbing * lib: add a stack implementation * tools: remove ec2info tool * plugins: fixup testing for cgroups / numa changes * build: update makefile and package tests and cl	2023-08-10 17:05:30 -05:00
KamilCuk	da9ec8ce1e	Add group_add docker option (#17313 )	2023-06-02 20:26:01 -04:00
Daniel Bennett	e0dd940439	tests: enable newer windows (#17401 ) * "allow" (don't try to drop) linux capabilities in the docker test driver harness (see #15181) * refactor to allow different busybox images since windows containers need to be the same version as the underlying OS, and we're moving from 2016 to 2019 * one docker test was flaky from apparently being a bit slower on windows, so add Wait()	2023-06-02 11:38:38 -05:00
Tim Gross	30bc456f03	logs: allow disabling log collection in jobspec (#16962 ) Some Nomad users ship application logs out-of-band via syslog. For these users having `logmon` (and `docker_logger`) running is unnecessary overhead. Allow disabling the logmon and pointing the task's stdout/stderr to /dev/null. This changeset is the first of several incremental improvements to log collection short of full-on logging plugins. The next step will likely be to extend the internal-only task driver configuration so that cluster administrators can turn off log collection for the entire driver. --- Fixes: #11175 Co-authored-by: Thomas Weber <towe75@googlemail.com>	2023-04-24 10:00:27 -04:00
Seth Hoenig	74b16da272	deps: update docker to 23.0.3 (#16862 ) * [no ci] deps: update docker to 23.0.3 This PR brings our docker/docker dependency (which is hosted at github.com/moby/moby) up to 23.0.3 (forward about 2 years). Refactored our use of docker/libnetwork to reference the package in its new home, which is docker/docker/libnetwork (it is no longer an independent repository). Some minor nearby test case cleanup as well. * add cl	2023-04-12 14:13:36 -05:00
hashicorp-copywrite[bot]	f005448366	[COMPLIANCE] Add Copyright and License Headers	2023-04-10 15:36:59 +00:00
Michael Schurter	db2325b17a	docker: default device.container_path to host_path (#16811 ) * docker: default device.container_path to host_path Matches docker cli behavior. Fixes #16754	2023-04-06 14:44:33 -07:00
Lance Haig	3160c76209	deps: Update ioutil library references to os and io respectively for drivers package (#16331 ) * Update ioutil library references to os and io respectively for drivers package No user facing changes so I assume no change log is required * Fix failing tests	2023-03-08 10:31:09 -06:00
Seth Hoenig	dab4d7ed7a	ci: swap freeport for portal in packages (#15661 )	2023-01-03 11:25:20 -06:00
Michael Schurter	1bc5c718b4	Data race fixes in tests and a new semgrep rule (#14594) * test: don't use loop vars in goroutines fixes a data race in the test * test: copy objects in statestore before mutating fixes data race in test * test: @lgfa29's semgrep rule for loops/goroutines Found 2 places where we were improperly using loop variables inside goroutines.	2022-09-15 10:35:08 -07:00
Seth Hoenig	0bd42a501c	docker: create a docker task config setting for disable built-in healthcheck This PR adds a docker driver task configuration setting for turning off built-in HEALTHCHECK of a container. References) https://docs.docker.com/engine/reference/builder/#healthcheck https://github.com/docker/engine-api/blob/master/types/container/config.go#L16 Closes #5310 Closes #14068	2022-08-11 10:33:48 -05:00
Tim Gross	2415d72bb6	test: disable docker OOM detection test on cgroups v2 (#13928 ) OOM detection under cgroups v2 is flaky under versions of `containerd` before v1.6.3, but our `containerd` dependency is transitive on `moby/moby`, who have not yet updated. Disable this test for cgroups v2 environments until we can update the dependency chain.	2022-07-28 14:47:06 -04:00
Seth Hoenig	410834b705	drivers/docker: do not set cgroup parent in v1 mode This PR fixes a bug where the CgroupParent on the docker HostConfig struct was accidentally being set when running in cgroups v1 mode.	2022-05-24 11:22:50 -05:00
Seth Hoenig	d91e4160da	cli: update default redis and use nomad service discovery Closes #12927 Closes #12958 This PR updates the version of redis used in our examples from 3.2 to 7. The old version is very not supported anymore, and we should be setting a good example by using a supported version. The long-form example job is now fixed so that the service stanza uses nomad as the service discovery provider, and so now the job runs without a requirement of having Consul running and configured.	2022-05-17 10:24:19 -05:00
Eng Zer Jun	fca4ee8e05	test: use `T.TempDir` to create temporary test directory (#12853 ) * test: use `T.TempDir` to create temporary test directory This commit replaces `ioutil.TempDir` with `t.TempDir` in tests. The directory created by `t.TempDir` is automatically removed when the test and all its subtests complete. Prior to this commit, temporary directory created using `ioutil.TempDir` needs to be removed manually by calling `os.RemoveAll`, which is omitted in some tests. The error handling boilerplate e.g. defer func() { if err := os.RemoveAll(dir); err != nil { t.Fatal(err) } } is also tedious, but `t.TempDir` handles this for us nicely. Reference: https://pkg.go.dev/testing#T.TempDir Signed-off-by: Eng Zer Jun <engzerjun@gmail.com> * test: fix TestLogmon_Start_restart on Windows Signed-off-by: Eng Zer Jun <engzerjun@gmail.com> * test: fix failing TestConsul_Integration t.TempDir fails to perform the cleanup properly because the folder is still in use testing.go:967: TempDir RemoveAll cleanup: unlinkat /tmp/TestConsul_Integration2837567823/002/191a6f1a-5371-cf7c-da38-220fe85d10e5/web/secrets: device or resource busy Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>	2022-05-12 11:42:40 -04:00
Seth Hoenig	5da1a31e94	client: enable support for cgroups v2 This PR introduces support for using Nomad on systems with cgroups v2 [1] enabled as the cgroups controller mounted on /sys/fs/cgroups. Newer Linux distros like Ubuntu 21.10 are shipping with cgroups v2 only, causing problems for Nomad users. Nomad mostly "just works" with cgroups v2 due to the indirection via libcontainer, but not so for managing cpuset cgroups. Before, Nomad has been making use of a feature in v1 where a PID could be a member of more than one cgroup. In v2 this is no longer possible, and so the logic around computing cpuset values must be modified. When Nomad detects v2, it manages cpuset values in-process, rather than making use of cgroup hierarchy inheritance via shared/reserved parents. Nomad will only activate the v2 logic when it detects cgroups2 is mounted at /sys/fs/cgroups. This means on systems running in hybrid mode with cgroups2 mounted at /sys/fs/cgroups/unified (as is typical) Nomad will continue to use the v1 logic, and should operate as before. Systems that do not support cgroups v2 are also not affected. When v2 is activated, Nomad will create a parent called nomad.slice (unless otherwise configured in Client config), and create cgroups for tasks using naming convention <allocID>-<task>.scope. These follow the naming convention set by systemd and also used by Docker when cgroups v2 is detected. Client nodes now export a new fingerprint attribute, unique.cgroups.version which will be set to 'v1' or 'v2' to indicate the cgroups regime in use by Nomad. The new cpuset management strategy fixes #11705, where docker tasks that spawned processes on startup would "leak". In cgroups v2, the PIDs are started in the cgroup they will always live in, and thus the cause of the leak is eliminated. [1] https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html Closes #11289 Fixes #11705 #11773 #11933	2022-03-23 11:35:27 -05:00
Seth Hoenig	b242957990	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
Shishir Mahajan	479442e682	Add support for --init to docker driver. Signed-off-by: Shishir Mahajan <smahajan@roblox.com>	2021-10-15 12:53:25 -07:00
Mahmood Ali	6c414cd5f9	gofmt all the files mostly to handle build directives in 1.17.	2021-10-01 10:14:28 -04:00
Seth Hoenig	c34beb48b1	drivers/docker: reuse capabilities plumbing in docker driver This changeset does not introduce any functional change for the docker driver, but rather cleans up the implementation around computing configured capabilities by re-using code written for the exec/java task drivers.	2021-05-17 12:37:40 -06:00
Seth Hoenig	003d68fe6d	drivers/docker+exec+java: disable net_raw capability by default The default Linux Capabilities set enabled by the docker, exec, and java task drivers includes CAP_NET_RAW (for making ping just work), which has the side effect of opening an ARP DoS/MiTM attack between tasks using bridge networking on the same host network. https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities This PR disables CAP_NET_RAW for the docker, exec, and java task drivers. The previous behavior can be restored for docker using the allow_caps docker plugin configuration option. A future version of nomad will enable similar configurability for the exec and java task drivers.	2021-05-12 13:22:09 -07:00
Isabel Suchanek	276644470e	Clean up docker driver test to make it less flaky (#10559 ) Co-authored-by: Mahmood Ali <mahmood@hashicorp.com>	2021-05-10 14:58:19 -07:00
Isabel Suchanek	1b2296400b	Fix test panic in docker driver test	2021-05-07 12:12:33 -07:00
Isabel Suchanek	379c09513c	drivers/docker: add support for STOPSIGNAL This fixes a bug where Nomad overrides a Dockerfile's STOPSIGNAL with the default kill_signal (SIGTERM). This adds a check for kill_signal. If it's not set, it calls StopContainer instead of Signal, which uses STOPSIGNAL if it's specified. If both kill_signal and STOPSIGNAL are set, Nomad tries to stop the container with kill_signal first, before then calling StopContainer. Fixes #9989	2021-05-05 10:27:58 -07:00
Mahmood Ali	b1ff06fd19	oversubscription: docker to honor MemoryMaxMB values	2021-03-30 16:55:58 -04:00
Florian Apolloner	8b3ea4ea9a	docker: support configuring default log driver in plugin options	2021-03-12 16:04:33 -05:00
Adrian Todorov	2748d2a895	driver/docker: add extra labels ( job name, task and task group name)	2021-03-08 08:59:52 -05:00
Mahmood Ali	8879645ab9	docker: introduce a new hcl2-friendly `mount` syntax (#9635 ) Introduce a new more-block friendly syntax for specifying mounts with a new `mount` block type with the target as label: ```hcl config { image = "..." mount { type = "..." target = "target-path" volume_options { ... } } } ``` The main benefit here is that by `mount` being a block, it can nest blocks and avoids the compatibility problems noted in https://github.com/hashicorp/nomad/pull/9634/files#diff-2161d829655a3a36ba2d916023e4eec125b9bd22873493c1c2e5e3f7ba92c691R128-R155 . The intention is for us to promote this `mount` blocks and quietly deprecate the `mounts` type, while still honoring to preserve compatibility as much as we could. This addresses the issue in https://github.com/hashicorp/nomad/issues/9604 .	2020-12-15 14:13:50 -05:00
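
Commit 9543e740af in the list above describes the delimiter rule for bind-mount strings: three `:`-delimited fields (source, destination, options), with an SELinux label joined to existing options such as `ro` by `,` rather than by another `:`. The following is a minimal Go sketch of that rule only. The helper name `bindWithLabel` and the sample paths are hypothetical and are not taken from the Nomad source; it also assumes Unix-style paths, since the commit notes that Windows does not accept options.

```go
package main

import (
	"fmt"
	"strings"
)

// bindWithLabel is a hypothetical helper (not Nomad's actual function).
// It takes a "source:destination[:options]" bind string and an SELinux
// label, and merges the label into the single options field using ","
// instead of appending a fourth ":"-delimited field.
func bindWithLabel(volume, label string) string {
	if label == "" {
		return volume
	}
	parts := strings.SplitN(volume, ":", 3)
	if len(parts) < 2 {
		// Not a source:destination pair; leave it untouched.
		return volume
	}
	opts := []string{}
	if len(parts) == 3 && parts[2] != "" {
		// Keep user-supplied options such as "ro".
		opts = append(opts, parts[2])
	}
	opts = append(opts, label)
	return parts[0] + ":" + parts[1] + ":" + strings.Join(opts, ",")
}

func main() {
	fmt.Println(bindWithLabel("/host/data:/data:ro", "z")) // /host/data:/data:ro,z
	fmt.Println(bindWithLabel("/host/data:/data", "z"))    // /host/data:/data:z
}
```

The bug the commit describes corresponds to producing `/host/data:/data:ro:z`; combining the options into one field before building the string avoids the extra delimiter.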


137 Commits