* Add language from CLI help to job revert for version|tag
* Add CLI job tag subcommand page
* Add API create delete tag
Examples use same names between CLI and API
* Update CLI revert, tag; API jobs
* Add job version content
* Add tag name unique per job to CLI/API; address Phil's feedback
Add partial explaining why tag, add to CLI/API
* Add diff_version to API jobs list job versions
* Apply suggestions from code review
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
* Remove tutorial links since they are not published yet.
When the local Consul agent receives a deregister request, it performs a
pre-flight check using the locally cached ACL token. The agent then sends the
request upstream to the Consul servers as part of anti-entropy, using its own
token. This means the token used for the pre-flight check must still be valid,
even though it is not the token ultimately used to write to the Consul server.
There are several cases where the service identity token might no longer exist
at the time of deregistration:
* A race condition between the sync and destroying the allocation.
* Misconfiguration of the Consul auth method with a TTL.
* Out-of-band destruction of the token.
Additionally, Nomad's sync with Consul returns early if there are any errors,
which means that a single broken token can prevent any other service on the
Nomad agent from being registered or deregistered.
Update Nomad's sync with Consul to use the Nomad agent's own Consul token for
deregistration, regardless of which token the service was registered
with. Accumulate errors from the sync so that they no longer block
deregistration of other services.
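The error-accumulation side of this is sketched below with placeholder names (`syncer`, `deregisterService`, and `agentToken` are illustrative, not Nomad's actual sync types), assuming Go 1.20+ for `errors.Join`:

```go
package consulsync

import "errors"

// syncer is a stand-in for the component that reconciles services with Consul.
type syncer struct {
	agentToken        string
	deregisterService func(id, token string) error
}

// deregisterAll collects failures instead of returning on the first one, so a
// single broken registration can't block deregistration of the other services.
func (s *syncer) deregisterAll(ids []string) error {
	var errs error
	for _, id := range ids {
		// always use the agent's own Consul token, not the workload token the
		// service was registered with (which may have expired or been deleted)
		if err := s.deregisterService(id, s.agentToken); err != nil {
			errs = errors.Join(errs, err)
			continue
		}
	}
	return errs
}
```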
Fixes: https://github.com/hashicorp/nomad/issues/20159
* jobspec: add a chown option to artifact block
This PR adds a boolean 'chown' field to the artifact block.
It indicates whether the Nomad client should chown the downloaded files
and directories to be owned by the task.user. This is useful for drivers
like raw_exec and exec2 which are subject to the host filesystem user
permissions structure. Before, these drivers might not be able to use or
manage the downloaded artifacts since they would be owned by the root
user on a typical Nomad client configuration.
* api: no need for pointer of chown field
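In practice, "chown the downloaded files and directories" amounts to a recursive walk like the sketch below (illustrative only, not the client's actual implementation; resolving `task.user` to a uid/gid is omitted):

```go
package artifact

import (
	"io/fs"
	"os"
	"path/filepath"
)

// chownTree hands ownership of every file and directory under root to the
// given uid/gid, so drivers without filesystem isolation (raw_exec, exec2)
// can read and manage the downloaded artifact.
func chownTree(root string, uid, gid int) error {
	return filepath.WalkDir(root, func(path string, _ fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		return os.Chown(path, uid, gid)
	})
}
```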
As of #24166, Nomad agents will use their own token to deregister services and
checks from Consul. This returns the deregistration path to the pre-Workload
Identity workflow. Expand the documentation to make clear why certain ACL
policies are required for clients.
Additionally, we did not explicitly call out that auth methods should not set an
expiration on Consul tokens. Nomad does not have a facility to refresh these
tokens if they expire. Even if Nomad could, there's no way to re-inject them
into Envoy sidecars for Consul Service Mesh without recreating the task anyway,
which is what happens today. Warn users that they should not set an expiration.
Closes: https://github.com/hashicorp/nomad/issues/20185 (wontfix)
Ref: https://hashicorp.atlassian.net/browse/NET-10262
While testing with agents built with the race-detection option enabled, I
encountered a data race while draining a node.
When we upsert a node we copy the `NodeResources` struct and then perform a
fixup for backwards compatibility of the topology struct. This fixup was being
executed on the original struct rather than the copy, which meant we were
uselessly fixing up the wrong struct and corrupting the state store in the
process (albeit harmlessly, I suspect).
Fix the data race by calling the method on the correct pointer.
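A stripped-down illustration of the fix; the type and method names here are simplified placeholders, not the state store's real API:

```go
package state

// NodeResources stands in for the struct that needs the topology fixup.
type NodeResources struct{ /* ... */ }

// Canonicalize is the backwards-compatibility fixup for the topology struct.
func (r *NodeResources) Canonicalize() { /* ... */ }

// Node stands in for the node object held in the state store.
type Node struct{ NodeResources *NodeResources }

func (n *Node) Copy() *Node {
	c := *n
	r := *n.NodeResources
	c.NodeResources = &r
	return &c
}

// upsertNode copies the incoming node and runs the fixup on the copy. Running
// it on the original mutates a struct that concurrent readers may already
// hold, which is the data race the race detector reported.
func upsertNode(existing *Node) *Node {
	nodeCopy := existing.Copy()
	nodeCopy.NodeResources.Canonicalize()
	return nodeCopy
}
```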
When using the Client FS APIs, we check to ensure that reads don't traverse into
the allocation's secret dir and private dir. But this check can be bypassed on
case-insensitive file systems (ex. Windows, macOS, and Linux with obscure ext4
options enabled). This allows a user with `read-fs` permissions but not
`alloc-exec` permissions to read from the secrets dir.
This changeset updates the check so that it's case-insensitive. This risks false
positives for escape (see linked Go issue), but only if a task without
filesystem isolation deliberately writes into the task working directory to do
so, which is a fail-safe failure mode.
Ref: https://github.com/golang/go/issues/18358
Co-authored-by: dduzgun-security <deniz.duzgun@hashicorp.com>
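Roughly, the containment check needs to compare paths case-insensitively before deciding whether a read lands inside the secrets or private directory. The helper below illustrates the idea; the name and exact semantics are assumptions, not Nomad's actual helper:

```go
package fsisolation

import (
	"path/filepath"
	"strings"
)

// reachesInto reports whether path falls inside dir, lowercasing both sides so
// the check matches the behaviour of case-insensitive filesystems (Windows,
// macOS, and some ext4 configurations). Both arguments are assumed to already
// be absolute and cleaned.
func reachesInto(dir, path string) bool {
	rel, err := filepath.Rel(strings.ToLower(dir), strings.ToLower(path))
	if err != nil {
		// conservative: if we can't tell, treat it as reaching inside
		return true
	}
	rel = filepath.ToSlash(rel)
	return rel == "." || (rel != ".." && !strings.HasPrefix(rel, "../"))
}
```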
In #23977 we merged a change to how the keyring was stored. Because keyring
initialization takes slightly longer now, this uncovered existing timing bugs in
some of our tests where tests that require the keyring (ex. plan applier tests)
were waiting for the leader but not the keyring initialization. Fix another
example we've seen causing test flakes.
* Modify variable access permissions for UI users with write in only certain namespaces
* Addressing some PR comments
* Variables index namespaces on * and ability checks are now namespaced
* Mistook Delete for Destroy, and update unit tests for multi-return allPaths
We initialize this slice with a non-zero length, so it starts out full of empty
strings, and then append to it, which means we have to clean out the empty
strings later. Initialize with zero length and the correct capacity up front so
there are no empty values.
Ref: https://github.com/hashicorp/nomad/pull/24104
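For illustration, the Go behaviour behind the empty-string cleanup described above:

```go
package main

import "fmt"

func main() {
	// length 3 up front: append leaves three empty strings at the head
	withLen := make([]string, 3)
	withLen = append(withLen, "a", "b")
	fmt.Printf("%q\n", withLen) // ["" "" "" "a" "b"]

	// zero length, capacity 3: only appended values are present, no cleanup needed
	withCap := make([]string, 0, 3)
	withCap = append(withCap, "a", "b")
	fmt.Printf("%q\n", withCap) // ["a" "b"]
}
```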
In #24007 we merged new HCL files but they were missing copywrite headers
because the scan didn't run on this PR for some reason. I've already backported
this to the Enterprise branches.
In #24095 we made a fix for non-streaming exec into Docker tasks for script
checks and `change_mode = "script"`, but didn't complete E2E testing. We need to
use `ContainerExecAttach` in the new API in order to get stdout/stderr from
tasklets, but the preceding `ContainerExecStart` call prevents the attach from
succeeding, failing with an error that the exec has already run.
* Ref: [NET-11202 (comment)](https://hashicorp.atlassian.net/browse/NET-11202?focusedCommentId=551618)
* This has shipped in Nomad 1.9.0-beta.1 but not production yet.
* This should fix the remaining issues in nightly E2E for Docker.
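The intended flow looks roughly like this; the Docker SDK option struct names changed across versions, so treat the exact types as assumptions rather than the code Nomad ships:

```go
package docker

import (
	"context"
	"io"

	"github.com/docker/docker/api/types/container"
	"github.com/docker/docker/client"
)

// execOutput runs cmd inside a container and returns the raw output stream.
// ContainerExecAttach both starts the exec and hijacks its stdout/stderr, so
// calling ContainerExecStart first consumes the exec and makes the later
// attach fail with "exec has already run".
func execOutput(ctx context.Context, c *client.Client, containerID string, cmd []string) ([]byte, error) {
	exec, err := c.ContainerExecCreate(ctx, containerID, container.ExecOptions{
		Cmd:          cmd,
		AttachStdout: true,
		AttachStderr: true,
	})
	if err != nil {
		return nil, err
	}
	// note: attach by the exec object's own ID, not the container ID
	resp, err := c.ContainerExecAttach(ctx, exec.ID, container.ExecAttachOptions{})
	if err != nil {
		return nil, err
	}
	defer resp.Close()
	// without a TTY the stream is multiplexed; real code would demultiplex it
	// with stdcopy.StdCopy instead of reading it raw
	return io.ReadAll(resp.Reader)
}
```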
When jobs are submitted with a scaling policy, the scaling policy's target only
includes the job's namespace if the `namespace` field is set in the jobspec; it
is not populated from the request. Normally jobs are canonicalized in the RPC handler before
being written to Raft. But the scaling policy targets are instead written during
the conversion from `api.Job` to `structs.Job`. We populate the `structs.Job`
namespace from the request here as well, but only after the conversion has
occurred. Swap the order of these operations so that the conversion is always
happening with a correct namespace.
Long-term we should not be making mutations during conversion either. But we
can't remove it immediately because API requests may come from any agent across
upgrades. Move the scaling target creation into the `Canonicalize` method and
mark it for future removal in the API conversion code path.
Fixes: https://github.com/hashicorp/nomad/issues/24039
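The ordering problem, reduced to placeholder types (not the actual `api.Job`/`structs.Job` conversion code):

```go
package jobs

// apiJob and structsJob stand in for api.Job and structs.Job.
type apiJob struct{ Namespace *string }

type structsJob struct {
	Namespace     string
	ScalingTarget map[string]string
}

// convert mirrors the api->structs conversion, which also builds the scaling
// policy target and snapshots the namespace at that moment.
func convert(j *apiJob) *structsJob {
	ns := ""
	if j.Namespace != nil {
		ns = *j.Namespace
	}
	return &structsJob{
		Namespace:     ns,
		ScalingTarget: map[string]string{"Namespace": ns},
	}
}

// prepare applies the request's namespace *before* converting; applying it
// after the conversion leaves the scaling target with an empty namespace,
// which is the bug described above.
func prepare(j *apiJob, requestNS string) *structsJob {
	if j.Namespace == nil || *j.Namespace == "" {
		j.Namespace = &requestNS
	}
	return convert(j)
}
```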
In #23966, when we switched to using the official Docker SDK client, this
included new API calls for attaching to the "exec objects" created for running
processes inside a running Docker task. When we updated the API for the
non-streaming cases (script health checks, and `change_mode = "script"`), we
used the container ID and not the exec object ID. These IDs aren't identical
because you can have multiple exec objects for a given container. This results
in errors like "unable to upgrade to tcp, received 404" because the Docker API
can't find the exec object with the container ID.
* Ref: [NET-11202 (comment)](https://hashicorp.atlassian.net/browse/NET-11202?focusedCommentId=551618)
* This has shipped in Nomad 1.9.0-beta.1 but not production yet.
In #23966, when we switched to using the official Docker SDK client, we had to
rework the stats collection loop for the new client. But we missed resetting the
timer on the collection loop, which meant that we'd only collect stats once and
then never again.
* Ref: [NET-11202 (comment)](https://hashicorp.atlassian.net/browse/NET-11202?focusedCommentId=550814)
* This has shipped in Nomad 1.9.0-beta.1 but not production yet.
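The shape of the fix, as a generic Go collection loop (illustrative only, not Nomad's exact stats code):

```go
package stats

import (
	"context"
	"time"
)

// collectLoop emits stats every interval until ctx is cancelled. Forgetting
// the Reset call means the timer fires exactly once and the loop then blocks
// on timer.C forever, which is the bug described above.
func collectLoop(ctx context.Context, interval time.Duration, collect func()) {
	timer := time.NewTimer(interval)
	defer timer.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-timer.C:
			collect()
			timer.Reset(interval) // re-arm the timer for the next cycle
		}
	}
}
```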
A few small updates to the recent "Federate access to AWS with Nomad Workload Identity" documentation, most notably that a restart isn't needed because AWS SDKs handle OIDC reauthentication gracefully. Every other auth type is cached statically on startup, so nothing short of a full restart helps if those credentials expire.
In #23966, when we switched to using the official Docker SDK client, we had more
contexts to add because most of the library methods take one. But for some APIs,
like waiting for a container to exit after we've started it, we never want to
cancel this context, because the operation can outlive the Nomad agent itself.
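For example, the wait on container exit can be detached from the agent's lifecycle like this (a sketch; the Docker SDK types reflect recent client versions and may not match what Nomad vendors):

```go
package docker

import (
	"context"

	"github.com/docker/docker/api/types/container"
	"github.com/docker/docker/client"
)

// waitOnContainer blocks until the container stops and returns its exit code.
// It deliberately uses context.Background(): the container can keep running
// after the Nomad agent shuts down, so this wait must not be tied to a context
// that is cancelled on agent shutdown.
func waitOnContainer(c *client.Client, containerID string) (int64, error) {
	waitCh, errCh := c.ContainerWait(context.Background(), containerID,
		container.WaitConditionNotRunning)
	select {
	case res := <-waitCh:
		return res.StatusCode, nil
	case err := <-errCh:
		return 0, err
	}
}
```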
Nomad v1.9.0 (finally!) removes support for HCL1 and the `-hcl1` flag.
See #23912 for details.
One advantage of HCL1 over HCL2 was that HCL1 allowed quoted keys in blocks
such as `env`, `meta`, and Docker's `labels`:
```hcl
some_block {
"foo.bar" = "baz"
}
```
This works in HCL1 but is invalid in HCL2. In HCL2 you must use a map
instead of a block:
```hcl
some_map = {
"eggs.spam" = "works!"
}
```
This was such a hassle for users that we special-cased the `env` and `meta`
blocks to be accepted as either blocks or maps in #9936.
However, Docker `labels`, being a task config option, is much harder to
special-case, and it commonly needs dots in keys for things like Datadog
autodiscovery via Docker container labels:
https://docs.datadoghq.com/containers/docker/integrations/?tab=labels
Luckily `labels` can be specified as a list-of-maps instead:
```hcl
labels = [
  {
    "com.datadoghq.ad.check_names"  = "[\"openmetrics\"]"
    "com.datadoghq.ad.init_configs" = "[{}]"
  }
]
```
So instead of adding more awkward HCL1/HCL2 backward-compatibility code to
Nomad, I just updated the docs to hopefully help people hit by this.
The only other known workaround is dropping HCL in favor of JSON
jobspecs altogether, but that forces a huge migration and maintenance
burden on users:
https://discuss.hashicorp.com/t/docker-based-autodiscovery-with-datadog-how-can-we-make-it-work/18870