While working on #26831 and #26832 I made some minor improvements to our
end-to-end test setup for CSI:
* bump the AWS EBS plugin versions to the latest release (1.48.0)
* remove the unnecessary `datacenters` field from the AWS EBS plugin jobs
* add a name tag to the EBS volumes we create
* add a user-specific name tag to the cluster when using the makefile
to deploy a cluster
* add volumes and other missing variables from the `provision-infra` module to
the main E2E module
Ref: https://github.com/hashicorp/nomad/pull/26832
Ref: https://github.com/hashicorp/nomad/pull/26831
The volume watcher checks whether any allocations that have claims are terminal
so that it knows if it's safe to unpublish the volume. This check
treated a claim as eligible for unpublishing if the allocation was
terminal on either the server or the client, rather than on the client
alone. In many circumstances this is safe.
But if an allocation takes a while to stop (e.g. it has a
`shutdown_delay`), it's possible for garbage collection to run in the
window between when the alloc is marked server-terminal and when the
task is actually stopped. The server unpublishes the volume, which
sends a node plugin RPC. The plugin unmounts the
volume while it's in use, and then unmounts it again when the allocation stops
and the CSI postrun hook runs. If the task writes to the volume during the
unmounting process, some providers end up in a broken state and the volume is
not usable unless it's detached and reattached.
Fix this by considering a claim a "past claim" only when the allocation
is client-terminal. This way, if garbage collection runs while we're
waiting for allocation shutdown, the alloc will only be server-terminal
and we won't send the extra node RPCs.
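A minimal sketch of the change, using simplified stand-ins for Nomad's allocation status helpers rather than the real watcher code:

```go
package main

import "fmt"

// Simplified stand-ins for Nomad's allocation statuses; the real
// structs.Allocation has similar terminal-status helpers.
type Allocation struct {
	DesiredStatus string // server-side intent: "run", "stop", "evict"
	ClientStatus  string // "running", "complete", "failed", "lost"
}

func (a *Allocation) ServerTerminalStatus() bool {
	return a.DesiredStatus == "stop" || a.DesiredStatus == "evict"
}

func (a *Allocation) ClientTerminalStatus() bool {
	switch a.ClientStatus {
	case "complete", "failed", "lost":
		return true
	}
	return false
}

// isPastClaim reports whether a claim's allocation is done with the
// volume. Before the fix this was server-or-client terminal; now a
// claim only becomes a "past claim" once the client reports the alloc
// terminal, so GC during shutdown_delay can't trigger an early unpublish.
func isPastClaim(alloc *Allocation) bool {
	return alloc.ClientTerminalStatus()
}

func main() {
	// Alloc marked for stop by the server but still shutting down on
	// the client, e.g. waiting on shutdown_delay.
	alloc := &Allocation{DesiredStatus: "stop", ClientStatus: "running"}
	fmt.Println(alloc.ServerTerminalStatus()) // true
	fmt.Println(isPastClaim(alloc))           // false: don't unpublish yet
}
```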
Fixes: https://github.com/hashicorp/nomad/issues/24130
Fixes: https://github.com/hashicorp/nomad/issues/25819
Ref: https://hashicorp.atlassian.net/browse/NMD-1001
The node identity TTL defaults to 24h but can be altered by setting
the node identity TTL parameter. To allow setting and viewing the
value, the field is now plumbed through the CLI and HTTP API.
To parse the HCL, a new helper package has been created that contains
generic parsing and decoding functionality for dealing with HCL that
contains time durations; hclsimple can be used when this functionality
is not needed. To parse the JSON, custom marshal and unmarshal
functions have been created, as is done in many other places.
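On the JSON side, a minimal sketch of the custom marshal/unmarshal pattern; the type and field names here are illustrative, not Nomad's actual API structs:

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// NodeIdentityConfig is a hypothetical type for illustration only.
type NodeIdentityConfig struct {
	TTL time.Duration `json:"-"`
}

// MarshalJSON writes the TTL as a human-readable duration string.
func (c NodeIdentityConfig) MarshalJSON() ([]byte, error) {
	return json.Marshal(struct {
		TTL string `json:"TTL"`
	}{TTL: c.TTL.String()})
}

// UnmarshalJSON accepts either a duration string ("24h") or an
// integer number of nanoseconds, matching the common pattern.
func (c *NodeIdentityConfig) UnmarshalJSON(data []byte) error {
	var aux struct {
		TTL interface{} `json:"TTL"`
	}
	if err := json.Unmarshal(data, &aux); err != nil {
		return err
	}
	switch v := aux.TTL.(type) {
	case string:
		d, err := time.ParseDuration(v)
		if err != nil {
			return err
		}
		c.TTL = d
	case float64:
		c.TTL = time.Duration(int64(v))
	}
	return nil
}

func main() {
	out, _ := json.Marshal(NodeIdentityConfig{TTL: 24 * time.Hour})
	fmt.Println(string(out)) // {"TTL":"24h0m0s"}

	var in NodeIdentityConfig
	_ = json.Unmarshal([]byte(`{"TTL":"24h"}`), &in)
	fmt.Println(in.TTL) // 24h0m0s
}
```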
The node pool init command has been updated to include this new
parameter, commented out, for reference. The info command now includes
the TTL in its output too.
Expand on the documentation of allocation garbage collection:
* Explain that server-side GC of allocations is tied to the GC of the
evaluation that spawned the allocation.
* Explain that server-side GC of allocations will force them to be immediately
GC'd on the client regardless of the client-side configurations.
Ref: https://github.com/hashicorp/nomad/issues/26765
Co-authored-by: Aimee Ukasick <Aimee.Ukasick@ibm.com>
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
I cannot replicate this locally, but it appears that on CI some of our
system jobs take longer than the default 20s to finish deploying. This
PR is just to make sure this isn't the reason these tests fail.
Nomad's periodic block includes a "time_zone" parameter which lets
operators set the time zone against which the next launch interval is
checked. For this to work, Nomad needs to use "time.LoadLocation",
which in turn can use multiple TZ data sources. When Nomad runs from
the Docker image and is used to trigger job registrations, it
currently does not have access to any TZ data, meaning it is only
aware of UTC. Adding the tzdata package contents to the release image
provides the required data for this to work.
It would also have been possible to set the "timetzdata" build tag
(via "-tags") when releasing Nomad, which would embed a copy of the
timezone database in the binary. We decided against the build-tag
approach as it is a subtle way that we could introduce bugs that are
very difficult to track down, and we prefer the approach taken in this
commit.
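For illustration, a standard-library-only snippet showing the dependency (the zone name is arbitrary):

```go
package main

import (
	"fmt"
	"time"
	// Uncommenting the import below embeds the timezone database in
	// the binary, equivalent to the "timetzdata" build-tag alternative
	// discussed above:
	// _ "time/tzdata"
)

func main() {
	loc, err := time.LoadLocation("America/Chicago")
	if err != nil {
		// Without /usr/share/zoneinfo (e.g. a minimal container) and
		// without an embedded database, this fails and only UTC works.
		fmt.Println("load failed:", err)
		return
	}
	fmt.Println(time.Now().In(loc))
}
```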
In most RPC endpoints we use the resolved ACL object to determine whether a
given auth token or identity has access to the object of interest to the
RPC. In #15870 we adjusted this across most of the RPCs to handle workload identity.
But in the ACL endpoints that read policies, we can't use the resolved
ACL object and have to go back to the original token and look up the
policies it has access to. So we need to resolve any
workload-associated policies during that lookup as well.
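A rough sketch of the idea with illustrative-only types; none of these names are Nomad's real internals:

```go
package main

import "fmt"

type Policy struct{ Name string }

// Token is a hypothetical stand-in: ACL tokens name their policies
// directly, while workload identities have policies attached via
// job/group/task matching instead.
type Token struct {
	Policies           []string
	IsWorkloadIdentity bool
	JobID, Group, Task string
}

// policiesForToken resolves policies from the original token rather
// than the resolved ACL object, including workload-associated ones.
func policiesForToken(t *Token, byName map[string]*Policy,
	forWorkload func(jobID, group, task string) []*Policy) []*Policy {

	if t.IsWorkloadIdentity {
		// The resolved ACL object can't tell us which policy names
		// apply, so resolve workload-associated policies here too.
		return forWorkload(t.JobID, t.Group, t.Task)
	}
	var out []*Policy
	for _, name := range t.Policies {
		if p, ok := byName[name]; ok {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	byName := map[string]*Policy{"ops": {Name: "ops"}}
	wiLookup := func(job, group, task string) []*Policy {
		return []*Policy{{Name: "job-scoped"}}
	}
	t := &Token{IsWorkloadIdentity: true, JobID: "web"}
	for _, p := range policiesForToken(t, byName, wiLookup) {
		fmt.Println(p.Name) // job-scoped
	}
}
```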
Fixes: https://github.com/hashicorp/nomad/issues/26764
Ref: https://hashicorp.atlassian.net/browse/NMD-990
Ref: https://github.com/hashicorp/nomad/pull/15870
The test was incorrectly writing to state that registration had
finished before writing the node identity token. This is the opposite
of the order used in the client code and caused a timing issue: we
read registration as completed before the identity was available, and
therefore returned the secret ID.
In Nomad Enterprise we can fingerprint multiple Consul datacenters. If
neither is `"default"`, then we end up with warning logs about adding
a "link".
The `Link` field on the `Node` struct is a map of attributes that only
contributes to the node's computed hash. The `"consul"` key's value is derived
from the `unique.consul.name` attribute, which only exists if there's a default
Consul cluster.
Update the fingerprint to skip setting the link field if there's no
`unique.consul.name`, and lower the warning log for malformed fields
to debug. The link is a minor scheduling optimization largely captured
by existing Consul fields in the node's computed class; the only
reason not to remove it entirely is to avoid changing computed classes
on existing large clusters.
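A sketch of the change with simplified types; the attribute key is real, the surrounding code is not Nomad's:

```go
package main

import "fmt"

type Node struct {
	Attributes map[string]string
	Links      map[string]string
}

// setConsulLink only sets the link when a default Consul cluster was
// fingerprinted: unique.consul.name exists only in that case, so we
// skip the link (and the old warning) otherwise.
func setConsulLink(node *Node) {
	name, ok := node.Attributes["unique.consul.name"]
	if !ok {
		return
	}
	node.Links["consul"] = name
}

func main() {
	node := &Node{
		Attributes: map[string]string{}, // no default Consul cluster
		Links:      map[string]string{},
	}
	setConsulLink(node)
	fmt.Println(len(node.Links)) // 0: no link, no warning
}
```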
Fixes: https://github.com/hashicorp/nomad/issues/26781
Ref: https://hashicorp.atlassian.net/browse/NMD-998
On Windows, the `os.Process.Signal` method returns an error when
sending `os.Interrupt` (SIGINT) because it isn't implemented. This
causes test servers in the `testutil` packages to break on Windows.
Use the platform-specific syscalls to generate the SIGINT instead.
The agent's signal handler also did not correctly handle Ctrl-C
because we were masking os.Interrupt instead of SIGINT.
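A sketch of the platform-specific send side, assuming `golang.org/x/sys/windows`; Nomad's actual helper may differ:

```go
//go:build windows

package testutil

import "golang.org/x/sys/windows"

// interruptProcess sends the Windows equivalent of SIGINT, since
// os.Process.Signal(os.Interrupt) returns "not implemented" there.
// This assumes the target process was started in its own process
// group (e.g. with CREATE_NEW_PROCESS_GROUP) so the console event
// doesn't also hit our own process.
func interruptProcess(pid int) error {
	return windows.GenerateConsoleCtrlEvent(windows.CTRL_BREAK_EVENT, uint32(pid))
}
```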
Fixes: https://github.com/hashicorp/nomad/issues/26775
Co-authored-by: Chris Roberts <croberts@hashicorp.com>
When calling the client identity renew API, the target node ID may be
provided either in the URI or within the request body. This change
fixes a bug where all calls using a node_id query parameter would be
rejected because the handler failed to decode the empty request body.
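A minimal sketch of the handler-side fix, with hypothetical request types and route:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// renewRequest is illustrative only, not Nomad's actual API struct.
type renewRequest struct {
	NodeID string `json:"NodeID"`
}

func renewHandler(w http.ResponseWriter, req *http.Request) {
	var args renewRequest
	// Only decode the body when one was actually sent, so node_id can
	// come from the query string with an empty body.
	if req.ContentLength != 0 {
		if err := json.NewDecoder(req.Body).Decode(&args); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
	}
	// Fall back to the URI when the body didn't provide a node ID.
	if args.NodeID == "" {
		args.NodeID = req.URL.Query().Get("node_id")
	}
	fmt.Fprintln(w, args.NodeID)
}

func main() {
	http.HandleFunc("/v1/client/identity/renew", renewHandler)
	// http.ListenAndServe(":8080", nil) // wire up as needed
}
```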
Co-authored-by: Tim Gross <tgross@hashicorp.com>
The metrics on the eval broker include labels for the job ID, but
under a high volume of dispatch workloads this results in excessive
heap usage on the leader. Dispatch workloads should use their parent
ID rather than their child ID for any metrics we collect.
Also eliminate an extra copy of the labels, and remove the extremely
high-cardinality `"eval_id"` label from the `nomad.broker.eval_waiting`
metric.
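As a sketch, assuming the `<parent>/dispatch-<...>` naming convention for dispatched children:

```go
package main

import (
	"fmt"
	"strings"
)

// metricsJobID returns the job ID to use in eval broker metric labels.
// Labeling dispatched children by their parent bounds label cardinality
// under high-volume dispatch workloads.
func metricsJobID(jobID string) string {
	if i := strings.Index(jobID, "/dispatch-"); i != -1 {
		return jobID[:i]
	}
	return jobID
}

func main() {
	fmt.Println(metricsJobID("batch-worker/dispatch-1700000000-abcd1234"))
	// Output: batch-worker
}
```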
Fixes: https://github.com/hashicorp/nomad/issues/26657
The allocation network hook was not properly restoring network status
from state when the network had previously been set up. This led to
missing environment variables and a misconfigured hosts file and
resolv.conf when a task was restarted after the Nomad agent had
restarted.
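A hypothetical, heavily simplified sketch of the restore-from-state idea (not the allocrunner's real types):

```go
package main

import "fmt"

// NetworkStatus and networkHook are illustrative stand-ins only.
type NetworkStatus struct {
	InterfaceName string
	Address       string
}

type networkHook struct {
	savedStatus *NetworkStatus // persisted in client state before restart
	current     *NetworkStatus
}

// Prerun restores a previously created network's status from state so
// dependent consumers (env vars, /etc/hosts, resolv.conf) see it after
// an agent restart, instead of running with an empty status.
func (h *networkHook) Prerun() error {
	if h.current == nil && h.savedStatus != nil {
		h.current = h.savedStatus
	}
	return nil
}

func main() {
	h := &networkHook{savedStatus: &NetworkStatus{"eth0", "10.0.0.5"}}
	_ = h.Prerun()
	fmt.Println(h.current.Address) // 10.0.0.5: restored, not lost
}
```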
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
When checking whether the target path is within the root path, the
target path is trimmed and then file information is fetched. If the
trimmed path does not exist, then the full target path is not within
the root. In the case of receiving a not-exist error, simply return
false.
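A loose sketch of that control flow; the helper names are illustrative:

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// pathWithinRoot trims the target, stats it, and treats a not-exist
// error as "not within the root" rather than as a hard failure.
func pathWithinRoot(root, target string) (bool, error) {
	trimmed := filepath.Clean(target)
	if _, err := os.Stat(trimmed); err != nil {
		if errors.Is(err, fs.ErrNotExist) {
			// Trimmed path doesn't exist, so the full target path
			// cannot be within the root: simply return false.
			return false, nil
		}
		return false, err
	}
	rel, err := filepath.Rel(root, trimmed)
	if err != nil {
		return false, err
	}
	within := rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
	return within, nil
}

func main() {
	ok, err := pathWithinRoot(os.TempDir(), filepath.Join(os.TempDir(), "does-not-exist"))
	fmt.Println(ok, err) // false <nil>
}
```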
* Generate files for 1.10.5 release
* Prepare for next release
Co-authored-by: hc-github-team-nomad-core <github-team-nomad-core@hashicorp.com>
A small optimization in the scheduler required users to specify
specific device models if the required count was higher than what any
individual model/vendor on the node provided. This change removes that
optimization to allow for more intuitive device scheduling when
different vendor/model device types exist on a node.
The `go-getter` update in https://github.com/hashicorp/nomad/pull/26713 is not passing tests upstream (apparently https://github.com/hashicorp/go-getter/pull/548 is the origin of the problem, but that PR never ran tests). The issue being fixed isn't a critical vulnerability, so in the interest of preparing for the next release, revert the `go-getter` change but keep the Go toolchain update.
We'll skip go-getter 1.8.0 and pick up the next patch version once its issues are fixed.
Reverts commit 8a96929870.