nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-01 16:05:42 +03:00

Author	SHA1	Message	Date
Crypto89	9c4e4afa79	csi: fix CSI ExpandVolume stagingPath (#25253 ) Fix the checking of the staging path against the mountRoot on the host rather than checking against the containerMountPoint which (probably) never exists on the host, causing it to default back to the legacy behaviour.	2025-03-25 12:36:46 -05:00
dependabot[bot]	d67a74d0f4	chore(deps): bump github.com/gorilla/websocket in /api (#25502 ) Bumps [github.com/gorilla/websocket](https://github.com/gorilla/websocket) from 1.5.0 to 1.5.3. - [Release notes](https://github.com/gorilla/websocket/releases) - [Commits](https://github.com/gorilla/websocket/compare/v1.5.0...v1.5.3) --- updated-dependencies: - dependency-name: github.com/gorilla/websocket dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-25 10:53:27 -04:00
dependabot[bot]	f16104ab84	chore(deps): bump github.com/shoenig/test from 1.7.1 to 1.12.1 in /api (#25501 ) Bumps [github.com/shoenig/test](https://github.com/shoenig/test) from 1.7.1 to 1.12.1. - [Release notes](https://github.com/shoenig/test/releases) - [Commits](https://github.com/shoenig/test/compare/v1.7.1...v1.12.1) --- updated-dependencies: - dependency-name: github.com/shoenig/test dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-25 10:52:56 -04:00
dependabot[bot]	de723690f7	chore(deps): bump github.com/felixge/httpsnoop in /api (#25499 ) Bumps [github.com/felixge/httpsnoop](https://github.com/felixge/httpsnoop) from 1.0.3 to 1.0.4. - [Release notes](https://github.com/felixge/httpsnoop/releases) - [Commits](https://github.com/felixge/httpsnoop/compare/v1.0.3...v1.0.4) --- updated-dependencies: - dependency-name: github.com/felixge/httpsnoop dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-25 10:02:00 -04:00
James Rasell	ea25503705	cli: Use meta response index to start monitoring volume create. (#25514 )	2025-03-25 14:00:46 +00:00
dependabot[bot]	1ff8d7b3ab	chore(deps): bump github.com/hashicorp/go-plugin from 1.6.2 to 1.6.3 (#25507 ) Bumps [github.com/hashicorp/go-plugin](https://github.com/hashicorp/go-plugin) from 1.6.2 to 1.6.3. - [Release notes](https://github.com/hashicorp/go-plugin/releases) - [Changelog](https://github.com/hashicorp/go-plugin/blob/main/CHANGELOG.md) - [Commits](https://github.com/hashicorp/go-plugin/compare/v1.6.2...v1.6.3) --- updated-dependencies: - dependency-name: github.com/hashicorp/go-plugin dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-25 09:56:59 -04:00
dependabot[bot]	ba35b1d170	chore(deps): bump github.com/shoenig/test from 1.12.0 to 1.12.1 (#25506 ) Bumps [github.com/shoenig/test](https://github.com/shoenig/test) from 1.12.0 to 1.12.1. - [Release notes](https://github.com/shoenig/test/releases) - [Commits](https://github.com/shoenig/test/compare/v1.12.0...v1.12.1) --- updated-dependencies: - dependency-name: github.com/shoenig/test dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-25 09:56:34 -04:00
dependabot[bot]	809985bbcd	chore(deps): bump golang.org/x/time from 0.10.0 to 0.11.0 (#25505 ) Bumps [golang.org/x/time](https://github.com/golang/time) from 0.10.0 to 0.11.0. - [Commits](https://github.com/golang/time/compare/v0.10.0...v0.11.0) --- updated-dependencies: - dependency-name: golang.org/x/time dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-25 09:55:47 -04:00
dependabot[bot]	ebe0fe9914	chore(deps): bump github.com/opencontainers/runc from 1.2.5 to 1.2.6 (#25504 ) Bumps [github.com/opencontainers/runc](https://github.com/opencontainers/runc) from 1.2.5 to 1.2.6. - [Release notes](https://github.com/opencontainers/runc/releases) - [Changelog](https://github.com/opencontainers/runc/blob/v1.2.6/CHANGELOG.md) - [Commits](https://github.com/opencontainers/runc/compare/v1.2.5...v1.2.6) --- updated-dependencies: - dependency-name: github.com/opencontainers/runc dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-25 09:55:13 -04:00
dependabot[bot]	5485457dcc	chore(deps): bump github.com/docker/docker (#25503 ) Bumps [github.com/docker/docker](https://github.com/docker/docker) from 28.0.1+incompatible to 28.0.2+incompatible. - [Release notes](https://github.com/docker/docker/releases) - [Commits](https://github.com/docker/docker/compare/v28.0.1...v28.0.2) --- updated-dependencies: - dependency-name: github.com/docker/docker dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-25 09:54:11 -04:00
Allison Larson	d1d8945d2e	Add docker plugin config option image_pull_timeout value for default timeout (#25489 ) * Add docker plugin config image_pull_timeout value for default timeout * Add image_pull_timeout docker plugin config to docs * Add changelog	2025-03-24 13:03:14 -07:00
Tim Gross	e168548341	provide allocrunner hooks with prebuilt taskenv and fix mutation bugs (#25373 ) Some of our allocrunner hooks require a task environment for interpolating values based on the node or allocation. But several of the hooks accept an already-built environment or builder and then keep that in memory. Both of these retain a copy of all the node attributes and allocation metadata, which balloons memory usage until the allocation is GC'd. While we'd like to look into ways to avoid keeping the allocrunner around entirely (see #25372), for now we can significantly reduce memory usage by creating the task environment on-demand when calling allocrunner methods, rather than persisting it in the allocrunner hooks. In doing so, we uncover two other bugs: * The WID manager, the group service hook, and the checks hook have to interpolate services for specific tasks. They mutated a taskenv builder to do so, but each time they mutate the builder, they write to the same environment map. When a group has multiple tasks, it's possible for one task to set an environment variable that would then be interpolated in the service definition for another task if that task did not have that environment variable. Only the service definition interpolation is impacted. This does not leak env vars across running tasks, as each taskrunner has its own builder. To fix this, we move the `UpdateTask` method off the builder and onto the taskenv as the `WithTask` method. This makes a shallow copy of the taskenv with a deep clone of the environment map used for interpolation, and then overwrites the environment from the task. * The checks hook interpolates Nomad native service checks only on `Prerun` and not on `Update`. This could cause unexpected deregistration and registration of checks during in-place updates. To fix this, we make sure we interpolate in the `Update` method. I also bumped into an incorrectly implemented interface in the CSI hook. I've pulled that and some better guardrails out to https://github.com/hashicorp/nomad/pull/25472. Fixes: https://github.com/hashicorp/nomad/issues/25269 Fixes: https://hashicorp.atlassian.net/browse/NET-12310 Ref: https://github.com/hashicorp/nomad/issues/25372	2025-03-24 12:05:04 -04:00
Tim Gross	ecf3d88e81	dependabot: update reviewer for website directory (#25498 ) When we updated the codeowner for the website directory to include the "web presence" group, we didn't also update the dependabot reviewer. This results in errors in dependabot PRs. Ref: https://github.com/hashicorp/nomad/pull/25492#issuecomment-2746105976	2025-03-24 12:03:02 -04:00
Juana De La Cuesta	2bd5dc5970	Merge pull request #25479 from hashicorp/NET-11546-enos-same-allocs Add a test for re attaching allocs after client restart	2025-03-24 16:03:57 +01:00
Juanadelacuesta	c3258ab0f6	fix: reuse client_id when checking for running allocs	2025-03-24 15:11:33 +01:00
dependabot[bot]	26c2f6bccf	chore(deps): bump github.com/golang-jwt/jwt/v5 from 5.2.1 to 5.2.2 (#25490 ) Bumps [github.com/golang-jwt/jwt/v5](https://github.com/golang-jwt/jwt) from 5.2.1 to 5.2.2. - [Release notes](https://github.com/golang-jwt/jwt/releases) - [Changelog](https://github.com/golang-jwt/jwt/blob/main/VERSION_HISTORY.md) - [Commits](https://github.com/golang-jwt/jwt/compare/v5.2.1...v5.2.2) --- updated-dependencies: - dependency-name: github.com/golang-jwt/jwt/v5 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-24 09:27:48 -04:00
dependabot[bot]	997d2b355f	chore(deps): bump github.com/golang-jwt/jwt/v4 from 4.5.1 to 4.5.2 (#25491 ) Bumps [github.com/golang-jwt/jwt/v4](https://github.com/golang-jwt/jwt) from 4.5.1 to 4.5.2. - [Release notes](https://github.com/golang-jwt/jwt/releases) - [Changelog](https://github.com/golang-jwt/jwt/blob/main/VERSION_HISTORY.md) - [Commits](https://github.com/golang-jwt/jwt/compare/v4.5.1...v4.5.2) --- updated-dependencies: - dependency-name: github.com/golang-jwt/jwt/v4 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-24 09:00:16 -04:00
dependabot[bot]	c60b70573c	chore(deps): bump github.com/docker/cli (#25493 ) Bumps [github.com/docker/cli](https://github.com/docker/cli) from 27.5.1+incompatible to 28.0.2+incompatible. - [Commits](https://github.com/docker/cli/compare/v27.5.1...v28.0.2) --- updated-dependencies: - dependency-name: github.com/docker/cli dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-24 08:57:19 -04:00
dependabot[bot]	b3114333a6	chore(deps): bump github.com/klauspost/cpuid/v2 from 2.2.9 to 2.2.10 (#25494 ) Bumps [github.com/klauspost/cpuid/v2](https://github.com/klauspost/cpuid) from 2.2.9 to 2.2.10. - [Release notes](https://github.com/klauspost/cpuid/releases) - [Changelog](https://github.com/klauspost/cpuid/blob/master/.goreleaser.yml) - [Commits](https://github.com/klauspost/cpuid/compare/v2.2.9...v2.2.10) --- updated-dependencies: - dependency-name: github.com/klauspost/cpuid/v2 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-24 08:55:48 -04:00
dependabot[bot]	c8804b8e8e	chore(deps): bump github.com/hashicorp/go-secure-stdlib/listenerutil (#25495 ) Bumps [github.com/hashicorp/go-secure-stdlib/listenerutil](https://github.com/hashicorp/go-secure-stdlib) from 0.1.9 to 0.1.10. - [Release notes](https://github.com/hashicorp/go-secure-stdlib/releases) - [Commits](https://github.com/hashicorp/go-secure-stdlib/compare/parseutil/v0.1.9...listenerutil/v0.1.10) --- updated-dependencies: - dependency-name: github.com/hashicorp/go-secure-stdlib/listenerutil dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-24 08:53:57 -04:00
Juana De La Cuesta	2fdca9eb04	Merge pull request #25478 from hashicorp/NET-11546-enos-same-nfs Give the nfs controller a longer start up time	2025-03-24 09:45:12 +01:00
Aimee Ukasick	34ae5d5ae6	Fix link rendering in server.default_scheduler_config (#25482 ) CE-821	2025-03-21 12:50:57 -05:00
Tim Gross	c0c41f11a5	CSI: prevent loop in volumewatcher when node is GC'd (#25428 ) If a CSI volume has terminal allocations, the volumewatcher will submit an `Unpublish` RPC. But the "past claim" we create is missing the "external" node identifier (ex. the AWS EC2 instance ID). The unpublish RPC can tolerate this if the node still exists in the state store, but if the node has been GC'd the controller unpublish step will return an error. But at this point we've already checkpointed the unpublish workflow, which triggers a notification on the volumewatcher. This results in the volumewatcher getting into a tight loop of retries. Unfortunately even if we somehow break the loop (perhaps because we hit a different code path), we'll kick off this loop again after a leader election when we spin up the volumewatchers again. This changeset includes the following: * Fix the primary bug by including the external node ID when creating a "past claim" for a terminal allocation. * If we can't lookup the external ID because there's no external node ID and the node no longer exists, abandon it in the same way that we do the node unpublish step. * Rate limit the volumewatcher loop so that any future bugs of this type don't cause a tight loop. * Remove some dead code found while working on this. Fixes: https://github.com/hashicorp/nomad/issues/25349 Ref: https://hashicorp.atlassian.net/browse/NET-12298	2025-03-21 13:07:16 -04:00
Juanadelacuesta	ce261be358	style: linter fix	2025-03-21 15:26:25 +01:00
Juanadelacuesta	b1dbc14499	func: make the csi_plugin health timeout a little longer to help the test run better locally	2025-03-21 15:16:00 +01:00
Juanadelacuesta	82fcc62c46	func: add verification for allocs correctly reattaching after client restarts	2025-03-21 15:14:00 +01:00
James Rasell	0a29c2f017	ci: Use custom runner for core tests with more CPU and memory. (#25475 )	2025-03-21 13:52:36 +00:00
James Rasell	27ad88ac17	test: Calculate agent endpoint scheduler count, not static. (#25473 )	2025-03-21 13:47:53 +00:00
Tim Gross	c67c4ea182	client: statically assert hook interfaces in build (#25472 ) While working on #25373, I noticed that the CSI hook's `Destroy` method doesn't match the interface, which means it never gets called. Because this method only cancels any in-flight CSI requests, the only impact of this bug is that any CSI RPCs that are in-flight when an alloc is GC'd on the client or a dev agent is shut down won't be interrupted gracefully. Fix the interface, but also make static assertions for all the allocrunner hooks in the production code, so that you can make changes to interfaces and have compile-time assistance in avoiding mistakes. Ref: https://github.com/hashicorp/nomad/pull/25373	2025-03-21 09:14:13 -04:00
Aimee Ukasick	95ee9261a5	Docs: fix broken links in 1.10 beta docs (#25469 ) * Docs: fix 1.10 broken link in operations/stateful-workloads * updated the link in other pages	2025-03-20 13:17:09 -05:00
Michael Schurter	92de40b00d	tests: fixes a few data races in tests (#25455 ) * test: use statedb factory Swapping fields on Client after it has been created is a race. * test: lock before checking heartbeat state Fixes races * test: fix races by copying fsm objects A common source of data races in tests is when they insert a fixture directly into memdb and then later mutate the object. Since objects in the state store are readonly, any later mutation is a data race. * test: lock when peeking at eval stats * test: lock when peeking at serf state * test: lock when looking at stats * test: fix default eval broker state test The test was not applying the config callback. In addition the test raced against the configuration being applied. Waiting for the keyring to be initialized resolved the race in my testing, but given the high concurrency of the various leadership subsystems it's possible it may still flake.	2025-03-20 10:56:17 -07:00
Piotr Kazmierczak	084497c46c	build: split minimum-os job into 2 and only run arm checks on CE (#25467 ) arm GHA runners currently do not support private repositories.	2025-03-20 16:35:42 +01:00
Aimee Ukasick	107289620c	Docs: Add JSON format note to docker driver sysctl parameter (#25454 ) * Docs: Add JSON format note to docker driver sysctl parameter CE-837 * Apply suggestions from code review Co-authored-by: Tim Gross <tgross@hashicorp.com> --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2025-03-20 09:22:26 -05:00
Daniel Bennett	8c609ad762	docs: oidc client assertions and pkce (#25375 )	2025-03-20 09:14:17 -05:00
Piotr Kazmierczak	cb8f4ea452	drivers: set -1 exit code in case executor gets killed (#25453 ) Nomad's driver handles incorrectly set the exit code to 0 in case of executor failure. This corrects that behavior. --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2025-03-20 15:06:39 +01:00
James Rasell	b3f28f9387	test: Use runtime CPUs for test not static number. (#25458 )	2025-03-20 09:05:36 +00:00
James Rasell	5a157eb123	server: Validate config num schedulers is between 0 and num CPUs. (#25441 ) The `server.num_scheduler` configuration value should be a value between 0 and the number of CPUs on the machine. The Nomad agent was not validating the configuration parameter which meant you could use a negative value or a value much larger than the available machine CPUs. This change enforces validation of the configuration value both on server startup and when the agent is reloaded. The Nomad API was only performing negative value validation when updating the scheduler number via this method. This change adds to the validation to ensure the number is not greater than the CPUs on the machine.	2025-03-20 07:29:57 +00:00
Juana De La Cuesta	220b53aba8	Merge pull request #25452 from hashicorp/NET-11546-enos-license Declare license input variables as sensitive	2025-03-19 20:55:16 +01:00
Juanadelacuesta	e0d3be81da	fix: declare license inputs as sensitive variables	2025-03-19 19:53:32 +01:00
Michael Smithhisler	d95a3766ae	client: fix client blocking during garbage collection (#25123 ) This change removes any blocking calls to destroyAllocRunner, which caused nomad clients to block when running allocations in certain scenarios. In addition, this change consolidates client GC by removing the MakeRoomFor method, which is redundant to keepUsageBelowThreshold. --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2025-03-19 14:32:46 -04:00
Michael Smithhisler	4eb294e1ef	client: skip shutdown delay when tasks already deregistered (#25157 ) --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2025-03-19 14:15:35 -04:00
Aimee Ukasick	dae496e427	Docs: SEO front matter description for search: commands section (#25175 ) * Enhance front matter description for search * acl section * alloc section * config section * deployment section * eval section * job section * license section * namespace section * node section * node pool section * operator section * plugin section * quota section * recommendation section * scaling section * sentinel section * server section * service section * setup section * system section * tls section * var section * volume section * change reference to command reference * Apply suggestions from code review Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> --------- Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>	2025-03-19 12:02:02 -05:00
Piotr Kazmierczak	e249a6197f	docker: TestDockerDriver_OOMKilled should now run on cgroups v2 (#25443 ) Docker driver's TestDockerDriver_OOMKilled should run on cgroups v2 now, since we're running the docker v27 client library and our runners run docker v26, which contains the containerd fix containerd/containerd#6323.	2025-03-19 16:53:37 +01:00
Phil Renaud	ce83993667	[ci/cd] Moves our default github action flows to use Node v20 (#25425 ) * Moves our default github action flows to use Node v20 * noop to trigger ui-build pipeline	2025-03-19 11:38:20 -04:00
Phil Renaud	3370d9cb96	[ui] Custom watchQuery equivalent on the storage index (#25374 ) * Custom watchQuery equivalent on the storage index * Tests for live updates to the storage page * Deconditionalizing the pagination on storage, and fixing a bug where I was looking at filtered but not paginated DHV * Test for pagination with live-updates	2025-03-19 11:38:01 -04:00
Tim Gross	13b95b7685	CSI: prevent extraneous GC attempts for plugins (#25432 ) We can't delete a CSI plugin when it has volumes in use. When periodic GC runs, we send the RPC unconditionally and then let the state store return an error. We accidentally fixed the excess logging this causes (#17025) in #20555, but we can also check if the plugin is empty first before sending the RPC to save a request and subsequent Raft write. Fixes: https://github.com/hashicorp/nomad/issues/17025 Ref: https://github.com/hashicorp/nomad/pull/20555	2025-03-19 09:14:42 -04:00
Shantanu Gadgil	b641d25730	website: fix URL for periodic jobs (#25436 )	2025-03-19 07:32:51 +00:00
Tim Gross	bf67f53ba2	docs: add note about Consul Enterprise role bindings and namespaces (#25426 ) When configuring Consul to use Nomad workload identities, you create the Consul auth method in the default namespace. If you're using Consul Enterprise namespaces, there are two available approaches: one is to create the tokens in the default namespace and give them policies that define cross-namespace access, and the other is to use binding rules that map the login to a particular namespace. The latter is what we show in our docs, but this was missing a note that any roles (and their associated policies) targeted by `-bind-type role` need to exist in the Consul namespace we're logging into. Also, in Nomad CE, the `consul.namespace` flag is always treated as having been set to `"default"`. That is, we ignore it and don't return an error even though it's a Nomad ENT-only feature. Clarify this in the documentation for the field the same way we've done for the `cluster` field. Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>	2025-03-18 15:35:00 -04:00
James Rasell	61b2b9d3d0	agent: Improve retry joiner code with small refactor. (#25422 ) The agent retry joiner implementation had different parameters to control its execution for agents running in server and client mode. The agent would set up individual joiners depending on the agent mode, making the object parameter overhead unnecessary. This change removes the excess configuration options for the joiner, reducing code complexity slightly and hopefully making future modifications in this area easier to make.	2025-03-18 15:55:52 +00:00
Piotr Kazmierczak	94fbe30b47	build: smoke test on RHEL8 instead of RHEL7 (#25421 )	2025-03-18 15:41:23 +01:00
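
Commit c67c4ea182 in the list above mentions static assertions that allocrunner hooks satisfy their interfaces. A minimal sketch of that Go idiom follows. The names `DestroyHook` and `csiHook` are hypothetical stand-ins chosen for illustration, not Nomad's actual types.

```go
package main

import "fmt"

// DestroyHook is a hypothetical stand-in for a hook interface.
type DestroyHook interface {
	Name() string
	Destroy() error
}

// csiHook is a hypothetical hook implementation.
type csiHook struct{}

func (h *csiHook) Name() string { return "csi_hook" }

// Destroy must match the interface signature exactly. If it were declared
// as, say, Destroy() with no return value, the assertion below would stop
// the build instead of the method silently never being called.
func (h *csiHook) Destroy() error { return nil }

// Compile-time assertion: this line costs nothing at run time and fails the
// build if *csiHook no longer implements DestroyHook.
var _ DestroyHook = (*csiHook)(nil)

func main() {
	var h DestroyHook = &csiHook{}
	fmt.Println(h.Name(), h.Destroy())
}
```

The assertion assigns a typed nil pointer to the blank identifier, so no value is allocated and the check happens entirely in the compiler.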

26869 Commits
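
Commit e168548341 in the list above describes a `WithTask` method that makes a shallow copy of the task environment with a deep clone of the environment map, then overwrites entries from the task. The following is a rough sketch of that pattern only. The `TaskEnv` struct, its fields, and the method signature are assumptions made for illustration and do not reproduce Nomad's code.

```go
package main

import (
	"fmt"
	"maps"
)

// TaskEnv is a hypothetical, simplified environment used for interpolation.
type TaskEnv struct {
	NodeAttrs map[string]string // shared between copies, treated as read-only
	EnvMap    map[string]string // cloned per task so writes never leak
}

// WithTask returns a copy of t whose EnvMap is a fresh clone overlaid with
// the task's own variables. The receiver is left unmodified.
func (t *TaskEnv) WithTask(taskVars map[string]string) *TaskEnv {
	cp := *t                         // shallow copy: NodeAttrs is still shared
	cp.EnvMap = maps.Clone(t.EnvMap) // independent copy of the env map
	if cp.EnvMap == nil {
		cp.EnvMap = make(map[string]string, len(taskVars))
	}
	maps.Copy(cp.EnvMap, taskVars) // task values overwrite inherited ones
	return &cp
}

func main() {
	base := &TaskEnv{
		NodeAttrs: map[string]string{"node.datacenter": "dc1"},
		EnvMap:    map[string]string{"GROUP": "web"},
	}
	a := base.WithTask(map[string]string{"PORT": "8080"})
	b := base.WithTask(nil)

	_, leaked := b.EnvMap["PORT"]
	fmt.Println(a.EnvMap["PORT"], leaked)
}
```

Because each call clones the map before writing, a variable set for one task cannot appear when interpolating another task's service definition, which is the cross-task leak the commit message describes.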