* Only error on constraints if no allocs are running
When running `nomad job run <JOB>` multiple times with constraints
defined, there should be no error as a result of filtering out nodes
that do not, and never have, satisfied the constraints.
When running a system job with a constraint, any run after the initial
startup returns exit code 2 and a warning about unplaced allocations due
to constraints. This error is not encountered on the initial run, even
though the constraint is unchanged.
This is because the node that satisfies the condition is already running
the allocation, and the placement is ignored. Another placement is
attempted, but the only node(s) left are the ones that do not satisfy
the constraint. Nomad views this case (no allocations that were
attempted to be placed could be placed successfully) as an error, and
reports it as such. In reality, no allocations should be placed or
updated in this case, but it should not be treated as an error.
This change uses the `ignored` placements from diffSystemAlloc to
determine whether the case encountered is an error (no ignored
placements means nothing is already running, which is an error) or not
(an ignored placement means the task is already running somewhere on a
node). It does this at the point where `failedTGAlloc` is populated, so
placement functionality isn't changed, just the field that populates
the error.
There is functionality that should be preserved which (correctly)
notifies a user if a job is attempted that cannot be run on any node due
to the constraints filtering out all available nodes. This should still
behave as expected.
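The decision described above can be sketched in Go; the type and function names here are hypothetical, not Nomad's actual scheduler types:

```go
package main

import "fmt"

// placementResult tallies placements for one task group after the
// system-scheduler diff; the names are illustrative, not Nomad's.
type placementResult struct {
	failed  int // placements filtered out by constraints
	ignored int // allocs already running and left untouched
}

// isPlacementError treats constraint failures as a real error only
// when nothing for the task group is already running: at least one
// ignored placement means some node already satisfies the job.
func isPlacementError(r placementResult) bool {
	return r.failed > 0 && r.ignored == 0
}

func main() {
	// Initial run: nothing running yet, all placements fail => error.
	fmt.Println(isPlacementError(placementResult{failed: 2, ignored: 0})) // true
	// Later run: one node already runs the alloc => not an error.
	fmt.Println(isPlacementError(placementResult{failed: 2, ignored: 1})) // false
}
```

The job-cannot-run-anywhere case is still reported, because with no allocations running there are no ignored placements.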
* Add changelog entry
* Handle in-place updates for constrained system jobs
* Update .changelog/25850.txt
Co-authored-by: Piotr Kazmierczak <470696+pkazmierczak@users.noreply.github.com>
* Remove conditionals
---------
Co-authored-by: Piotr Kazmierczak <470696+pkazmierczak@users.noreply.github.com>
During the upgrade test we can trigger a re-render of the Vault secret due to
client restart before the allocrunner has marked the task as running, which
triggers the change mode on the template and restarts the task. This results in
a race where the alloc is still "pending" when we go to check it. We never
change the value of this secret in upgrade testing, so paper over this race
condition by setting a "noop" change mode.
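The fix lives in the jobspec's `template` block; the secret path, template data, and destination below are placeholders, and only the `change_mode` setting is the point:

```hcl
template {
  # Illustrative secret path and destination; the real upgrade-test
  # jobspec differs.
  data        = "{{ with secret \"secret/data/upgrade\" }}{{ .Data.data.value }}{{ end }}"
  destination = "local/secret.txt"

  # The secret's value never changes during upgrade testing, so a
  # spurious re-render must not restart the task.
  change_mode = "noop"
}
```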
We're required to pin Docker images for Actions to a specific SHA now and this
is tripping scans in the Enterprise repo. Update the actionlint image.
Ref: https://go.hashi.co/memo/sec-032
Nomad Enterprise users operating in air-gapped or otherwise secured environments
don't want to send license reporting metrics directly from their
servers. Implement manual/offline reporting by periodically recording usage
metrics snapshots in the state store, and providing an API and CLI by which
cluster administrators can download the snapshot for review and out-of-band
transmission to HashiCorp.
This is the CE portion of the work required for the implementation in the
Enterprise product. Nomad CE does not perform utilization reporting.
Ref: https://github.com/hashicorp/nomad-enterprise/pull/2673
Ref: https://hashicorp.atlassian.net/browse/NMD-68
Ref: https://go.hashi.co/rfc/nmd-210
This changeset includes several adjustments to the upgrade testing scripts to
reduce flakes and make problems more understandable:
* When a node is drained prior to the 3rd client upgrade, it's entirely
possible that the 3rd client to be upgraded is the drained node. This results in
miscounting the expected number of allocations because many of them will be
"complete" (service/batch) or "pending" (system). Leave the system jobs running
during drains and only count the running allocations at that point as the
expected set. Move the inline script that gets this count into a script file for
legibility.
* When the last initial workload is deployed, it's possible for it to be
briefly still in "pending" when we move to the next step. Poll for a short
window for the expected count of jobs.
* Make sure that any scripts that are being run right after a server or client
is coming back up can handle temporary unavailability gracefully.
* Change the debugging output of several scripts to avoid having the debug
output run into the error message (e.g. "some allocs are not running" made
the first running allocation look like it was the missing allocation).
* Add some notes to the README about running locally with `-dev` builds and
tagging a cluster with your own name.
Ref: https://hashicorp.atlassian.net/browse/NMD-162
The server startup could appear to hang, from an operator's point of
view, if a key loaded from the FSM at startup could not be decrypted
or replicated.
In order to prevent this happening, the server startup function
will now use a timeout to wait for the encrypter to be ready. If
the timeout is reached, the error is sent back to the caller which
fails the CLI command. The bubbled-up error is also flushed to the
logs, providing additional feedback to the operator.
Only keys loaded from the FSM snapshot and trailing logs matter before
the encrypter can be classed as ready. So that the encrypter ready
function does not get blocked by keys added outside of the initial
Raft load, we take a snapshot of the in-flight decryption tasks as we
enter the blocking call and use that set as our barrier.
New wrapped keys were added to the encrypter and tracked using
their keyID with the context cancelation function. This tracking
was performed primarily so the FSM could load its known key
objects and logs with entries for the same ID superseding existing
decryption tasks. This approach is hard to reason about and can, in
theory, cause timing problems in conjunction with the locking.
The new approach still tracks decryption tasks but does not store
the cancelation context. This context is now controlled within a
single function in an attempt to provide a clearer workflow. If two
calls for the same key are made in close succession, before there is an
entry in the keyring for the key, both tasks are launched. The first
past the post writes the cipher to the encrypter state; the second task
completes but does not write the cipher.
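The first-past-the-post write can be sketched as a check-then-set under a lock; the types and names below are hypothetical stand-ins for the encrypter state:

```go
package main

import (
	"fmt"
	"sync"
)

// cipherEntry stands in for a decrypted data-encryption key; the type
// and field names here are illustrative, not Nomad's.
type cipherEntry struct{ key []byte }

type encrypterState struct {
	mu      sync.Mutex
	ciphers map[string]cipherEntry
}

// storeCipher records the cipher for keyID only if no other decryption
// task got there first. When two tasks race for the same key, the first
// past the post writes; the loser completes without writing.
func (e *encrypterState) storeCipher(keyID string, c cipherEntry) bool {
	e.mu.Lock()
	defer e.mu.Unlock()
	if _, ok := e.ciphers[keyID]; ok {
		return false // another task already wrote this key's cipher
	}
	e.ciphers[keyID] = c
	return true
}

func main() {
	e := &encrypterState{ciphers: map[string]cipherEntry{}}
	fmt.Println(e.storeCipher("k1", cipherEntry{key: []byte("a")})) // true: first task wins
	fmt.Println(e.storeCipher("k1", cipherEntry{key: []byte("b")})) // false: second is a no-op
}
```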
* fix: wait for all allocs to be running before checking for their IDs after client upgrade
* style: linter fix
* fix: filter running allocs per client ID when checking for allocs after upgrade
The test for `nomad setup vault` command expects a specific `CreateIndex` for the
job it creates. Any Raft write when a server comes up or establishes leadership
can cause this test to break. Interpolate the expected index as we've done for
other indexes on the job to make this test less brittle.
Ref: https://github.com/hashicorp/nomad-enterprise/pull/2673#issuecomment-2847619747
When a Nomad server restores its state via a snapshot and logs, it is
possible a legacy wrapped key object/log is found. This key will not
contain any wrapped keys and should therefore be ignored within the
encrypter.
Without this change, it is theoretically possible that a key which
generates zero decrypt tasks supersedes a running task and registers
itself in the decrypt task tracker. Such a decrypt task has no running
work that would ever remove its entry.
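The guard amounts to skipping ingestion when a restored object carries no wrapped keys; the type and field names below are illustrative stand-ins, not Nomad's structs:

```go
package main

import "fmt"

// wrappedRootKeys mimics the shape of a key object restored from the
// FSM snapshot or trailing logs; names here are hypothetical.
type wrappedRootKeys struct {
	KeyID       string
	WrappedKeys [][]byte
}

// shouldIngest reports whether a restored key object should be handed
// to the encrypter. Legacy objects carry no wrapped keys; ingesting one
// would register a decrypt task with no work behind it and no way for
// its tracker entry to be removed.
func shouldIngest(k wrappedRootKeys) bool {
	return len(k.WrappedKeys) > 0
}

func main() {
	fmt.Println(shouldIngest(wrappedRootKeys{KeyID: "legacy"}))                           // false
	fmt.Println(shouldIngest(wrappedRootKeys{KeyID: "k2", WrappedKeys: [][]byte{{0x1}}})) // true
}
```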
The `CreateIndexAndIDTokenizer` creates a composite token by
combining the create index value and ID from the object with
a `.`. Tokens are then compared lexicographically. That comparison is
appropriate for the ID segment of the token, but not for the create
index segment: since create index values are ordered numerically,
comparing them lexicographically can cause unexpected results.
For example, comparing the token `12.object-id` to `102.object-id`
shows `12.object-id` as greater. That is lexicographically correct,
but wrong for the intention of the token: given how the token is
composed, the result should be that `12.object-id` is less.
The unexpected behavior can be seen when performing lists (such as
listing allocations). It is encountered inconsistently because two
requirements must be met:
1. Create index values with a large enough span (ex: 12 and 102)
2. A per-page value that yields a "bad" next token (ex: one prefixed
with 102)
To prevent the unexpected behavior, the target token is split
and the components are used individually to compare against the
object.
Fixes #25435
Jobs were being marked incorrectly as having paused allocations when
terminal allocations were marked with the paused boolean. The UI
should only mark a job as including paused allocations when those
paused allocations are in the correct client state, which is pending.
---------
Co-authored-by: Phil Renaud <phil@riotindustries.com>