nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-06 18:35:44 +03:00

Author	SHA1	Message	Date
Tim Gross	3690a0118e	build: update go toolchain to 1.24.3 (#25818 )	2025-05-07 09:57:31 -04:00
James Rasell	296d03d9dd	encrypter: Remove tracking of cancelation for decrypt tasks. (#25795 ) New wrapped keys were added to the encrypter and tracked using their keyID with the context cancelation function. This tracking was performed primarily so the FSM could load its known key objects and logs with entries for the same ID superseding existing decryption tasks. This is a hard to reason about approach and in theory can cause timing problems in conjunction with the locking. The new approach still tracks decryption tasks but does not store the cancelation context. This context is now controlled within a single function in an attempt to provide a clearer workflow. In the event two calls for the same key are made in close succession meaning there is no entry in the keyring for the key yet, all tasks will be launched. The first-past-the-post will write the cipher to encrypter state, the second task will complete but not write the cipher.	2025-05-07 14:35:24 +01:00
Juana De La Cuesta	cb09696b1c	Nojira upgrade3 (#25817 ) * fix: typo * fix: correct the script for unbound var * fix: typo * fix: typo	2025-05-06 18:21:33 +02:00
Juana De La Cuesta	f68203549b	Fix the verify allocs, missing `echo` (#25816 ) * fix: typo * fix: correct the script for unbound var * fix: typo	2025-05-06 17:16:56 +02:00
Juana De La Cuesta	42d4067d55	Nojira upgrade3 (#25815 ) * fix: typo * fix: correct the script for unbound var	2025-05-06 16:57:44 +02:00
Juana De La Cuesta	da0ea9935d	fix: typo (#25814 )	2025-05-06 16:44:25 +02:00
Juana De La Cuesta	22921418b6	Check for allocs running before checking for IDs after a client upgrade (#25790 ) * fix: wait for all allocs to be running before checking for their IDs after client upgrade * style: linter fix * fix: filter running allocs per client ID when checking for allocs after upgrade	2025-05-06 16:22:45 +02:00
dependabot[bot]	242ee16c81	chore(deps): bump github.com/docker/docker (#25810 ) Bumps [github.com/docker/docker](https://github.com/docker/docker) from 28.0.4+incompatible to 28.1.1+incompatible. - [Release notes](https://github.com/docker/docker/releases) - [Commits](https://github.com/docker/docker/compare/v28.0.4...v28.1.1) --- updated-dependencies: - dependency-name: github.com/docker/docker dependency-version: 28.1.1+incompatible dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-05-05 09:04:32 -04:00
Tim Gross	da592ab1b7	testing: fix vault setup test's reliance on specific Raft index (#25806 ) The test for `nomad setup vault` command expects a specific `CreateIndex` for the job it creates. Any Raft write when a server comes up or establishes leadership can cause this test to break. Interpolate the expected index as we've done for other indexes on the job to make this test less brittle. Ref: https://github.com/hashicorp/nomad-enterprise/pull/2673#issuecomment-2847619747	2025-05-02 14:30:10 -04:00
James Rasell	21fd0bbb8a	ci: Regenerate TLS certificates used for testing. (#25804 )	2025-05-02 13:51:02 +01:00
James Rasell	449da5bc11	deps: Update mitchellh/colorstring to d06e56a500db (#25801 )	2025-05-02 11:30:41 +01:00
James Rasell	01cd762d27	encrypter: Ignore wrapped key additions with zero wrapped keys. (#25791 ) When a Nomad server restores its state via a snapshot and logs, it is possible a legacy wrapped key object/log is found. This key will not contain any wrapped keys and therefore should be ignored within the encrypter. It is theoretically possible without this change that a key which generates zero decrypt tasks supersedes a running task and will place itself in the tracked decrypt task tracker. This decrypt task has no running work to remove its entry.	2025-05-01 14:54:41 +01:00
Juana De La Cuesta	dfc1412e22	Merge pull request #25721 from hashicorp/NMD-321-reload Force an agent return if there is an error on reload	2025-05-01 14:43:08 +02:00
dependabot[bot]	f54804c16b	chore(deps): bump github.com/miekg/dns from 1.1.64 to 1.1.65 (#25766 )	2025-05-01 07:40:09 +01:00
Chris Roberts	a69baeea8c	Merge pull request #25792 from hashicorp/b-pagination-tkn paginator: fix tokenizer comparison of composite index and ID	2025-04-30 13:14:11 -07:00
Chris Roberts	ba1683f40e	Update wording in the changelog entry	2025-04-30 11:17:19 -07:00
Chris Roberts	db360fc085	paginator: fix tokenizer comparison of composite index and ID The `CreateIndexAndIDTokenizer` creates a composite token by combining the create index value and ID from the object with a `.`. Tokens are then compared lexicographically. The comparison is appropriate for the ID segment of the token, but it is not for the create index segment. Since the create index values are stored with numeric ordering, using a lexicographical comparison can cause unexpected results. For example, when comparing the token `12.object-id` to `102.object-id` the result will show `12.object-id` being greater. This is the correct lexicographical comparison but it is incorrect for the intention of the token. With the knowledge of the composition of the token, the response should be that `12.object-id` is less. The unexpected behavior can be seen when performing lists (like listing allocations). The behavior is encountered inconsistently due to two requirements which must be met: 1. Create index values with a large enough span (ex: 12 and 102) 2. Correct per page value to get a "bad" next token (ex: prefix with 102) To prevent the unexpected behavior, the target token is split and the components are used individually to compare against the object. Fixes #25435	2025-04-30 09:51:24 -07:00
Juana De La Cuesta	dcaa96f0e5	Update website/content/docs/upgrade/upgrade-specific.mdx Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>	2025-04-30 15:03:49 +02:00
zouyu1026	18e508ff05	Update index.mdx (#25755 ) The old link https://caravanproject.io/ points to a gambling website. Update it to the GitHub wiki.	2025-04-30 06:22:29 -05:00
Juana De La Cuesta	e8fb36f4d3	Style: typo	2025-04-30 13:01:57 +02:00
Juanadelacuesta	9288a3141a	func and docs: Use the config from the client and not from the agent that is already parsed. Add the breaking change to the release notes	2025-04-30 10:53:02 +02:00
Tu Nguyen	bee2400958	update iframe to videoembed (#25783 )	2025-04-29 10:58:04 -05:00
Aimee Ukasick	4075b0b8ba	Docs: Add garbage collection page (#25715 ) * add garbage collection page * finish client; add resources section * finish server section; task driver section * add front matter description * fix typos * Address Tim's feedback	2025-04-28 08:37:23 -05:00
Adrian Todorov	a4dd1c962e	docs: Update Nvidia device driver docs to link to list of supported cards and newer versions (#25531 ) Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>	2025-04-28 08:32:58 +01:00
dependabot[bot]	4be69dddd4	chore(deps): bump github.com/hashicorp/vault/api from 1.15.0 to 1.16.0 (#25763 )	2025-04-28 07:49:49 +01:00
dependabot[bot]	71065af720	chore(deps): bump github.com/hashicorp/go-discover (#25764 )	2025-04-28 07:14:28 +01:00
Piotr Kazmierczak	3e688cf928	acl: add missing JWT auth method validation (#25757 )	2025-04-25 14:53:25 +02:00
Piotr Kazmierczak	32ca833c70	client: unflake TestClient_ACL_ResolveToken_InvalidClaims (#25758 )	2025-04-25 14:53:09 +02:00
James Rasell	e928131482	ui: Only show paused icon when allocs in pending state are paused. (#25742 ) Jobs were being marked incorrectly as having paused allocations when terminal allocations were marked with the paused boolean. The UI should only mark a job as including paused allocations when these paused allocations are in the correct client state, which is pending. --------- Co-authored-by: Phil Renaud <phil@riotindustries.com>	2025-04-25 07:45:45 +01:00
Tim Gross	374e987b9b	metrics: emit cache and rss stats on cgroup v2 (#25751 ) In cgroups v2, a different map of memory stats is available from the kernel than in v1. The Docker API reflects this change. But there are equivalent values in the map for RSS (anonymously mapped memory) and cache (filesystem cache and tmpfs), which the Docker driver is not currently emitting. Fallback to these alternate values when the cgroups v1 values are not available. Include the anonymous mapping in the "measured" allocation stats as "RSS" so that they both show up in allocation metrics. We can do this on both the `docker` driver and the Linux executor for `exec` and `java` drivers. Fixes: https://github.com/hashicorp/nomad/issues/19185 Ref: https://hashicorp.atlassian.net/browse/NMD-437 Ref: https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#memory-interface-files Ref: https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt	2025-04-24 12:48:18 -04:00
Matt McQuillan	bd12d55eae	Jira - GH Sync Updates updating fields for jira move/sync	2025-04-24 11:47:31 -04:00
Matt McQuillan	74b98a6e9b	updating fields for jira move/sync	2025-04-24 11:27:42 -04:00
Tim Gross	c7cb49f205	testing: fix a panic in docker stats collection test (#25747 ) When the context closes, the stats emitter closes its channel. It's possible for the channel to be closed in the stats emitter goroutine before the `select` in the test sees that the context has closed, which can result in a panic in the test when we try to read the empty value off the channel.	2025-04-24 10:41:03 -04:00
Tim Gross	1e744db38e	refactor alloc drain to make intent more clear (#25731 ) While working on #25726, I found a method in the drainer code that creates a map of job IDs to allocations. At first glance this looks like a bug because it effectively de-duplicates the allocations per job. But the consumer of the map is only concerned with jobs, not allocations, and simply reads the job off the allocation. Refactor this to make it obvious we're looking at the job. Ref: https://github.com/hashicorp/nomad/pull/25726	2025-04-24 09:54:44 -04:00
Tim Gross	5208ad4c2c	scheduler: allow canaries to be migrated on node drain (#25726 ) When a node is drained that has canaries that are not yet healthy, the canaries may not be properly migrated and the deployment will halt. This happens only if there are more than `migrate.max_parallel` canaries on the node and the canaries are not yet healthy (ex. they have a long `update.min_healthy_time`). In this circumstance, the first batch of canaries are marked for migration by the drainer correctly. But then the reconciler counts these migrated canaries against the total number of expected canaries and no longer progresses the deployment. Because an insufficient number of allocations have reported they're healthy, the deployment cannot be promoted. When the reconciler looks for canaries to cancel, it leaves in the list any canaries that are already terminal (because there shouldn't be any work to do). But this ends up skipping the creation of a new canary to replace terminal canaries that have been marked for migration. Add a conditional for this case to cause the canary to be removed from the list of active canaries so we can replace it. Ref: https://hashicorp.atlassian.net/browse/NMD-560 Fixes: https://github.com/hashicorp/nomad/issues/17842	2025-04-24 09:24:28 -04:00
Piotr Kazmierczak	3ad0df71a8	docker: correct stat response for rss, cache and swap memory in cgroups v1 (#25741 ) #25138 refactoring accidentally removed some of the memory stats that weren't available as concrete types in containerapi.	2025-04-24 15:17:56 +02:00
Tim Gross	4d7ed88a8d	testing: use Docker Hub registry mirror for additional tests (#25733 ) This image was missed in https://github.com/hashicorp/nomad/pull/25703 and is resulting in rate limiting in tests.	2025-04-24 08:50:32 -04:00
James Rasell	4b40e10e68	e2e: Update UI playwright version to 1.52.0 (#25740 )	2025-04-24 13:38:26 +01:00
James Rasell	717207bce0	e2e: Fix TestDocker/testRedis with increased timeout on deployment (#25739 ) The fresh deployment of the Redis job took around 20s which is also the default context timeout on the e2e util that monitors and waits for a deployment to complete. The tight timing meant the test often timed out but sometimes would complete successfully. Increasing the timeout for this deployment will remove the flakiness.	2025-04-24 09:09:33 +01:00
Juanadelacuesta	949571e313	func: read the config from the agent, don't reparse	2025-04-24 05:01:53 +02:00
Juana De La Cuesta	4b95517734	Update .changelog/25721.txt Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2025-04-24 04:54:38 +02:00
Juanadelacuesta	46343ee56e	func: use the client's configured drain deadline to calculate the graceful timeout when terminating an agent	2025-04-23 23:59:50 +02:00
Juanadelacuesta	c91f24681d	style: add changelog	2025-04-23 23:28:54 +02:00
Juana De La Cuesta	9778a31e29	Update command/agent/command.go Co-authored-by: Tim Gross <tgross@hashicorp.com>	2025-04-23 23:18:09 +02:00
Juana De La Cuesta	39b3d63172	Update command/agent/command.go Co-authored-by: Tim Gross <tgross@hashicorp.com>	2025-04-23 23:18:02 +02:00
Juana De La Cuesta	313f430fdd	Update command/agent/command.go Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2025-04-23 23:17:36 +02:00
Matt McQuillan	9a30372426	Testing Revised Jira Fields to get Jira/GH integration working Testing Revised Jira Fields	2025-04-23 16:12:58 -04:00
Matt McQuillan	2b437fd733	Fixing ordering and ending bracket of extraFields	2025-04-23 15:59:27 -04:00
Matt McQuillan	1754fb1ed8	Update .github/workflows/jira-sync.yml Co-authored-by: Tim Gross <tgross@hashicorp.com>	2025-04-23 15:53:35 -04:00
Matt McQuillan	d9b0fdcb8e	Testing Revised Jira Fields	2025-04-23 15:39:58 -04:00

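The panic described in commit c7cb49f205 above comes from a general Go behavior: receiving from a closed channel does not block or panic, it yields the zero value, which for a pointer type is `nil`. The sketch below shows the two-value receive form that distinguishes a real value from a closed channel. The `stats` type and `drain` function are hypothetical and are not taken from the Nomad test; how the actual fix was written is not stated in the commit message.

```go
package main

import "fmt"

// stats is a hypothetical stand-in for a usage sample sent by an emitter.
type stats struct {
	cpuPercent float64
}

// drain reads samples until the channel is closed and returns how many it
// saw. The "ok" flag is false once the channel is closed and empty, so the
// nil zero value is never dereferenced.
func drain(ch <-chan *stats) int {
	count := 0
	for {
		s, ok := <-ch
		if !ok {
			return count
		}
		_ = s.cpuPercent // safe: s came from a real send
		count++
	}
}

func main() {
	ch := make(chan *stats, 2)
	ch <- &stats{cpuPercent: 1.5}
	ch <- &stats{cpuPercent: 2.5}
	close(ch)

	fmt.Println(drain(ch)) // 2

	// A further receive on the closed channel yields the zero value.
	s, ok := <-ch
	fmt.Println(s == nil, ok) // true false
}
```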

27056 Commits