The NUMA topology struct field `NodeIDs` is an `idset.Set`, which has no public
members. As a result, this field is never serialized via msgpack and persisted
in state. When `numa.affinity = "prefer"`, the scheduler dereferences this nil
field and panics the scheduler worker.
Ideally we would fix this by adding a msgpack serialization extension, but
because the field already exists and has always been empty on the wire, doing so
would break RPC compatibility across upgrades. Instead, create a new field that's
populated at the same time we populate the more useful `idset.Set`, and
repopulate the set on demand.
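To make the dual-representation idea concrete, here's a minimal sketch; the `Topology`, `SetNodes`, and `Nodes` names (and the map-based `set` stand-in for `idset.Set`) are illustrative assumptions, not Nomad's real types:

```go
package numa

// set is a simplified stand-in for Nomad's idset.Set, which has no exported
// members and is therefore silently dropped by msgpack.
type set map[uint32]struct{}

// Topology sketches the dual-representation workaround; the names here are
// hypothetical, not the real Nomad struct.
type Topology struct {
	// nodeIDs is the convenient set form, but it never survives serialization.
	nodeIDs set

	// NodeIDsSlice is a plain exported slice that msgpack can serialize; it is
	// written whenever nodeIDs is populated.
	NodeIDsSlice []uint32
}

// SetNodes populates both representations at the same time.
func (t *Topology) SetNodes(ids []uint32) {
	t.NodeIDsSlice = ids
	t.nodeIDs = make(set, len(ids))
	for _, id := range ids {
		t.nodeIDs[id] = struct{}{}
	}
}

// Nodes rebuilds the set on demand after a msgpack round-trip dropped it,
// avoiding the nil dereference that panicked the scheduler worker.
func (t *Topology) Nodes() set {
	if t.nodeIDs == nil && t.NodeIDsSlice != nil {
		t.SetNodes(t.NodeIDsSlice)
	}
	return t.nodeIDs
}
```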
Fixes: https://hashicorp.atlassian.net/browse/NET-9924
Update the Consul/Vault build downloader functions so that we include the
current prerelease build (if any) in the E2E compatibility testing we run on each
PR. This will automatically cycle out when the GA build is released, because
that build is "higher" in the sorted set.
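As a rough illustration of why the prerelease cycles out on its own, here's a hedged sketch assuming the downloader orders candidates with semver-style sorting (shown here with `hashicorp/go-version`, which may or may not be the exact library the downloader uses):

```go
package main

import (
	"fmt"
	"sort"

	version "github.com/hashicorp/go-version"
)

func main() {
	// Candidate builds as the downloader might see them while an RC is out.
	raw := []string{"1.18.2", "1.19.0-rc1", "1.18.1"}

	vs := make(version.Collection, 0, len(raw))
	for _, r := range raw {
		v, err := version.NewVersion(r)
		if err != nil {
			panic(err)
		}
		vs = append(vs, v)
	}

	sort.Sort(vs)
	fmt.Println("newest:", vs[len(vs)-1]) // 1.19.0-rc1 while no GA build exists

	// Once 1.19.0 GA ships, it sorts above its own prerelease, so the RC
	// naturally drops out of the "newest" slot.
	ga, _ := version.NewVersion("1.19.0")
	fmt.Println("GA newer than RC:", ga.GreaterThan(vs[len(vs)-1])) // true
}
```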
The `testing.go` test helpers file for the driver manager initializes the NUMA
scan as a package-global variable. This causes the scan to be pulled into
production builds, so even running commands like `nomad version` triggers the
NUMA scan. Move the scan into the test helper setup.
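Roughly the shape of that change, with `scanTopology`, `Topology`, and `testSetup` as hypothetical stand-ins for the real driver manager helpers:

```go
package manager

import "testing"

// Topology and scanTopology are hypothetical stand-ins for the NUMA types and
// scanner used by the driver manager.
type Topology struct{ Nodes []uint32 }

func scanTopology() *Topology { return &Topology{Nodes: []uint32{0}} }

// Before the change, a package-global like
//
//	var topology = scanTopology()
//
// ran the scan at init time in every binary that linked the package, even for
// `nomad version`. Moving it into the test helper defers it to test setup.
func testSetup(t *testing.T) *Topology {
	t.Helper()
	return scanTopology()
}
```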
When an allocation fails it triggers an evaluation. The evaluation is processed
and the scheduler sees it needs to reschedule, which triggers a follow-up
eval. The follow-up eval creates a plan to `(stop 1) (place 1)`. The replacement
alloc has a `RescheduleTracker` (or gets its `RescheduleTracker` updated).
But in the case where the follow-up eval can't place all allocs (there aren't
enough resources), it can create a partial plan to `(stop 1) (place 0)`. It then
creates a blocked eval. The plan applier stops the failed alloc. Then when the
blocked eval is processed, the job is missing an allocation, so the scheduler
creates a new allocation. This allocation is _not_ a replacement from the
perspective of the scheduler, so it doesn't get handed a `RescheduleTracker`.
This changeset fixes this by annotating the reschedule tracker whenever the
scheduler can't place a replacement allocation. We check this annotation for
allocations that have the `stop` desired status when filtering out allocations
to pass to the reschedule tracker. I've also included tests that cover this case
and expand coverage of the relevant area of the code.
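A very rough sketch of the filtering idea follows; the `alloc` struct and field names here are hypothetical stand-ins, not Nomad's actual scheduler types:

```go
package sketch

// alloc is a stand-in for a Nomad allocation; the fields are hypothetical.
type alloc struct {
	DesiredStatus       string
	ClientStatus        string
	RescheduleAnnotated bool // set when the scheduler stops an alloc it could not replace
}

// previousForReschedule picks which existing allocations should hand their
// reschedule history to a new placement. Stopped allocs are normally skipped,
// unless they carry the "could not place a replacement" annotation.
func previousForReschedule(allocs []*alloc) []*alloc {
	var out []*alloc
	for _, a := range allocs {
		if a.DesiredStatus == "stop" && !a.RescheduleAnnotated {
			continue // stopped and already replaced; nothing to carry forward
		}
		if a.ClientStatus == "failed" || a.RescheduleAnnotated {
			out = append(out, a)
		}
	}
	return out
}
```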
Fixes: https://github.com/hashicorp/nomad/issues/12147
Fixes: https://github.com/hashicorp/nomad/issues/17072
This PR fixes a bug where the Nomad client would leave behind an empty directory
created on behalf of tasks making use of the unveil filesystem isolation
mode (i.e. using the exec2 task driver). Once unmounting is complete, we now also
delete the directory.
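The cleanup ends up looking roughly like this Linux-only sketch; `cleanupUnveilDir` is a hypothetical name and the real client code handles paths and errors in more detail:

```go
package sketch

import (
	"os"
	"syscall"
)

// cleanupUnveilDir unmounts the per-task directory created for the unveil
// isolation mode and then removes the now-empty directory itself.
func cleanupUnveilDir(dir string) error {
	if err := syscall.Unmount(dir, 0); err != nil && err != syscall.EINVAL {
		// EINVAL typically means it was not mounted; anything else is fatal.
		return err
	}
	// Previously the directory was left behind after unmounting.
	return os.RemoveAll(dir)
}
```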
Fixes #22433
We no longer intend to release 32-bit builds for any platform. We'd previously
removed the builds for i386 on both Linux and Windows, but never got around to
removing the 32-bit ARM builds. Add a note about this deprecation in the release notes
for 1.8.x.
Update the documentation for the `spread` block:
* Make it clear that when the `spread` block is omitted, the default behavior
within a given job is to spread allocs among feasible nodes.
* Describe the difference between the `spread` block and `spread` scheduler
algorithm.
* Add warnings about the performance impact of using `spread` and how to
mitigate it.
Update Go toolchain to 1.22.4, which addresses two vulnerabilities in the Go
stdlib.
* CVE-2024-24789: impacts handling of certain types of invalid zip files, which
could be exploited to create a zip file with unexpected contents. This could
potentially impact Nomad users of `artifact` blocks who download untrusted
artifacts.
* CVE-2024-24790: impacts parsing of IPv4-mapped IPv6 addresses.
Our documentation for the `node drain` command doesn't include a treatment of
batch jobs, which are not migrated. The user is left to piece this behavior
together from the `migrate` documentation and the tutorial. Instead, let's
explicitly list the behaviors per job type.
Fixes: https://github.com/hashicorp/nomad/issues/17563
Some of our scheduler tests use the `AllocName` function from the structs
package incorrectly. This function should always receive the `Job.ID` and not
the `Job.Name`. Fix this to prevent copy-pasted usage from introducing future bugs.
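For reference, a sketch of the right and wrong call, assuming the helper's usual `AllocName(jobID, group, index)` shape and its `job.group[index]` output format:

```go
package scheduler_test

import (
	"testing"

	"github.com/hashicorp/nomad/nomad/structs"
)

func TestAllocNameUsage(t *testing.T) {
	job := &structs.Job{ID: "example", Name: "Example Service"}

	// Wrong: the human-readable name is not what alloc names are keyed on.
	_ = structs.AllocName(job.Name, "cache", 0)

	// Right: alloc names are always derived from the job ID,
	// e.g. "example.cache[0]".
	got := structs.AllocName(job.ID, "cache", 0)
	if got != "example.cache[0]" {
		t.Fatalf("unexpected alloc name: %s", got)
	}
}
```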
This PR fixes the value of NOMAD_SECRETS_DIR to be the alloc_mounts
secrets directory instead of the real secrets directory, which is owned by
root with mode 0700 even when running tests.
Needed for https://github.com/hashicorp/nomad-driver-exec2/issues/29
When a Consul Connect sidecar service is defined with multiple
`local_bind_socket_path` upstreams, validation would fail because duplicate
socket address bindings on `:0` were detected.
Validate `local_bind_socket_path` sockets separately from IP address
sockets.
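The fix boils down to deduplicating Unix socket upstreams separately from host:port upstreams, roughly like this sketch (the `upstream` struct and `validateUpstreams` function are hypothetical, not the real validation code):

```go
package sketch

import "fmt"

// upstream is a stand-in for a Connect sidecar upstream definition.
type upstream struct {
	LocalBindAddress    string
	LocalBindPort       int
	LocalBindSocketPath string
}

// validateUpstreams checks for duplicate bindings, keeping socket-path
// upstreams out of the IP/port bucket so they no longer all collide on ":0".
func validateUpstreams(ups []upstream) error {
	seenAddrs := map[string]bool{}
	seenSockets := map[string]bool{}

	for _, u := range ups {
		if u.LocalBindSocketPath != "" {
			if seenSockets[u.LocalBindSocketPath] {
				return fmt.Errorf("duplicate socket path %q", u.LocalBindSocketPath)
			}
			seenSockets[u.LocalBindSocketPath] = true
			continue
		}
		addr := fmt.Sprintf("%s:%d", u.LocalBindAddress, u.LocalBindPort)
		if seenAddrs[addr] {
			return fmt.Errorf("duplicate address binding %q", addr)
		}
		seenAddrs[addr] = true
	}
	return nil
}
```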
* e2e: add tests for exec2 task driver
* e2e: use envoy 1.29.4 because consul
* e2e: add a bridge networking http test for exec driver
* e2e: split up http test so curl always starts after the server
* Adds a badge on the jobs index page if any task within any allocation of a running job is currently paused
* Snapshot and acceptance tests for paused states
* Cleared yarn cache
* Remove MirageScenario from the test dependency chain
* Logging before toString
* Cardinal sin of time-based test execution
* Maybe we've been lucky for years and the clientStatus has always been running for this test by happenstance
* Back away from the time-based and toward the settled() approach
This allows users to set a custom number of attempts to purge an existing
(not running) container if one is found during task creation.
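Conceptually the driver retries the purge up to the configured count, along these lines (a hedged sketch with a hypothetical `remove` callback, not the docker driver's actual code):

```go
package sketch

import (
	"fmt"
	"time"
)

// purgeExistingContainer tries to remove a leftover (not running) container
// up to attempts times before giving up and failing task creation.
func purgeExistingContainer(attempts int, remove func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = remove(); err == nil {
			return nil
		}
		time.Sleep(time.Second) // brief backoff between attempts
	}
	return fmt.Errorf("failed to purge container after %d attempts: %w", attempts, err)
}
```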
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
When logging into a JWT auth method, we need to explicitly supply the Consul
admin partition if the local Consul agent is in a partition. We can't derive
this from Nomad's agent configuration because the Consul agent's own
configuration is canonical, so instead we get the partition from the fingerprint (if
available). This changeset updates the Consul client constructor so that we
close over the partition from the fingerprint.
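Roughly, the constructor now captures the fingerprinted partition so later JWT logins can pass it along; the names below are hypothetical and the real client code is more involved:

```go
package sketch

// consulClient sketches a Nomad-side Consul client that closes over the
// admin partition reported by the fingerprinter.
type consulClient struct {
	partition string // empty when the local Consul agent is not in a partition
}

// newConsulClient is handed the partition from the fingerprint rather than
// reading it from agent configuration.
func newConsulClient(fingerprintedPartition string) *consulClient {
	return &consulClient{partition: fingerprintedPartition}
}

// loginJWT shows where the partition gets supplied explicitly on login.
func (c *consulClient) loginJWT(authMethod, jwt string) map[string]string {
	req := map[string]string{
		"AuthMethod":  authMethod,
		"BearerToken": jwt,
	}
	if c.partition != "" {
		req["Partition"] = c.partition
	}
	return req
}
```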
Ref: https://hashicorp.atlassian.net/browse/NET-9451