The go-metrics library retains Prometheus metrics in memory until expiration,
but the expiration logic requires that the metrics are being regularly
scraped. If you don't have a Prometheus server scraping, this leads to
ever-increasing memory usage. In particular, high-volume dispatch workloads
emit a large set of label values, and if these are not eventually aged out,
the bulk of Nomad server memory can end up consumed by metrics.
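As a rough, self-contained illustration of the failure mode (not the actual
go-metrics code), the sketch below models a sink whose expiry sweep only runs
inside the scrape path; without anything calling `Collect`, stale label sets
are never removed:
```
package main

import (
	"fmt"
	"time"
)

// series is a stand-in for one metric sample plus its label-set timestamp.
type series struct {
	value     float64
	updatedAt time.Time
}

// sink is a toy in-memory sink keyed by flattened label set.
type sink struct {
	expiration time.Duration
	gauges     map[string]series
}

// SetGauge records a sample; every new label combination adds a map entry.
func (s *sink) SetGauge(key string, v float64) {
	s.gauges[key] = series{value: v, updatedAt: time.Now()}
}

// Collect is the scrape path, and it is also the only place expired series
// are swept; with no Prometheus server scraping, the map only ever grows.
func (s *sink) Collect() []float64 {
	out := make([]float64, 0, len(s.gauges))
	for k, g := range s.gauges {
		if time.Since(g.updatedAt) > s.expiration {
			delete(s.gauges, k) // aged out only because something scraped
			continue
		}
		out = append(out, g.value)
	}
	return out
}

func main() {
	s := &sink{expiration: time.Minute, gauges: map[string]series{}}
	// high-volume dispatch: every dispatched job contributes new label values
	for i := 0; i < 10000; i++ {
		s.SetGauge(fmt.Sprintf("nomad.job_summary.queued;dispatch-%d", i), 1)
	}
	fmt.Println("series held in memory:", len(s.gauges)) // never shrinks without a scrape
}
```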
Nomad clients store their Nomad identity in memory and within
their state store. While the client is running, it is not
possible to dump the state to view the stored identity token,
so having a way to view the current claims at runtime aids
debugging and operations.
This change adds a client identity workflow, allowing operators
to view the current claims of the node's identity. It does not
return any of the signing key material.
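As an illustration only (this is not the new Nomad command, and the token and
claim names below are made up), viewing the claims amounts to decoding the
payload segment of the JWT, which never requires the signing key material:
```
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
)

// printClaims decodes the middle (claims) segment of a JWT. No signature
// verification happens here, so no signing key material is needed or exposed.
func printClaims(token string) error {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return fmt.Errorf("malformed JWT: expected 3 segments, got %d", len(parts))
	}
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return fmt.Errorf("decoding claims: %w", err)
	}
	var claims map[string]any
	if err := json.Unmarshal(payload, &claims); err != nil {
		return fmt.Errorf("parsing claims: %w", err)
	}
	for k, v := range claims {
		fmt.Printf("%s = %v\n", k, v)
	}
	return nil
}

func main() {
	// a made-up, unsigned example token; real node identity tokens are signed JWTs
	example := "eyJhbGciOiJub25lIn0." +
		base64.RawURLEncoding.EncodeToString([]byte(`{"example_node_id":"abc123","example_node_pool":"default"}`)) +
		"."
	if err := printClaims(example); err != nil {
		fmt.Println("error:", err)
	}
}
```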
* e2e: update standalone envoy binary version
fix for:
> === FAIL: e2e/exec2 TestExec2/testCountdash (21.25s)
> exec2_test.go:71:
> ...
> [warning][config] [./source/extensions/config_subscription/grpc/grpc_stream.h:155] DeltaAggregatedResources gRPC config stream to local_agent closed: 3, Envoy 1.29.4 is too old and is not supported by Consul
there's also this warning, but it doesn't seem so fatal:
> [warning][main] [source/server/server.cc:910] There is no configured limit to the number of allowed active downstream connections. Configure a limit in `envoy.resource_monitors.downstream_connections` resource monitor.
picked the latest supported Envoy from the latest Consul (1.21.4):
```
$ curl -s localhost:8500/v1/agent/self | jq .xDS.SupportedProxies
{
  "envoy": [
    "1.34.1",
    "1.33.2",
    "1.32.5",
    "1.31.8"
  ]
}
```
* e2e: exec2: remove extraneous bits
* reschedule: no reschedule for batch jobs
* unveil: nomad paths get auto-unveiled with unveil_defaults
https://github.com/hashicorp/nomad-driver-exec2/blob/v0.1.0/plugin/driver.go#L514-L522
For a while now, we've had only two implementations of the Planner interface in
Nomad: one was the Worker, and the other was the scheduler test harness, which
was then used as an argument to the scheduler constructors in the FSM and job
endpoint RPC. That's not great, and one of the recent refactors made it
apparent that we're importing testing code in places we really shouldn't. We
finally got called out for it, and this PR attempts to remedy the situation by
splitting the Harness into Plan (which contains the actual plan submission
logic) and separating it from the testing code.
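As a rough sketch of the shape of that split, with simplified names rather
than the actual Nomad types: the interface keeps only what the schedulers
need, the production type owns plan submission, and the test harness wraps it
instead of being the implementation handed to constructors:
```
package scheduler

// Planner is the narrow interface the schedulers program against.
// (Simplified sketch; the real interface has more methods and richer types.)
type Planner interface {
	SubmitPlan(plan *Plan) (*PlanResult, error)
	CreateEval(eval *Evaluation) error
}

// Plan, PlanResult, and Evaluation are stand-ins for the real structs.
type Plan struct{}
type PlanResult struct{}
type Evaluation struct{}

// planSubmitter holds the actual plan submission logic; production code
// (FSM, job endpoint RPC) constructs schedulers with this directly.
type planSubmitter struct{}

func (p *planSubmitter) SubmitPlan(plan *Plan) (*PlanResult, error) { return &PlanResult{}, nil }
func (p *planSubmitter) CreateEval(eval *Evaluation) error          { return nil }

// testHarness lives with the test helpers and only wraps the submitter with
// bookkeeping, so non-test code never has to import testing code.
type testHarness struct {
	planSubmitter
	SubmittedPlans []*Plan
}

func (h *testHarness) SubmitPlan(plan *Plan) (*PlanResult, error) {
	h.SubmittedPlans = append(h.SubmittedPlans, plan)
	return h.planSubmitter.SubmitPlan(plan)
}

// both satisfy the interface
var (
	_ Planner = (*planSubmitter)(nil)
	_ Planner = (*testHarness)(nil)
)
```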
* fix panic from nil ReschedulePolicy
commit 279775082c (pr #26279)
intended to return an error for sysbatch jobs with a reschedule block,
but by skipping the code that populates the `ReschedulePolicy`'s
pointer fields, it introduced a nil pointer panic that occurred before
the job could be rejected with the intended error.
in particular, in `command/agent/job_endpoint.go`, `func ApiTgToStructsTG`,
```
if taskGroup.ReschedulePolicy != nil {
    tg.ReschedulePolicy = &structs.ReschedulePolicy{
        Attempts: *taskGroup.ReschedulePolicy.Attempts,
        Interval: *taskGroup.ReschedulePolicy.Interval,
```
`*taskGroup.ReschedulePolicy.Interval` was a nil pointer.
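A minimal sketch of the defensive pattern (not necessarily the exact fix that
landed): dereference optional API pointers only through a nil check, for
example via a small helper that falls back to the zero value:
```
package main

import (
	"fmt"
	"time"
)

// derefOr returns the pointed-to value, or a fallback when the pointer is
// nil. It is a generic stand-in for the kind of guard the conversion needs.
func derefOr[T any](p *T, fallback T) T {
	if p == nil {
		return fallback
	}
	return *p
}

// apiReschedulePolicy mirrors the API struct shape: optional pointer fields.
type apiReschedulePolicy struct {
	Attempts *int
	Interval *time.Duration
}

// reschedulePolicy mirrors the internal struct shape: concrete values.
type reschedulePolicy struct {
	Attempts int
	Interval time.Duration
}

func main() {
	// Attempts and Interval were never defaulted, as in the sysbatch case.
	in := &apiReschedulePolicy{}

	// Dereferencing in.Attempts or in.Interval directly would panic here;
	// the guarded conversion does not.
	out := reschedulePolicy{
		Attempts: derefOr(in.Attempts, 0),
		Interval: derefOr(in.Interval, 0),
	}
	fmt.Printf("%+v\n", out)
}
```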
* fix e2e test jobs
In #8099 we fixed a bug where garbage collecting a job with
`disconnect.stop_on_client_after` would spawn recursive delayed evals. But when
applied to disconnected allocs with `replace=true`, the fix prevents us from
emitting a blocked eval if there's no room for the replacement.
Update the guard on creating blocked evals so that rather than checking for
`IsZero`, we check whether the current time is later than the `WaitUntil`
timestamp. This separates the guard from the logic guarding the creation of
delayed evals, so that we can potentially create both when needed.
Ref: https://github.com/hashicorp/nomad/pull/8099/files#r435198418
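To sketch the guard change with made-up names (the real logic uses Nomad's
eval structs): checking `IsZero` conflates 'no wait time set' with 'wait time
already passed', while comparing against the current time lets a blocked eval
be created even when `WaitUntil` is set but already behind us:
```
package main

import (
	"fmt"
	"time"
)

// shouldCreateBlockedEval is a toy version of the guard. The old form only
// allowed a blocked eval when WaitUntil was unset; the new form also allows
// one once the WaitUntil time has passed, so a delayed eval and a blocked
// eval can each be produced when needed.
func shouldCreateBlockedEval(waitUntil, now time.Time, oldBehavior bool) bool {
	if oldBehavior {
		return waitUntil.IsZero()
	}
	// a zero WaitUntil is in the distant past, so it also satisfies this check
	return now.After(waitUntil)
}

func main() {
	now := time.Now()
	past := now.Add(-time.Minute)

	// a disconnected alloc with replace=true and an already-elapsed WaitUntil:
	fmt.Println("old guard:", shouldCreateBlockedEval(past, now, true))  // false: blocked eval suppressed
	fmt.Println("new guard:", shouldCreateBlockedEval(past, now, false)) // true: blocked eval emitted
}
```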
* fix(doc): fix links for task driver plugins
the host URL was wrong; changed `develoepr` to `developer`
* Update stateful-workloads.mdx
Fix link for Nomad event stream page
When emitting rate metrics, we use the identity string within the
labels to better describe the caller. If the register RPC uses an
introduction identity, we can now correctly reflect this in the labels.
The Nomad garbage collector can be triggered manually, which
among other things will remove down nodes from state. If a
garbage-collected node reconnects after this happens, it will
be unable to rejoin a cluster running strict enforcement, even
if it has a valid node identity token.
This change fixes the issue by allowing nodes to reconnect with
a node identity, even if their state object has been removed by
the GC process. This will only work if the node identity has not
expired. If it has expired and strict enforcement is enabled,
the operator will have to re-introduce the node to the cluster,
which feels like the expected and correct behaviour.
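A hedged sketch of the decision, with invented names and a simplified claims
struct rather than Nomad's actual types: a node missing from state is still
allowed back under strict enforcement, provided it presents a node identity
that has not expired:
```
package main

import (
	"fmt"
	"time"
)

// nodeIdentityClaims is a stand-in for the claims carried by a node identity token.
type nodeIdentityClaims struct {
	NodeID string
	Expiry time.Time
}

// allowRegistration models the server-side check: a node already in state is
// fine; a garbage-collected node may rejoin only with an unexpired identity
// when strict enforcement is on.
func allowRegistration(inState bool, claims *nodeIdentityClaims, strict bool, now time.Time) bool {
	if inState {
		return true
	}
	if claims != nil && now.Before(claims.Expiry) {
		return true // GC'd node, but its node identity is still valid
	}
	// expired or missing identity: under strict enforcement the operator
	// must re-introduce the node to the cluster
	return !strict
}

func main() {
	now := time.Now()
	valid := &nodeIdentityClaims{NodeID: "gc-ed-node", Expiry: now.Add(time.Hour)}
	expired := &nodeIdentityClaims{NodeID: "gc-ed-node", Expiry: now.Add(-time.Hour)}

	fmt.Println(allowRegistration(false, valid, true, now))   // true: reconnect allowed
	fmt.Println(allowRegistration(false, expired, true, now)) // false: must be re-introduced
}
```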
The RPC handler function is quite long, so moving the argument
validation into its own function reduces its length and makes
sense from a code-organisation standpoint.
In https://github.com/hashicorp/nomad/issues/15459 we've had a bit of
back-and-forth as a result of applying Nomad environment variables where they
typically should not be used. Clarify that the env vars are for the CLI and
mostly not for the agent. Also move the `NOMAD_CLI_SHOW_HINTS` description into
the correct section.
The docs for the `template` block accurately describe the default template
configuration function denylist in the body text, but the default values are
missing from the parameter list. The equivalent docs in the `client`
configuration are missing `executeTemplate` as well.
When a node misses a heartbeat and is marked down, Nomad deletes service
registration instances for that node. But if the node then successfully
heartbeats before its allocations are marked lost, the services are never
restored. The node is unaware that it has missed a heartbeat and there's no
anti-entropy on the node in any case.
We already delete services when the plan applier marks allocations as stopped,
so deleting the services when the node goes down is only an optimization to more
quickly divert service traffic. But because the state after a plan apply is the
"canonical" view of allocation health, this breaks correctness.
Remove the code path that deletes services from nodes when nodes go down. Retain
the state store code that deletes services when allocs are marked terminal by
the plan applier. Also add a path in the state store to delete services when
allocs are marked terminal by the client. This gets back some of the
optimization but avoids the correctness bug because marking the allocation
client-terminal is a one-way operation.
Fixes: https://github.com/hashicorp/nomad/issues/16983
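A rough sketch of the retained and added paths, using hypothetical helper
names rather than the real state store API: services are deleted only when an
allocation is marked terminal, and the client-side path keys off
client-terminal status, which is a one-way transition:
```
package main

import "fmt"

// alloc is a stand-in for the allocation fields that matter here.
type alloc struct {
	ID            string
	DesiredStatus string // set by the plan applier ("run", "stop", "evict")
	ClientStatus  string // set by the client ("running", "complete", "failed", ...)
}

func (a *alloc) serverTerminal() bool {
	return a.DesiredStatus == "stop" || a.DesiredStatus == "evict"
}

func (a *alloc) clientTerminal() bool {
	return a.ClientStatus == "complete" || a.ClientStatus == "failed" || a.ClientStatus == "lost"
}

// deleteServicesForAlloc stands in for the state store call that removes the
// alloc's service registrations.
func deleteServicesForAlloc(a *alloc) {
	fmt.Println("deleting service registrations for alloc", a.ID)
}

// upsertAlloc models the two retained paths: plan applier updates and client
// updates both delete services once the alloc is terminal. There is no longer
// a path that deletes services just because the node missed a heartbeat.
func upsertAlloc(a *alloc) {
	if a.serverTerminal() || a.clientTerminal() {
		deleteServicesForAlloc(a)
	}
}

func main() {
	upsertAlloc(&alloc{ID: "web-1", DesiredStatus: "run", ClientStatus: "running"}) // services kept
	upsertAlloc(&alloc{ID: "web-2", DesiredStatus: "run", ClientStatus: "failed"})  // services deleted
}
```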
* Update UI, code comment, and README links to docs, tutorials
* fix typo in ephemeral disks learn more link url
* feedback on typo
Co-authored-by: Tim Gross <tgross@hashicorp.com>
This change implements the client -> server workflow for Nomad
node introduction. A Nomad node can optionally be started with an
introduction token, which is a signed JWT containing claims for
the node registration. The server handles this according to the
enforcement configuration.
The introduction token can be provided by env var, CLI flag, or
by placing it within a default filesystem location. The latter
option does not override the CLI flag or env var.
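A small sketch of that precedence, with invented flag, env var, and file
names (the real ones may differ): an explicit CLI flag or environment
variable wins, and the default file location is only consulted when neither
is set:
```
package main

import (
	"fmt"
	"os"
	"strings"
)

// resolveIntroToken picks the introduction token source. The default file
// never overrides an explicit CLI flag or environment variable.
func resolveIntroToken(flagValue, envValue, defaultPath string) (string, error) {
	if flagValue != "" {
		return flagValue, nil
	}
	if envValue != "" {
		return envValue, nil
	}
	b, err := os.ReadFile(defaultPath)
	if os.IsNotExist(err) {
		return "", nil // no introduction token provided anywhere
	}
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(b)), nil
}

func main() {
	// hypothetical names; the real flag, env var, and path may differ
	token, err := resolveIntroToken(
		"",                                    // e.g. a -node-intro-token flag value
		os.Getenv("EXAMPLE_NODE_INTRO_TOKEN"), // e.g. an environment variable
		"/etc/example/node-intro.jwt",         // e.g. a default file location
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("token set:", token != "")
}
```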
The region claim has been removed from the initial claims set of
the intro identity. This boundary is guarded by mTLS and aligns
with the node identity.