nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-09 20:05:42 +03:00

Author	SHA1	Message	Date
Tim Gross	d4ab277154	docs: add missing metrics for Consul service client (#26186 ) Nomad agents emit metrics for Consul service and check operations, but these were not documented. Update the metrics reference table to include these metrics. Note that the metrics are prefixed `nomad.client` but are present on all agents, because the server registers itself in Consul as well.	2025-07-07 09:40:32 -04:00
Tim Gross	60a953ca00	docs: add upgrade guide note for publish_allocation_metrics (#26187 ) In #25870 we fixed a longstanding bug where allocation metrics were being collected and published even if `telemetry.publish_allocation_metrics` was disabled (the default). This change is unexpected enough that we should surface it in the upgrade guide. Ref: https://github.com/hashicorp/nomad/pull/25870 Ref: https://github.com/hashicorp/nomad/issues/26166	2025-07-07 09:40:00 -04:00
Allison Larson	004fa6132b	docs: Fix link in service page documentation (#26174 ) * docs: fix link in service page * docs: correct indentation	2025-07-03 09:42:52 -07:00
Allison Larson	63f0788747	Expose Kind field for Consul Service Registrations (#26170 ) * consul: Add service kind to jobspec * consul: Add kind to service docs * Add changelog	2025-06-30 14:32:23 -07:00
Tim Gross	aa3c08d069	eval status: enrich with related evals and placed allocs tables (#26156 ) When debugging an evaluation, you almost always want to know about all the related evaluations and what allocations were placed by that evaluation (and where), not just failed placements. We can enrich the command by adding the `related` query parameter to the API, and having the command query for the evaluation's allocations automatically. Emit this data as a pair of new tables and expose fields like quota limits, and previous/next/blocked eval without the `-verbose` flag. Update the docs to include the full output and remove references to long-removed behavior of the `-json` flag. Ref: https://hashicorp.atlassian.net/browse/NMD-818 Ref: https://go.hashi.co/rfc/nmd-212	2025-06-30 09:23:36 -04:00
Piotr Kazmierczak	0c2fcb3e30	docs: explicitly list all schedulers enabled by default (#26150 ) Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com> Co-authored-by: Tim Gross <tgross@hashicorp.com>	2025-06-26 17:37:26 +02:00
Mattias Fjellström	8e6b2e1b63	docs: adding note on azure msi for server join (#26141 )	2025-06-26 10:29:06 +02:00
James Rasell	216140255d	cli: Do not always add global DNS name to certificate DNS names. (#26086 ) No matter the passed region identifier, the CLI was always adding "<role>.global.nomad" to the certificate DNS names. This is not what we expect and has been removed. While here, the long deprecated cluster-region flag has been removed. This removal only impacts CLI functionality, so is safe to do.	2025-06-25 07:35:56 +01:00
Mattias Fjellström	e2a30df14c	docs: clarified azure cloud join requirements (#26091 )	2025-06-23 08:34:56 -05:00
Aimee Ukasick	cdde082362	Docs bug: Fix broken link on concepts/job.mdx (#26093 )	2025-06-20 17:16:33 -05:00
Tim Gross	4eb78f1348	docs: describe shutdown order on `lifecycle` page (#26035 ) We have a description of the order of shutdown in the `task.leader` docs, but the `lifecycle` block is an intuitive place to look for this same information, and the behavior is largely governed by that feature anyways.	2025-06-12 15:45:40 -04:00
Aimee Ukasick	23fd87d9c9	Docs: Commands section move "General options" to page bottom (#26001 ) * sectionless files plus acl section * alloc section * config, deployment sections * job section * licence, namespace * node, node-pool * operator * plugin, quota, recommendation * scaling, sentinel, server, service, system, var, volume * Add "ENT" label to left nav for enterprise commands * job tag break into separate folder and files; update options header	2025-06-12 14:31:38 -05:00
Daniel Bennett	7519df8d06	task env: add NOMAD_UNIX_ADDR var (#25598 ) for easier setup when using workload identity + task api	2025-06-11 15:56:51 -04:00
Conor Mongey	f7096fb9d6	docker: add cgroupns task config (#25927 )	2025-06-11 13:50:44 -04:00
Bram Vogelaar	68b5d64ed7	docs: update broken link in stateful-workloads.mdx (#26009 ) point to correct url	2025-06-09 08:36:37 -04:00
Tim Gross	6c630c4bfa	docs: expand on recommendations for CPU resource reservation (#25964 ) Add some prescriptive guidance to the CPU concepts document around when to use `resources.cores` vs `resources.cpu`. Extend some of the text to cover cgroups v2. Ref: https://hashicorp.atlassian.net/browse/NMD-297 Ref: https://go.hashi.co/rfc/nmd-211 Ref: https://github.com/hashicorp/nomad/pull/25963	2025-06-03 15:57:04 -04:00
James Rasell	ae3eaf80d1	docs: Fix node pool concept missing backtick for style. (#25956 )	2025-06-02 09:09:35 +01:00
Michael Smithhisler	4c8257d0c7	client: add once mode to template block (#25922 )	2025-05-28 11:45:11 -04:00
Piotr Kazmierczak	5dd880ad61	docs: upgrade guide entry for /v1/acl/token/self changes (#25940 ) During #25547 and #25588 work, incorrect response codes from /v1/acl/token/self were changed, but we did not make a note about this in the upgrade guide.	2025-05-28 16:22:37 +02:00
Tim Gross	3f59860254	host volumes: add configuration to GC on node GC (#25903 ) When a node is garbage collected, any dynamic host volumes on the node are orphaned in the state store. We generally don't want to automatically collect these volumes and risk data loss, and have provided a CLI flag to `-force` remove them in #25902. But for clusters running on ephemeral cloud instances (ex. AWS EC2 in an autoscaling group), deleting host volumes may add excessive friction. Add a configuration knob to the client configuration to remove host volumes from the state store on node GC. Ref: https://github.com/hashicorp/nomad/pull/25902 Ref: https://github.com/hashicorp/nomad/issues/25762 Ref: https://hashicorp.atlassian.net/browse/NMD-705	2025-05-27 10:22:08 -04:00
James Rasell	e3fea745eb	docs: Remove long removed client iops metrics from monitoring page. (#25926 )	2025-05-23 16:14:16 +01:00
tehut	55523ecf8e	Add NodeMaxAllocations to client configuration (#25785 ) * Set MaxAllocations in client config Add NodeAllocationTracker struct to Node struct Evaluate MaxAllocations in AllocsFit function Set up cli config parsing Integrate maxAllocs into AllocatedResources view Co-authored-by: Tim Gross <tgross@hashicorp.com> --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2025-05-22 12:49:27 -07:00
Aimee Ukasick	c12ad24de0	Docs: SEO updates to operations, other specs sections (#25518 ) * seo operation section * other specifications section * Update website/content/docs/other-specifications/variables.mdx Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> --------- Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>	2025-05-22 07:47:05 -05:00
Chris Roberts	1aa416e2f2	Support applying policy to all jobs within namespace (#25871 ) Workflow identities currently support ACL policies being applied to a job ID within a namespace. With this update an ACL policy can be applied to a namespace. This results in the ACL policy being applied to all jobs within the namespace.	2025-05-21 07:44:14 -07:00
Tim Gross	41cf1b03b4	host volumes: -force flag for delete (#25902 ) When a node is garbage collected, we leave behind the dynamic host volume in the state store. We don't want to automatically garbage collect the volumes and risk data loss, but we should allow these to be removed via the API. Fixes: https://github.com/hashicorp/nomad/issues/25762 Fixes: https://hashicorp.atlassian.net/browse/NMD-705	2025-05-21 08:55:52 -04:00
Piotr Kazmierczak	cdc308a0eb	wi: new endpoint for listing workload attached ACL policies (#25588 ) This introduces a new HTTP endpoint (and an associated CLI command) for querying ACL policies associated with a workload identity. It allows users that want to learn about the ACL capabilities from within WI-tasks to know what sort of policies are enabled. --------- Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>	2025-05-19 19:54:12 +02:00
Piotr Kazmierczak	953910dc5d	docs: emphasize HOME and USER env vars for tasks that use custom `user` setting (#25879 ) In #25859 we fixed the task environment variables to account for user field setting. This PR follows up with documentation adjustments.	2025-05-19 19:00:54 +02:00
Aimee Ukasick	986f3c727a	Docs: SEO job spec section (#25612 ) * action page * change all page_title fields * update title * constraint through migrate pages * update page title and heading to use sentence case * fix front matter description * Apply suggestions from code review Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> --------- Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>	2025-05-19 09:02:07 -05:00
Martina Santangelo	18eddf53a4	commands: adds job start command to start stopped jobs (#24150 ) --------- Co-authored-by: Michael Smithhisler <michael.smithhisler@hashicorp.com> Co-authored-by: Tim Gross <tgross@hashicorp.com>	2025-05-14 15:17:44 -04:00
Tim Gross	8a5a057d88	offline license utilization reporting (#25844 ) Nomad Enterprise users operating in air-gapped or otherwise secured environments don't want to send license reporting metrics directly from their servers. Implement manual/offline reporting by periodically recording usage metrics snapshots in the state store, and providing an API and CLI by which cluster administrators can download the snapshot for review and out-of-band transmission to HashiCorp. This is the CE portion of the work required for implementation in the Enterprise product. Nomad CE does not perform utilization reporting. Ref: https://github.com/hashicorp/nomad-enterprise/pull/2673 Ref: https://hashicorp.atlassian.net/browse/NMD-68 Ref: https://go.hashi.co/rfc/nmd-210	2025-05-14 09:51:13 -04:00
Aimee Ukasick	79d35f072a	Move environment section; CE-712 (#25845 )	2025-05-13 12:31:08 -05:00
James Rasell	0b265d2417	encrypter: Track initial tasks for is ready calculation. (#25803 ) The server startup could "hang" to the view of an operator if it had a key that could not be decrypted or replicated loaded from the FSM at startup. In order to prevent this happening, the server startup function will now use a timeout to wait for the encrypter to be ready. If the timeout is reached, the error is sent back to the caller which fails the CLI command. This bubbling of error message will also flush to logs which will provide additional operator feedback. The server only cares about keys loaded from the FSM snapshot and trailing logs before the encrypter should be classed as ready. So that the encrypter ready function does not get blocked by keys added outside of the initial Raft load, we take a snapshot of the decryption tasks as we enter the blocking call, and class these as our barrier.	2025-05-07 15:38:16 +01:00
Juana De La Cuesta	dfc1412e22	Merge pull request #25721 from hashicorp/NMD-321-reload Force an agent return if there is an error on reload	2025-05-01 14:43:08 +02:00
Juana De La Cuesta	dcaa96f0e5	Update website/content/docs/upgrade/upgrade-specific.mdx Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>	2025-04-30 15:03:49 +02:00
Juana De La Cuesta	e8fb36f4d3	Style: typo	2025-04-30 13:01:57 +02:00
Juanadelacuesta	9288a3141a	func and docs: Use the config from the client and not from the agent that is already parsed. Add the breaking change to the release notes	2025-04-30 10:53:02 +02:00
Aimee Ukasick	4075b0b8ba	Docs: Add garbage collection page (#25715 ) * add garbage collection page * finish client; add resources section * finish server section; task driver section * add front matter description * fix typos * Address Tim's feedback	2025-04-28 08:37:23 -05:00
tehut	b11619010e	Add priority flag to Dispatch CLI and API (#25622 ) * Add priority flag to Dispatch CLI and DispatchOpts() helper to HTTP API	2025-04-18 13:24:52 -07:00
Aimee Ukasick	d293684d3d	Update rel notes, upgrade links to point to correct previous ver (#25652 )	2025-04-11 10:22:23 -05:00
Ranjandas	8b33584fbf	Add note to root keyring remove command (#25637 ) * Add note to root keyring remove command This PR updates the documentation for the root keyring remove command to note that the full key ID must be provided for the command to function correctly. * Move keyID explanation to usage section --------- Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>	2025-04-10 08:58:48 -05:00
Tim Gross	27caae2b2a	api: make attempting to remove peer by address a no-op (#25599 ) In Nomad 1.4.0 we removed support for Raft Protocol v2 entirely. But the `Operator.RemoveRaftPeerByAddress` RPC handler was left in place, along with its supporting HTTP API and command line flags. Using this API will always result in the Raft library error "operation not supported with current protocol version". Unfortunately it's still possible in unit tests to exercise this code path, and these tests are quite flaky. This changeset turns the RPC handler and HTTP API into a no-op, removes the associated command line flags, and removes the flaky tests. I've also cleaned up the test for `RemoveRaftPeerByID` to consolidate test servers and use `shoenig/test`. Fixes: https://hashicorp.atlassian.net/browse/NET-12413 Ref: https://github.com/hashicorp/nomad/pull/13467 Ref: https://developer.hashicorp.com/nomad/docs/upgrade/upgrade-specific#raft-protocol-version-2-unsupported Ref: https://github.com/hashicorp/nomad-enterprise/actions/runs/13201513025/job/36855234398?pr=2302	2025-04-10 09:19:25 -04:00
Aimee Ukasick	87aabc9af2	Docs: 1.10 release notes, some factoring, sentinel apply update (#25433 ) * Docs: 1.10 release notes and upgrade factoring * Update based on code review suggestions * add CLI for disabling UI URL hints * fix indentation * nav: list release notes in reverse order fix broken link to v1.6.x docs * Update PKCE section from Daniel's latest PR * update pkce per daniel's suggestion * Add dynamic host volumes governance section from blog	2025-04-09 15:43:58 -07:00
Daniel Bennett	6a0c4f5a3d	auth: oidc: enable pkce only on new auth methods (#25593 ) trying not to violate the principle of least astonishment. we want to only auto-enable PKCE on new auth methods, rather than new or updated auth methods, to avoid a scenario where a Nomad admin updates an auth method sometime in the future -- something innocent like a new client secret -- and their OIDC provider doesn't like PKCE. the main concern is that the provider won't like PKCE in a totally confusing way. error messages rarely say PKCE directly, so why the user's auth method suddenly broke would be a big mystery. this means that to enable it on existing auth methods, you would set `OIDCDisablePKCE = false`, and the double-negative doesn't feel right, so instead, swap the language, so enabling it on existing methods reads sensibly, and to disable it on new methods reads ok-enough: `OIDCEnablePKCE = false`	2025-04-03 10:56:17 -05:00
Denis Rodin	aca0ff438a	raw_exec windows: add support for setting the task user (#25496 )	2025-04-03 11:21:13 -04:00
tehut	27b1d470a8	modify rawexec TaskConfig and Config to accept envvar denylist (#25511 ) * modify rawexec TaskConfig and Config to accept envvar denylist * update rawexec driver docs to include deniedEnvars options Co-authored-by: Daniel Bennett <dbennett@hashicorp.com> --------- Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>	2025-04-02 12:25:28 -07:00
Nikita Eliseev	76fb3eb9a1	rpc: added configuration for yamux session (#25466 ) Fixes: https://github.com/hashicorp/nomad/issues/25380	2025-04-02 10:58:23 -04:00
Aimee Ukasick	9778fa4912	Docs: Fix broken links in main for 1.10 release (#25540 ) * Docs: Fix broken links in main for 1.10 release * Implement Tim's suggestions * Remove link to Portworx from ecosystem page * remove "Portworx" since Portworx 3.2 no longer supports Nomad	2025-04-01 09:09:44 -05:00
Tim Gross	cdd40cf81b	docs: document requirements for Consul tokens in admin partitions (#25529 ) When using Nomad with Consul, each Nomad agent is expected to have a Consul agent running alongside. When using Nomad Enterprise and Consul Enterprise together, the Consul agent may be in a Consul admin partition. In order for Nomad's "anti-entropy" sync to work with Consul, the Consul ACL token and ACL policy for the Nomad client must be in the same admin partition as the Consul agent. Otherwise, we can register services (via WI) but then won't be able to deregister them unless they're the default namespace. Ref: https://hashicorp.atlassian.net/browse/NET-12361	2025-04-01 08:45:05 -04:00
Allison Larson	17d191ae24	Add -group flag to `alloc exec`, `alloc logs` command (#25568 ) * Add -group flag to `alloc exec`, `alloc logs` command * fixup! Add -group flag to `alloc exec`, `alloc logs` command * Add -group option to alloc fs * Add changelog	2025-03-31 14:17:45 -07:00
Sooter Saalu	e93bda31ea	Update placement.mdx (#25538 ) * Update placement.mdx Added explanations on initial and blocked evaluation for placement failures. fixes #24824 * Update website/content/docs/concepts/scheduling/placement.mdx Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com> * Update website/content/docs/concepts/scheduling/placement.mdx Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com> --------- Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>	2025-03-31 09:08:06 -05:00

1188 Commits