When attempting to clone a git repository within a sandbox that is
configured with landlock, the clone will fail with error messages
related to an inability to get random bytes for a temporary file.
Including a read rule for `/dev/urandom` resolves the error
and the git clone works as expected.
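As a rough illustration of the shape of the fix, here is a minimal sketch in the style of the `shoenig/go-landlock` API that Nomad's sandboxing builds on; the constructor names, mode strings, and safety level are approximations rather than the actual call sites:

```go
package main

import landlock "github.com/shoenig/go-landlock"

// buildSandbox sketches adding a read-only rule for /dev/urandom alongside
// the task directory rule so git can create its temporary files. The exact
// go-landlock constructors and mode strings here are approximations.
func buildSandbox(taskDir string) error {
	l := landlock.New(
		landlock.Dir(taskDir, "rwc"),       // read/write/create inside the sandbox
		landlock.File("/dev/urandom", "r"), // allow reading random bytes
	)
	return l.Lock(landlock.Mandatory)
}
```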
The Nomad clients store their Nomad identity in memory and within
their state store. While the client is running, it is not
possible to dump the state store to view the stored identity
token, so a way to view the current claims at runtime aids
debugging and operations.
This change adds a client identity workflow, allowing operators
to view the current claims of the node's identity. It does not
return any of the signing key material.
This change implements the client -> server workflow for Nomad
node introduction. A Nomad node can optionally be started with an
introduction token, which is a signed JWT containing claims for
the node registration. The server handles this according to the
enforcement configuration.
The introduction token can be provided by env var, cli flag, or
by placing it within a default filesystem location. The latter
option does not override the CLI or env var.
The region claim has been removed from the initial claims set of
the intro identity. This boundary is guarded by mTLS and aligns
with the node identity.
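A minimal sketch of the precedence described above, using hypothetical flag, environment variable, and file names rather than Nomad's actual identifiers:

```go
package main

import (
	"os"
	"path/filepath"
	"strings"
)

// resolveIntroToken returns the node introduction token, preferring the CLI
// flag, then the environment variable, then a default file on disk. All
// names here are illustrative.
func resolveIntroToken(flagValue, dataDir string) string {
	if flagValue != "" {
		return flagValue
	}
	if v := os.Getenv("NOMAD_NODE_INTRODUCTION_TOKEN"); v != "" {
		return v
	}
	if b, err := os.ReadFile(filepath.Join(dataDir, "node_introduction_token")); err == nil {
		return strings.TrimSpace(string(b))
	}
	return ""
}
```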
* Add -log-file-export and -log-lookback commands to add historical logs to
the debug capture
* use monitor.PrepFile() helper for other historical log tests
* Add MonitorExport command and handlers
* Implement autocomplete
* Require nomad in serviceName
* Fix race in StreamReader.Read
* Add and use framer.Flush() to coordinate function exit
* Add LogFile to client/Server config and read NomadLogPath in rpcHandler instead of HTTPServer
* Parameterize StreamFixed stream size
The Nomad client will have its identity renewed according to the
TTL which defaults to 24h. In certain situations such as root
keyring rotation, operators may want to force clients to renew
their identities before the TTL threshold is met. This change
introduces a client HTTP and RPC endpoint which will instruct the
node to request a new identity at its next heartbeat. This can be
used via the API or a new command.
While this is a manual intervention step on top of any keyring
rotation, it dramatically reduces the initial feature complexity
as it provides an asynchronous and efficient method of renewal that
utilises existing functionality.
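A sketch of the asynchronous approach, assuming a hypothetical flag that the new endpoint sets and the existing heartbeat loop consumes; the names are illustrative, not Nomad's real ones:

```go
package main

import "sync/atomic"

// identityRenewer sketches the "renew at next heartbeat" flow: the new RPC
// and HTTP endpoint only flips a flag, and the existing heartbeat loop does
// the actual renewal work. All names are illustrative.
type identityRenewer struct {
	forceRenew atomic.Bool
}

// HandleRenewRequest is what the client RPC/HTTP handler would call.
func (r *identityRenewer) HandleRenewRequest() {
	r.forceRenew.Store(true)
}

// nodeUpdateRequest stands in for the real heartbeat request arguments.
type nodeUpdateRequest struct {
	RequestNewIdentity bool
}

// beforeHeartbeat is called by the heartbeat loop; if a renewal was
// requested, the next heartbeat asks the servers for a new identity.
func (r *identityRenewer) beforeHeartbeat(req *nodeUpdateRequest) {
	if r.forceRenew.CompareAndSwap(true, false) {
		req.RequestNewIdentity = true
	}
}
```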
Nomad servers, if upgraded, can return node identities as part of
the register and update/heartbeat response objects. The Nomad
client will now handle this and store it as appropriate within its
memory and statedb.
The client will now use any stored identity for RPC authentication
with a fallback to the secretID. This supports upgrade paths where
the Nomad clients are updated before the Nomad servers.
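A minimal sketch of the credential selection, with hypothetical arguments standing in for the client's stored state:

```go
package main

// rpcAuthToken sketches choosing the credential used to authenticate client
// RPCs: prefer the stored signed node identity when one exists, otherwise
// fall back to the node secret ID so older servers keep working. The
// arguments are illustrative stand-ins for the client's stored state.
func rpcAuthToken(storedIdentityJWT, secretID string) string {
	if storedIdentityJWT != "" {
		return storedIdentityJWT
	}
	return secretID
}
```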
When the client handles an update status response from the server,
it modifies its heartbeat stop tracker with a time set once the
RPC call returns. It also optionally emits a log message if the
client suspects it has missed a heartbeat.
These times were originally tracked by two separate calls to the
time function, executed two microseconds apart. There is no
reason we cannot use a single time variable for both uses, which
saves one call to time.Now.
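A small sketch of the resulting pattern, with illustrative types standing in for the heartbeat stop tracker:

```go
package main

import (
	"log"
	"time"
)

// heartbeatStop is a stand-in for the real heartbeat stop tracker.
type heartbeatStop struct{ lastOK time.Time }

func (h *heartbeatStop) setLastOK(t time.Time) { h.lastOK = t }

// handleUpdateResponse uses a single timestamp for both the stop tracker
// update and the missed-heartbeat check, instead of two time.Now calls.
func handleUpdateResponse(h *heartbeatStop, lastHeartbeat time.Time, ttl time.Duration) {
	now := time.Now()
	h.setLastOK(now)
	if now.Sub(lastHeartbeat) > ttl {
		log.Printf("missed heartbeat: last=%s ttl=%s", lastHeartbeat, ttl)
	}
}
```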
* fix: initialize the topology of the processors to avoid nil pointers
* func: initialize topology to avoid nil pointers
* fix: update the new public method for NodeProcessorResources
The Nomad client will persist its own identity within its state
store for restart persistence. The added benefit of using it over
the filesystem is that it supports transactions. This is useful
when considering the identity will be renewed periodically.
The mkdir plugin creates the directory and then chowns it. In the
event the chown command fails, we should attempt to remove the
directory. Without this, we leave directories on the client in
partial failure situations.
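A minimal sketch of the create-then-clean-up behaviour using only the standard library; the real plugin code differs in structure:

```go
package main

import (
	"fmt"
	"os"
)

// makeOwnedDir creates the directory, chowns it, and removes it again if
// the chown fails so no partially configured directory is left behind.
func makeOwnedDir(path string, uid, gid int, perm os.FileMode) error {
	if err := os.MkdirAll(path, perm); err != nil {
		return fmt.Errorf("failed to create %q: %w", path, err)
	}
	if err := os.Chown(path, uid, gid); err != nil {
		// best-effort cleanup; the original chown error is what matters
		_ = os.RemoveAll(path)
		return fmt.Errorf("failed to chown %q: %w", path, err)
	}
	return nil
}
```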
The `killTasks` function kills all of the alloc runner's task
runners. If a task runner's task has already completed, killing
the task runner can cause confusion because the resulting task
event shows the task was signaled even though it had already
completed.
To prevent this, a check is performed when creating the task
event to determine whether the task has completed. If it has,
no task event is created, so killing the task runner does not
add an extra task event.
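A sketch of the check, with stand-in types in place of Nomad's task state and task event structs:

```go
package main

// taskState and taskEvent are minimal stand-ins for the real structs.
type taskState struct {
	State  string
	Failed bool
}

func (s taskState) completed() bool { return s.State == "dead" && !s.Failed }

type taskEvent struct{ Type string }

// killEventFor sketches the check: if the task already completed, no kill
// event is created, so killing its task runner adds no misleading event.
func killEventFor(s taskState) *taskEvent {
	if s.completed() {
		return nil
	}
	return &taskEvent{Type: "Killing"}
}
```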
When a Nomad client registers or re-registers, the RPC handler
will generate and return a node identity if required. When an
identity is generated, the signing key ID will be stored within
the node object, to ensure a root key is not deleted while it is
still in use.
During normal client operation it will periodically heartbeat to
the Nomad servers to indicate aliveness. The RPC handler that
is used for this action has also been updated to conditionally
perform identity generation. Performing it here means no extra RPC
handlers are required and we inherit the jitter in identity
generation from the heartbeat mechanism.
The identity generation checks are driven by the RPC request
arguments, so they are scoped to the required behaviour and can
handle the nuance of each RPC. Failure to generate an identity
is considered terminal to the RPC call. The client includes
behaviour to retry this error, which is always caused by the
encrypter not being ready unless the server's keyring has been
corrupted.
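A sketch of the conditional generation described above, with stand-in types for the RPC arguments, node, and signer:

```go
package main

// Illustrative stand-ins for the real RPC argument, node, and signer types.
type nodeStub struct{ ID, IdentitySigningKeyID string }

type identityRequester interface{ ShouldGenerateIdentity(*nodeStub) bool }

type jwtSigner interface {
	SignNodeClaims(*nodeStub) (token, keyID string, err error)
}

type identityReply struct{ SignedIdentity string }

// maybeIssueIdentity sketches the flow: the RPC arguments decide whether an
// identity is needed, a signing failure is terminal for the RPC (the client
// retries), and the signing key ID is recorded on the node object.
func maybeIssueIdentity(args identityRequester, node *nodeStub, signer jwtSigner, reply *identityReply) error {
	if !args.ShouldGenerateIdentity(node) {
		return nil
	}
	token, keyID, err := signer.SignNodeClaims(node)
	if err != nil {
		return err
	}
	node.IdentitySigningKeyID = keyID
	reply.SignedIdentity = token
	return nil
}
```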
When Nomad generates an identity for a node, the root key used to
sign the JWT will be stored as a field on the node object and
written to state. To provide fast lookup of nodes by their
signing key, the node table schema has been modified to include
the keyID as an index.
In order to ensure a root key is not deleted while identities are
still actively signed by it, the Nomad state has an in-use check.
This check has been extended to cover node identities.
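A sketch of the kind of go-memdb index and lookup involved; the table, index, and field names here are illustrative rather than Nomad's exact schema:

```go
package main

import memdb "github.com/hashicorp/go-memdb"

// nodeSigningKeyIndex sketches indexing nodes by the key ID used to sign
// their identity so the in-use check can find them quickly.
var nodeSigningKeyIndex = &memdb.IndexSchema{
	Name:         "signing_key",
	AllowMissing: true, // nodes that have not been issued an identity yet
	Unique:       false,
	Indexer:      &memdb.StringFieldIndex{Field: "IdentitySigningKeyID"},
}

// keyInUseByNode sketches the in-use check that blocks root key deletion.
func keyInUseByNode(txn *memdb.Txn, keyID string) (bool, error) {
	raw, err := txn.First("nodes", "signing_key", keyID)
	if err != nil {
		return false, err
	}
	return raw != nil, nil
}
```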
Nomad node identities will have an expiration. The expiration will
be defined by a TTL configured within the node pool specification
as a time duration. When not supplied by the operator, a default
value of 24hr is applied.
On cluster upgrades, a Nomad server will restore from snapshot
and/or replay logs. The FSM has therefore been modified to ensure
restored node pool objects include the default value. The builtin
"all" and "default" pools have also been updated to include this
default value.
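A minimal sketch of the defaulting behaviour, with a stand-in node pool struct and an assumed field name:

```go
package main

import "time"

const defaultNodeIdentityTTL = 24 * time.Hour

// nodePoolStub stands in for the real node pool struct.
type nodePoolStub struct {
	NodeIdentityTTL time.Duration
}

// canonicalize sketches applying the default TTL both for newly written
// node pools and for pools restored from snapshot or log replay during an
// upgrade, so older objects gain the new field.
func (p *nodePoolStub) canonicalize() {
	if p.NodeIdentityTTL == 0 {
		p.NodeIdentityTTL = defaultNodeIdentityTTL
	}
}
```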
Nomad node identities will be a new identity concept in Nomad and
will exist alongside workload identities. This change introduces a
new envelope identity claim which contains generic public claims
as well as either node or workload identity claims. This allows
us to use a single encryption and decryption path regardless of
the underlying identity. Where possible, node and workload
identities will use common functions for identity claim
generation.
The new node identity has the following claims:
* "nomad_node_id" - the node ID which is typically generated on
the first boot of the Nomad client as a UUID within the
"ensureNodeID" function.
* "nomad_node_pool" - the node pool is a client configuration
parameter which provides logical grouping of Nomad clients.
* "nomad_node_class" - the node class is a client configuration
parameter which provides scheduling constraints for Nomad clients.
* "nomad_node_datacenter" - the node datacenter is a client
configuration parameter which provides scheduling constraints
for Nomad clients and a logical grouping method.
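An illustrative sketch of the envelope shape, where exactly one of the node or workload claim sets is populated; field and JSON names are approximations of the claims listed above:

```go
package main

// identityClaims sketches the envelope: generic public claims shared by
// every identity, plus exactly one of the node or workload claim sets.
type identityClaims struct {
	// Generic public claims (subject, expiry, audience, ...) shared by node
	// and workload identities would be embedded here.

	*nodeIdentityClaims     // set for node identities
	*workloadIdentityClaims // set for workload identities
}

// nodeIdentityClaims carries the node-specific claims listed above.
type nodeIdentityClaims struct {
	NodeID         string `json:"nomad_node_id"`
	NodePool       string `json:"nomad_node_pool"`
	NodeClass      string `json:"nomad_node_class"`
	NodeDatacenter string `json:"nomad_node_datacenter"`
}

// workloadIdentityClaims stands in for the existing workload identity claims.
type workloadIdentityClaims struct {
	AllocationID string `json:"nomad_allocation_id"`
	// ... namespace, job, task, and the other existing workload claims
}
```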
When we renew Vault tokens, we use the lease duration to determine how often to
renew. But we also set an `increment` value which is never updated from the
initial 30s. For periodic tokens this is not a problem because the `increment`
field is ignored on renewal. But for non-periodic tokens this prevents the token
TTL from being properly incremented. This behavior has been in place since the
initial Vault client implementation in #1606 but before the switch to workload
identity most (all?) tokens being created were periodic tokens so this was never
detected.
Fix this bug by updating the request's `increment` field to the lease duration
on each renewal.
Also switch out a `time.After` call in backoff of the derive token caller with a
safe timer so that we don't have to spawn a new goroutine per loop, and have
tighter control over when that's GC'd.
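A simplified sketch of the renewal loop after the fix, using the Vault Go client's token self-renewal and a single reusable timer; Nomad's actual renewal logic has more going on (backoff, jitter, shutdown handling):

```go
package main

import (
	"time"

	vaultapi "github.com/hashicorp/vault/api"
)

// renewLoop sketches the fix: after each successful renewal, the next
// renewal's increment is taken from the returned lease duration instead of
// staying at the initial 30s, and one reusable timer replaces time.After so
// no goroutine is spawned per iteration.
func renewLoop(client *vaultapi.Client, stopCh <-chan struct{}) error {
	increment := 30 // seconds; used for the initial request only

	timer := time.NewTimer(0)
	defer timer.Stop()
	<-timer.C // drain the immediate fire so the loop owns the timer

	for {
		secret, err := client.Auth().Token().RenewSelf(increment)
		if err != nil {
			return err
		}

		lease := secret.Auth.LeaseDuration // seconds
		increment = lease                  // the bug fix: keep increment in step with the lease

		// renew again when roughly half the lease has elapsed
		timer.Reset(time.Duration(lease/2) * time.Second)
		select {
		case <-timer.C:
		case <-stopCh:
			return nil
		}
	}
}
```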
Ref: https://github.com/hashicorp/nomad/pull/1606
Ref: https://github.com/hashicorp/nomad/issues/25812
Batch job allocations that are drained from a node will be moved
to an eligible node. However, when no eligible nodes are available
to place the draining allocations, the tasks end up complete and
are not placed once an eligible node becomes available. This
occurs because the drained allocations are stopped on the
draining node at the same time as they are being placed on an
eligible node. Stopping the allocations on the draining node
results in the tasks being killed, but importantly this kill does
not fail the task. The tasks therefore report as complete,
because their state is dead but not failed. As such, when an
eligible node becomes available, all tasks show as complete and
no allocations need to be placed.
To prevent the behavior described above, a check is performed when
the alloc runner kills its tasks. If the allocation's job type is
batch, and the allocation has a desired transition of migrate, the
task will be failed when it is killed. This ensures the task does
not report as complete, and when an eligible node becomes available
the allocations are placed as expected.
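A sketch of the check, with stand-in types in place of Nomad's structs package:

```go
package main

const jobTypeBatch = "batch"

// allocStub stands in for the real allocation struct.
type allocStub struct {
	JobType           string
	DesiredTransition struct{ Migrate *bool }
}

// failOnKill sketches the check added to the alloc runner's kill path: a
// batch allocation that is being migrated off a draining node has its tasks
// marked failed when killed, so it is placed again once an eligible node
// appears rather than reporting as complete.
func failOnKill(alloc *allocStub) bool {
	return alloc.JobType == jobTypeBatch &&
		alloc.DesiredTransition.Migrate != nil &&
		*alloc.DesiredTransition.Migrate
}
```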
* sec: add sprig template functions to denylists
* remove explicit set which is no longer needed
* go mod tidy
* add changelog
* better changelog and filtered denylist
* go mod tidy with 1.24.4
* edit changelog and remove htpasswd and derive
* fix tests
* Update client/allocrunner/taskrunner/template/template_test.go
Co-authored-by: Tim Gross <tgross@hashicorp.com>
* edit changelog
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
The `disconnect.stop_on_client_after` feature is implemented as a loop on the
client that's intended to wait on the shortest timeout of all the allocations on
the node and then check whether the interval since the last heartbeat has been
longer than the timeout. It uses a buffered channel of allocations written and
read from the same goroutine to push "stops" from the timeout expiring to the
next pass through the loop. Unfortunately if there are multiple allocations that
need to be stopped in the same timeout event, or even if a previous event has
not yet been dequeued, then sending on the channel will block and the entire
goroutine deadlocks itself.
While fixing this, I also discovered that the `stop_on_client_after` and
heartbeat loops can synchronize in a pathological way that extends the
`stop_on_client_after` window. If a heartbeat fails close to the beginning of
the shortest `stop_on_client_after` window, the loop will end up waiting until
almost 2x the intended wait period.
While fixing both of those issues, I discovered that the existing tests had a
bug such that we were asserting that an allocrunner was being destroyed when it
had already exited.
This commit includes the following:
* Rework the watch loop so that we handle the stops in the same case as the
timer expiration, rather than using a channel in the method scope.
* Remove the alloc intervals map field from the struct and keep it in the
method scope, in order to discourage writing racy tests that read its value.
* Reset the timer whenever we receive a heartbeat, which forces the two
intervals to synchronize correctly.
* Minor refactoring of the disconnect timeout lookup to improve brevity.
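A sketch of the reworked loop: stops are handled in the timer-expiration case and the timer is reset on each heartbeat, as described above; the wiring and names are illustrative:

```go
package main

import "time"

// watchDisconnect sketches the rework: one timer is reset whenever a
// heartbeat arrives, and expired allocations are stopped inside the timer
// case itself instead of being pushed through a channel read by the same
// goroutine (which is what deadlocked).
func watchDisconnect(shortestTimeout time.Duration, heartbeatCh <-chan time.Time, stopAllocs func(), shutdownCh <-chan struct{}) {
	timer := time.NewTimer(shortestTimeout)
	defer timer.Stop()

	for {
		select {
		case <-heartbeatCh:
			// a heartbeat arrived: restart the window from now so the two
			// loops cannot drift into a ~2x wait
			if !timer.Stop() {
				select {
				case <-timer.C:
				default:
				}
			}
			timer.Reset(shortestTimeout)

		case <-timer.C:
			// the window elapsed without a heartbeat: stop every allocation
			// whose disconnect.stop_on_client_after has expired, right here
			// rather than via a buffered channel
			stopAllocs()
			timer.Reset(shortestTimeout)

		case <-shutdownCh:
			return
		}
	}
}
```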
Fixes: https://github.com/hashicorp/nomad/issues/24679
Ref: https://hashicorp.atlassian.net/browse/NMD-407
When a node is garbage collected, any dynamic host volumes on the node are
orphaned in the state store. We generally don't want to automatically collect
these volumes and risk data loss, and have provided a CLI flag to `-force`
remove them in #25902. But for clusters running on ephemeral cloud
instances (ex. AWS EC2 in an autoscaling group), deleting host volumes may add
excessive friction. Add a configuration knob to the client configuration to
remove host volumes from the state store on node GC.
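A rough sketch of the knob's effect during node GC; the configuration name, store interface, and calls here are hypothetical stand-ins for the real code:

```go
package main

// clientGCConfig and volumeStore are hypothetical stand-ins.
type clientGCConfig struct {
	GCVolumesOnNodeGC bool
}

type volumeStore interface {
	HostVolumesByNode(nodeID string) ([]string, error)
	DeleteHostVolume(id string) error
}

// gcNodeVolumes sketches the behaviour: when the knob is enabled, garbage
// collecting a node also deletes its dynamic host volumes from the state
// store instead of leaving them orphaned.
func gcNodeVolumes(cfg clientGCConfig, store volumeStore, nodeID string) error {
	if !cfg.GCVolumesOnNodeGC {
		return nil // default: keep volumes to avoid accidental data loss
	}
	ids, err := store.HostVolumesByNode(nodeID)
	if err != nil {
		return err
	}
	for _, id := range ids {
		if err := store.DeleteHostVolume(id); err != nil {
			return err
		}
	}
	return nil
}
```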
Ref: https://github.com/hashicorp/nomad/pull/25902
Ref: https://github.com/hashicorp/nomad/issues/25762
Ref: https://hashicorp.atlassian.net/browse/NMD-705
* Set MaxAllocations in client config
* Add NodeAllocationTracker struct to Node struct
* Evaluate MaxAllocations in AllocsFit function
* Set up cli config parsing
* Integrate maxAllocs into AllocatedResources view
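A sketch of the AllocsFit evaluation from the list above, with approximate field names standing in for the real node fields:

```go
package main

import "fmt"

// nodeStub stands in for the real Node struct.
type nodeStub struct {
	ID             string
	MaxAllocations int // 0 means unlimited
}

// allocsFit sketches the extension: in addition to resource checks, reject
// a placement when the node's configured maximum allocation count would be
// exceeded.
func allocsFit(node *nodeStub, proposedCount int) (bool, string) {
	if node.MaxAllocations > 0 && proposedCount > node.MaxAllocations {
		return false, fmt.Sprintf("max allocations exceeded (%d > %d)",
			proposedCount, node.MaxAllocations)
	}
	// ... existing CPU, memory, disk, and network checks follow
	return true, ""
}
```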
Co-authored-by: Tim Gross <tgross@hashicorp.com>
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Collecting metrics from processes is expensive, especially on platforms like
Windows. The executor code has a 5s cache of stats to ensure that we don't
thrash syscalls on nodes running many allocations. But the timestamp used to
calculate TTL of this cache was never being set, so we were always treating it
as expired. This causes excess CPU utilization on client nodes.
Ensure that when we fill the cache, we set the timestamp. In testing on Windows,
this reduces executor CPU overhead by roughly 75%.
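A sketch of the cache fix, with a stand-in for the executor's stats structure:

```go
package main

import (
	"sync"
	"time"
)

const statsCacheTTL = 5 * time.Second

// statsCache sketches the fix: record the timestamp when the cache is
// filled so the TTL check actually works, instead of always treating the
// cached stats as expired and re-walking every process.
type statsCache struct {
	mu      sync.Mutex
	stats   map[int]uint64 // pid -> cpu/memory stats stand-in
	fetched time.Time
}

func (c *statsCache) get(collect func() map[int]uint64) map[int]uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()

	if c.stats != nil && time.Since(c.fetched) < statsCacheTTL {
		return c.stats
	}
	c.stats = collect()
	c.fetched = time.Now() // the missing assignment: mark when the cache was filled
	return c.stats
}
```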
This changeset includes two other related items:
* The `telemetry.publish_allocation_metrics` field correctly prevents a node
from publishing metrics, but the stats hook on the taskrunner still collects
the metrics, which can be expensive. Thread the configuration value into the
stats hook so that we don't collect if `telemetry.publish_allocation_metrics =
false`.
* The `linuxProcStats` type in the executor's `procstats` package is misnamed as
a result of a couple rounds of refactoring. It's used by all task executors,
not just Linux. Rename this and move a comment about how Windows processes are
listed so that the comment is closer to where the logic is implemented.
Fixes: https://github.com/hashicorp/nomad/issues/23323
Fixes: https://hashicorp.atlassian.net/browse/NMD-455
In #24658 we fixed a bug around client restarts where we would not assert
network namespaces existed and were properly configured when restoring
allocations. We introduced a call to the CNI `Check` method so that the plugins
could report correct config. But when we get an error from this call, we don't
log it unless the error is fatal. This makes it challenging to debug the case
where the initial check fails but we tear down the network and try again (as
described in #25510). Add a noisy log line here.
Ref: https://github.com/hashicorp/nomad/pull/24658
Ref: https://github.com/hashicorp/nomad/issues/25510
Fix the checking of the staging path against the mountRoot on
the host rather than against the containerMountPoint, which
(probably) never exists on the host, causing it to fall back to
the legacy behaviour.
Some of our allocrunner hooks require a task environment for interpolating values based on the node or allocation. But several of the hooks accept an already-built environment or builder and then keep that in memory. Both of these retain a copy of all the node attributes and allocation metadata, which balloons memory usage until the allocation is GC'd.
While we'd like to look into ways to avoid keeping the allocrunner around entirely (see #25372), for now we can significantly reduce memory usage by creating the task environment on-demand when calling allocrunner methods, rather than persisting it in the allocrunner hooks.
In doing so, we uncover two other bugs:
* The WID manager, the group service hook, and the checks hook have to interpolate services for specific tasks. They mutated a taskenv builder to do so, but each time they mutate the builder, they write to the same environment map. When a group has multiple tasks, it's possible for one task to set an environment variable that would then be interpolated in the service definition for another task if that task did not have that environment variable. Only the service definition interpolation is impacted. This does not leak env vars across running tasks, as each taskrunner has its own builder.
To fix this, we move the `UpdateTask` method off the builder and onto the taskenv as the `WithTask` method. This makes a shallow copy of the taskenv with a deep clone of the environment map used for interpolation, and then overwrites the environment from the task (see the sketch after this list).
* The checks hook interpolates Nomad native service checks only on `Prerun` and not on `Update`. This could cause unexpected deregistration and registration of checks during in-place updates. To fix this, we make sure we interpolate in the `Update` method.
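A minimal sketch of the `WithTask` copy-on-write behaviour described in the first item above, with stand-in types in place of the real taskenv package:

```go
package main

import "maps"

// taskEnv and taskStub are stand-ins for the real taskenv types.
type taskEnv struct {
	EnvMap map[string]string
}

type taskStub struct {
	Name string
	Env  map[string]string
}

// WithTask makes a shallow copy of the environment with its own cloned map,
// then overlays the task's variables, so interpolating one task's services
// can no longer leak variables into another task's service definitions.
func (t *taskEnv) WithTask(task *taskStub) *taskEnv {
	copied := *t // shallow copy of the environment
	copied.EnvMap = make(map[string]string, len(t.EnvMap)+len(task.Env))
	maps.Copy(copied.EnvMap, t.EnvMap) // deep clone of the shared map
	maps.Copy(copied.EnvMap, task.Env) // overwrite with this task's variables
	return &copied
}
```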
I also bumped into an incorrectly implemented interface in the CSI hook. I've pulled that and some better guardrails out to https://github.com/hashicorp/nomad/pull/25472.
Fixes: https://github.com/hashicorp/nomad/issues/25269
Fixes: https://hashicorp.atlassian.net/browse/NET-12310
Ref: https://github.com/hashicorp/nomad/issues/25372
While working on #25373, I noticed that the CSI hook's `Destroy` method doesn't
match the interface, which means it never gets called. Because this method only
cancels any in-flight CSI requests, the only impact of this bug is that any CSI
RPCs that are in-flight when an alloc is GC'd on the client or a dev agent is
shut down won't be interrupted gracefully.
Fix the interface, but also make static assertions for all the allocrunner hooks
in the production code, so that you can make changes to interfaces and have
compile-time assistance in avoiding mistakes.
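The compile-time assertion idiom looks like the following sketch; the hook and interface names are illustrative rather than the real allocrunner interfaces:

```go
package main

// Illustrative hook interfaces; the real ones live in the allocrunner
// interfaces package.
type RunnerPrerunHook interface{ Prerun() error }
type RunnerDestroyHook interface{ Destroy() error }

type csiHook struct{}

func (h *csiHook) Prerun() error  { return nil }
func (h *csiHook) Destroy() error { return nil }

// If csiHook stops satisfying an interface (for example, a Destroy
// signature drifts), the build fails instead of the hook being silently
// skipped at runtime.
var (
	_ RunnerPrerunHook  = (*csiHook)(nil)
	_ RunnerDestroyHook = (*csiHook)(nil)
)
```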
Ref: https://github.com/hashicorp/nomad/pull/25373
* test: use statedb factory
Swapping fields on Client after it has been created is a race.
* test: lock before checking heartbeat state
Fixes races
* test: fix races by copying fsm objects
A common source of data races in tests is when they insert a fixture
directly into memdb and then later mutate the object. Since objects in
the state store are readonly, any later mutation is a data race.
* test: lock when peeking at eval stats
* test: lock when peeking at serf state
* test: lock when looking at stats
* test: fix default eval broker state test
The test was not applying the config callback. In addition the test
raced against the configuration being applied. Waiting for the keyring
to be initialized resolved the race in my testing, but given the high
concurrency of the various leadership subsystems it's possible it may
still flake.
This change removes any blocking calls to destroyAllocRunner,
which caused Nomad clients to block when running allocations in
certain scenarios. In addition, this change consolidates client
GC by removing the MakeRoomFor method, which is redundant with
keepUsageBelowThreshold.
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>