Fix the check of the staging path against the mountRoot on the host
rather than checking against the containerMountPoint, which (probably)
never exists on the host, causing it to fall back to the legacy
behaviour.
Some of our allocrunner hooks require a task environment for interpolating values based on the node or allocation. But several of the hooks accept an already-built environment or builder and then keep that in memory. Both of these retain a copy of all the node attributes and allocation metadata, which balloons memory usage until the allocation is GC'd.
While we'd like to look into ways to avoid keeping the allocrunner around entirely (see #25372), for now we can significantly reduce memory usage by creating the task environment on-demand when calling allocrunner methods, rather than persisting it in the allocrunner hooks.
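To illustrate the shape of that change, here's a minimal sketch (the hook and type names are invented for the example, not Nomad's actual code):

```go
package main

import "fmt"

// TaskEnv stands in for the interpolation environment, which retains copies
// of node attributes and allocation metadata.
type TaskEnv struct{ EnvMap map[string]string }

// Before: the hook keeps the built environment for the allocation's lifetime.
type hookBefore struct {
	env *TaskEnv // retained until the alloc is GC'd
}

// After: the hook keeps only a small factory and builds the environment on
// demand inside each method, so it can be collected when the call returns.
type hookAfter struct {
	newEnv func() *TaskEnv
}

func (h *hookAfter) Prerun() error {
	env := h.newEnv() // built on demand, garbage once Prerun returns
	fmt.Println(len(env.EnvMap), "vars available for interpolation")
	return nil
}

func main() {
	h := &hookAfter{newEnv: func() *TaskEnv {
		return &TaskEnv{EnvMap: map[string]string{"NOMAD_DC": "dc1"}}
	}}
	_ = h.Prerun()
}
```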
In doing so, we uncover two other bugs:
* The WID manager, the group service hook, and the checks hook have to interpolate services for specific tasks. They mutate a taskenv builder to do so, but each mutation writes to the same underlying environment map. When a group has multiple tasks, it's possible for one task to set an environment variable that is then interpolated into the service definition for another task that did not set that variable. Only service definition interpolation is impacted; this does not leak env vars across running tasks, as each taskrunner has its own builder.
To fix this, we move the `UpdateTask` method off the builder and onto the taskenv as the `WithTask` method. This makes a shallow copy of the taskenv with a deep clone of the environment map used for interpolation, and then overwrites the environment from the task (see the sketch after this list).
* The checks hook interpolates Nomad native service checks only on `Prerun` and not on `Update`. This could cause unexpected deregistration and registration of checks during in-place updates. To fix this, we make sure we interpolate in the `Update` method.
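As referenced above, here is a minimal sketch of a copy-on-write `WithTask`; the struct fields and the `Task` type are simplified stand-ins for the real `taskenv` package:

```go
package main

import "fmt"

// TaskEnv and Task are simplified stand-ins.
type TaskEnv struct {
	EnvMap map[string]string
}

type Task struct {
	Name string
	Env  map[string]string
}

// WithTask returns a shallow copy of the TaskEnv with a deep clone of the
// environment map, overlaid with the task's own environment, so per-task
// interpolation never writes into the shared map.
func (t *TaskEnv) WithTask(task *Task) *TaskEnv {
	out := *t // shallow copy of the struct
	out.EnvMap = make(map[string]string, len(t.EnvMap)+len(task.Env))
	for k, v := range t.EnvMap {
		out.EnvMap[k] = v // deep clone of the shared map
	}
	for k, v := range task.Env {
		out.EnvMap[k] = v // task-specific values overwrite
	}
	return &out
}

func main() {
	base := &TaskEnv{EnvMap: map[string]string{"NOMAD_REGION": "global"}}
	web := base.WithTask(&Task{Name: "web", Env: map[string]string{"PORT": "8080"}})
	db := base.WithTask(&Task{Name: "db", Env: map[string]string{"PORT": "5432"}})
	fmt.Println(web.EnvMap["PORT"], db.EnvMap["PORT"], base.EnvMap["PORT"]) // 8080 5432 ""
}
```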
I also bumped into an incorrectly implemented interface in the CSI hook. I've pulled that and some better guardrails out to https://github.com/hashicorp/nomad/pull/25472.
Fixes: https://github.com/hashicorp/nomad/issues/25269
Fixes: https://hashicorp.atlassian.net/browse/NET-12310
Ref: https://github.com/hashicorp/nomad/issues/25372
While working on #25373, I noticed that the CSI hook's `Destroy` method doesn't
match the interface, which means it never gets called. Because this method only
cancels any in-flight CSI requests, the only impact of this bug is that any CSI
RPCs that are in-flight when an alloc is GC'd on the client or a dev agent is
shut down won't be interrupted gracefully.
Fix the interface, but also make static assertions for all the allocrunner hooks
in the production code, so that you can make changes to interfaces and have
compile-time assistance in avoiding mistakes.
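For illustration, the compile-time guard looks roughly like this (the hook and interface names are simplified stand-ins for the real allocrunner interfaces):

```go
package main

// RunnerDestroyHook is a stand-in for the allocrunner hook interface.
type RunnerDestroyHook interface {
	Name() string
	Destroy() error
}

type csiHook struct{}

func (h *csiHook) Name() string   { return "csi" }
func (h *csiHook) Destroy() error { return nil }

// Static assertion: if csiHook's method set ever drifts from the interface
// (e.g. a signature change), this fails to compile instead of the hook
// silently never being called.
var _ RunnerDestroyHook = (*csiHook)(nil)

func main() {}
```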
Ref: https://github.com/hashicorp/nomad/pull/25373
* test: use statedb factory
Swapping fields on Client after it has been created is a race.
* test: lock before checking heartbeat state
Fixes races
* test: fix races by copying fsm objects
A common source of data races in tests is when they insert a fixture
directly into memdb and then later mutate the object. Since objects in
the state store are read-only, any later mutation is a data race (see the copy-before-mutate sketch after this list).
* test: lock when peeking at eval stats
* test: lock when peeking at serf state
* test: lock when looking at stats
* test: fix default eval broker state test
The test was not applying the config callback. In addition, the test
raced against the configuration being applied. Waiting for the keyring
to be initialized resolved the race in my testing, but given the high
concurrency of the various leadership subsystems it's possible it may
still flake.
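As mentioned in the copy-fixture item above, a hedged sketch of that pattern (the type and its `Copy` method are illustrative; Nomad's state store objects have their own copy helpers):

```go
package main

import "fmt"

// Job is a stand-in for a state store object. Anything handed to memdb must
// be treated as read-only from then on.
type Job struct {
	ID       string
	Priority int
}

// Copy returns a deep copy the test is free to mutate.
func (j *Job) Copy() *Job {
	if j == nil {
		return nil
	}
	c := *j
	return &c
}

func main() {
	fixture := &Job{ID: "example", Priority: 50}
	// ... insert fixture into the state store here ...

	// Wrong: mutating the object memdb now holds is a data race.
	// fixture.Priority = 80

	// Right: mutate a copy and write the copy back through the state store.
	updated := fixture.Copy()
	updated.Priority = 80
	fmt.Println(fixture.Priority, updated.Priority) // 50 80
}
```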
This change removes any blocking calls to destroyAllocRunner, which
caused Nomad clients to block when running allocations in certain
scenarios. In addition, this change consolidates client GC by removing
the MakeRoomFor method, which is redundant with keepUsageBelowThreshold.
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Errors from `volume create` or `volume delete` only get logged by the client
agent, which may make it harder for volume authors to debug these operations if
they are not also the cluster administrator with access to host logs.
Allow plugins to include an optional error message in their response. Because we
can't count on receiving this response (the error could come before the plugin
executes), we parse this message optimistically and include it only if
available.
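A minimal sketch of that optimistic parsing, with an assumed response shape (the `error` field name here is illustrative, not necessarily the plugin protocol's exact schema):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pluginResponse is an assumed response shape with an optional error field.
type pluginResponse struct {
	Error string `json:"error,omitempty"`
}

// pluginError returns the plugin-provided message if one can be decoded, or
// an empty string when the output isn't usable (e.g. the plugin failed
// before producing a response at all).
func pluginError(stdout []byte) string {
	var resp pluginResponse
	if err := json.Unmarshal(stdout, &resp); err != nil {
		return "" // no usable response; fall back to the plain exec error
	}
	return resp.Error
}

func main() {
	fmt.Printf("%q\n", pluginError([]byte(`{"error": "device is busy"}`)))
	fmt.Printf("%q\n", pluginError([]byte("garbled output")))
}
```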
Ref: https://hashicorp.atlassian.net/browse/NET-12087
When a CSI plugin is launched, we probe it until the csi_plugin.health_timeout
expires (by default 30s). But if the plugin never becomes healthy, we're not
restarting the task as documented.
Update the plugin supervisor to trigger a restart instead. We still exit the
supervisor loop at that point to avoid having the supervisor send probes to a
task that isn't running yet. This requires reworking the poststart hook to allow
the supervisor loop to be restarted when the task restarts.
In doing so, I identified that we weren't respecting the task kill context from
the post start hook, which would leave the supervisor running in the window
between when a task is killed because it failed and its stop hooks were
triggered. Combine the two contexts to make sure we stop the supervisor
whichever context gets closed first.
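One way to combine two cancellation signals in Go looks roughly like this; it's a generic sketch, not the supervisor's exact code:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// mergeDone returns a context that is canceled as soon as either parent is
// done, so the supervisor stops on whichever signal fires first.
func mergeDone(a, b context.Context) (context.Context, context.CancelFunc) {
	ctx, cancel := context.WithCancel(context.Background())
	go func() {
		select {
		case <-a.Done():
		case <-b.Done():
		case <-ctx.Done():
		}
		cancel()
	}()
	return ctx, cancel
}

func main() {
	shutdownCtx, shutdown := context.WithCancel(context.Background()) // agent shutdown
	killCtx, kill := context.WithCancel(context.Background())         // task kill
	defer shutdown()

	supervisorCtx, stop := mergeDone(shutdownCtx, killCtx)
	defer stop()

	// Simulate the task being killed before its stop hooks run.
	go func() {
		time.Sleep(10 * time.Millisecond)
		kill()
	}()

	<-supervisorCtx.Done()
	fmt.Println("supervisor stopped")
}
```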
Fixes: https://github.com/hashicorp/nomad/issues/25293
Ref: https://hashicorp.atlassian.net/browse/NET-12264
The group-level fields stop_after_client_disconnect,
max_client_disconnect, and prevent_reschedule_on_lost were deprecated in
Nomad 1.8 and replaced by fields in the disconnect block. This change
removes any logic related to those deprecated fields.
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Before the fixes in #20165, the wait feature was disabled by
default. After these changes, it's always enabled, which, at
least on some platforms, leads to a significant increase in
load (5-7x).
This patch allows disabling the wait feature in the client
stanza of the configuration file by setting min and max to 0:
wait {
  min = "0"
  max = "0"
}
Per-template wait blocks in the task description still work like
one would expect.
When a node is fingerprinted, we calculate a "computed class" from a hash over a
subset of its fields and attributes. In the scheduler, when a given node fails
feasibility checking (before fit checking) we know that no other node of that
same class will be feasible, and we add the hash to a map so we can reject them
early. This hash cannot include any values that are unique to a given node,
otherwise no other node will have the same hash and we'll never save ourselves
the work of feasibility checking those nodes.
In #4390 we introduced the `nomad.advertise.address` attribute and in #19969 we
introduced the `consul.dns.addr` attribute. Both of these are unique per node
and break the hash.
Additionally, when checking whether a node has escaped its class, we were not
correctly filtering out attributes that start with `unique.`. The test for this
introduced in #708 had an inverted assertion, which allowed this to pass
unnoticed since the early days of Nomad.
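A toy illustration of why node-unique attributes must be excluded from the class hash (the hashing scheme here is invented for the example; Nomad's real computed class uses a different hash):

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
	"strings"
)

// computedClass hashes only attributes that are shared across nodes. Anything
// in the "unique." namespace (or otherwise node-specific, like
// nomad.advertise.address) must be excluded, or no two nodes will ever share
// a class and the scheduler can never skip feasibility checks.
func computedClass(attrs map[string]string) uint64 {
	keys := make([]string, 0, len(attrs))
	for k := range attrs {
		if strings.HasPrefix(k, "unique.") {
			continue // node-specific, never part of the class
		}
		keys = append(keys, k)
	}
	sort.Strings(keys) // stable input order for the hash
	h := fnv.New64a()
	for _, k := range keys {
		fmt.Fprintf(h, "%s=%s;", k, attrs[k])
	}
	return h.Sum64()
}

func main() {
	a := map[string]string{"cpu.arch": "amd64", "unique.hostname": "node-a"}
	b := map[string]string{"cpu.arch": "amd64", "unique.hostname": "node-b"}
	fmt.Println(computedClass(a) == computedClass(b)) // true: same class
}
```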
Ref: https://github.com/hashicorp/nomad/pull/708
Ref: https://github.com/hashicorp/nomad/pull/4390
Ref: https://github.com/hashicorp/nomad/pull/19969
The legacy workflow for Vault whereby servers were configured
using a token to provide authentication to the Vault API has now
been removed. This change also removes the workflow where servers
were responsible for deriving Vault tokens for Nomad clients.
The deprecated Vault config options used by the Nomad agent have
all been removed except for "token" which is still in use by the
Vault Transit keyring implementation.
Job specification authors can no longer use the "vault.policies"
parameter and should instead use "vault.role" when not using the
default workload identity.
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
In #24526 we updated Consul and Vault fingerprinting so that we no longer
periodically fingerprint. In #25102 we made it so that we fingerprint
periodically on start until the first successful fingerprint, in order to tolerate Consul
or Vault not being available on start. For clusters not running Consul, this
leads to a warn-level log every 15s. This same log exists for Vault, but Vault
support is opt-in via `vault.enable = true` whereas you have to manually disable
the fingerprinter for Consul.
Make it so that we only log a failed Consul fingerprint once per Consul
cluster. Reset the gate on this once we have a successful fingerprint, so that
we get the logs after a reload if Consul is unavailable.
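A hedged sketch of the per-cluster logging gate (field and function names are illustrative):

```go
package main

import "fmt"

// fingerprinter gates the warn-level log so each Consul cluster produces only
// one failure log until a fingerprint succeeds again.
type fingerprinter struct {
	logged map[string]bool // cluster name -> already warned
}

func (f *fingerprinter) fingerprintCluster(cluster string, query func() error) {
	if err := query(); err != nil {
		if !f.logged[cluster] {
			fmt.Printf("[WARN] could not fingerprint Consul cluster %q: %v\n", cluster, err)
			f.logged[cluster] = true // suppress repeats every period
		}
		return
	}
	// Success: reset the gate so a later outage (e.g. after a reload) is
	// logged again.
	f.logged[cluster] = false
}

func main() {
	f := &fingerprinter{logged: map[string]bool{}}
	fail := func() error { return fmt.Errorf("connection refused") }
	ok := func() error { return nil }

	f.fingerprintCluster("default", fail) // logs once
	f.fingerprintCluster("default", fail) // suppressed
	f.fingerprintCluster("default", ok)   // resets the gate
	f.fingerprintCluster("default", fail) // logs again
}
```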
Ref: https://github.com/hashicorp/nomad/pull/24526
Ref: https://github.com/hashicorp/nomad/pull/25102
Fixes: https://github.com/hashicorp/nomad/issues/25181
In #20165 we fixed a bug where a partially configured `client.template` retry
block would set any unset fields to nil instead of their default values. But
this patch introduced a regression in the default values, so we were now
defaulting to unlimited retries if the retry block was unset. Restore the
correct behavior and add better test coverage in both the config parsing and
template configuration code.
Ref: https://github.com/hashicorp/nomad/pull/20165
Ref: https://github.com/hashicorp/nomad/issues/23305#issuecomment-2643731565
In #24526 we updated the Consul and Vault fingerprints so that they are no
longer periodic. This fixed a problem that cluster admins reported where rolling
updates of Vault or Consul would cause a thundering herd of fingerprint updates
across the whole cluster.
But if Consul/Vault is not available during the initial fingerprint, it will
never get fingerprinted again. This is challenging for cluster updates and black
starts because the implicit service startup ordering may require
reloads. Instead, have the fingerprinter run periodically but mark that it has
made its first successful fingerprint of all Consul/Vault clusters. At that
point, we can skip further periodic updates. The `Reload` method will reset the
mark and allow the subsequent fingerprint to run normally.
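Roughly, the idea looks like this (the names and the simplified `Periodic` signature are assumptions, not the real fingerprint interface):

```go
package main

import "fmt"

type consulFingerprint struct {
	initialized bool // set after the first successful fingerprint of all clusters
}

// Periodic reports whether the fingerprint should keep running on a timer.
// Once every cluster has been fingerprinted successfully we stop polling.
func (f *consulFingerprint) Periodic() bool {
	return !f.initialized
}

func (f *consulFingerprint) Fingerprint(query func() error) {
	if err := query(); err != nil {
		return // stay periodic until Consul becomes reachable
	}
	f.initialized = true
}

// Reload clears the mark so the next fingerprint runs normally, e.g. when
// the agent config changes.
func (f *consulFingerprint) Reload() {
	f.initialized = false
}

func main() {
	f := &consulFingerprint{}
	fmt.Println(f.Periodic()) // true: keep polling
	f.Fingerprint(func() error { return nil })
	fmt.Println(f.Periodic()) // false: skip further periodic updates
	f.Reload()
	fmt.Println(f.Periodic()) // true again after a reload
}
```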
Fixes: https://github.com/hashicorp/nomad/issues/25097
Ref: https://github.com/hashicorp/nomad/pull/24526
Ref: https://github.com/hashicorp/nomad/issues/24049
In #24650 we switched to using ephemeral state for CNI plugins, so that when a
host reboots and we lose all the allocations we don't end up trying to use IPs
we created in network namespaces we just destroyed. Unfortunately upgrade
testing missed that in a non-reboot scenario, the existing CNI state was being
used by plugins like the ipam plugin to hand out the "next available" IP
address. So with no state carried over, we might allocate new addresses that
conflict with existing allocations. (This can be avoided by draining the node
first.)
As a compatibility shim, copy the old CNI state directory to the new CNI state
directory during agent startup, if the new CNI state directory doesn't already
exist.
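A minimal sketch of such a startup shim; the directory paths in `main` are hypothetical, and the real migration depends on the agent's data directory layout:

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// copyCNIStateDir copies the legacy CNI state directory to the new location
// once, so existing IPAM state (e.g. the "next available" IP) is preserved.
// It is a no-op if the new directory already exists.
func copyCNIStateDir(oldDir, newDir string) error {
	if _, err := os.Stat(newDir); err == nil {
		return nil // already migrated (or created fresh)
	}
	entries, err := os.ReadDir(oldDir)
	if err != nil {
		if os.IsNotExist(err) {
			return nil // nothing to migrate
		}
		return err
	}
	if err := os.MkdirAll(newDir, 0o700); err != nil {
		return err
	}
	for _, e := range entries {
		if e.IsDir() {
			continue // state files are treated as flat in this sketch
		}
		if err := copyFile(filepath.Join(oldDir, e.Name()), filepath.Join(newDir, e.Name())); err != nil {
			return err
		}
	}
	return nil
}

func copyFile(src, dst string) error {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()
	out, err := os.Create(dst)
	if err != nil {
		return err
	}
	defer out.Close()
	_, err = io.Copy(out, in)
	return err
}

func main() {
	// Hypothetical paths; the real directories depend on the agent's data_dir.
	fmt.Println(copyCNIStateDir("/var/lib/cni/networks", "/var/run/nomad/cni"))
}
```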
Ref: https://github.com/hashicorp/nomad/pull/24650
When a blocking query on the client hits a retryable error, we change the max
query time so that it falls within the `RPCHoldTimeout` timeout. But when the
retry succeeds we don't reset it to the original value.
Because the calls to `Node.GetClientAllocs` reuse the same request struct
instead of reallocating it, any retry will cause the agent to poll at a faster
frequency until the agent restarts. No other current RPC on the client has this
behavior, but we'll fix this in the `rpc` method rather than in the caller so
that any future users of the `rpc` method don't have to remember this detail.
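A hedged sketch of the save-and-restore this implies inside the retry loop (struct and field names are simplified):

```go
package main

import (
	"fmt"
	"time"
)

type QueryOptions struct {
	MaxQueryTime time.Duration
}

// rpcWithRetry shrinks the blocking-query window while retrying so the whole
// attempt fits within holdTimeout, then restores the caller's value so a
// reused request struct doesn't keep the shortened wait forever.
func rpcWithRetry(opts *QueryOptions, holdTimeout time.Duration, call func() error) error {
	origMaxQueryTime := opts.MaxQueryTime
	defer func() { opts.MaxQueryTime = origMaxQueryTime }()

	for attempt := 0; attempt < 3; attempt++ {
		if err := call(); err == nil {
			return nil
		}
		// Retryable error: cap the blocking time for the next attempt.
		if opts.MaxQueryTime > holdTimeout {
			opts.MaxQueryTime = holdTimeout
		}
	}
	return fmt.Errorf("rpc failed after retries")
}

func main() {
	opts := &QueryOptions{MaxQueryTime: 5 * time.Minute}
	calls := 0
	_ = rpcWithRetry(opts, 5*time.Second, func() error {
		calls++
		if calls < 2 {
			return fmt.Errorf("retryable")
		}
		return nil
	})
	fmt.Println(opts.MaxQueryTime) // 5m0s: restored for the next long-poll
}
```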
Fixes: https://github.com/hashicorp/nomad/issues/25033
* Upgrade to using hashicorp/go-metrics@v0.5.4
This also requires bumping the dependencies for:
* memberlist
* serf
* raft
* raft-boltdb
* (and indirectly hashicorp/mdns due to the memberlist or serf update)
Unlike some other HashiCorp products, Nomad's root module is currently expected to be consumed by others. This means it needs to be treated more like our libraries and upgraded to hashicorp/go-metrics by using its compat packages. This allows those importing the root module to control the metrics module used via build tags.
When we implemented CSI, the types of the fields for access mode and attachment
mode on volume requests were defined with a prefix "CSI". This gets confusing
now that we have dynamic host volumes using the same fields. Fortunately the
original was a typedef on string, and the Go API in the `api` package just uses
strings directly, so we can change the name of the type without breaking
backwards compatibility for the msgpack wire format.
Update the names to `VolumeAccessMode` and `VolumeAttachmentMode`. Keep the CSI
and DHV specific value constant names for these fields (they aren't currently
1:1), so that we can easily differentiate in a given bit of code which values
are valid.
Ref: https://github.com/hashicorp/nomad/pull/24881#discussion_r1920702890
The default node configuration in the client should always set an empty
HostVolumes map. Otherwise callers can panic, e.g.:
goroutine 179 [running]:
github.com/hashicorp/nomad/client/hostvolumemanager.UpdateVolumeMap({0x36042b0, 0xc000c62a80}, 0x0, {0xc000a802a0, 0xd}, 0xc000691940)
github.com/hashicorp/nomad/client/hostvolumemanager/volume_fingerprint.go:43 +0x1b2
github.com/hashicorp/nomad/client.(*Client).batchFirstFingerprints.func1({0xc000a802a0, 0xd}, 0xc000691940)
github.com/hashicorp/nomad/client/node_updater.go:54 +0xd7
github.com/hashicorp/nomad/client.(*batchNodeUpdates).batchHostVolumeUpdates(0xc000912608?, 0xc0009f2f88)
github.com/hashicorp/nomad/client/node_updater.go:417 +0x152
github.com/hashicorp/nomad/client.(*Client).batchFirstFingerprints(0xc000c2d188)
github.com/hashicorp/nomad/client/node_updater.go:53 +0x1c5
created by github.com/hashicorp/nomad/client.NewClient in goroutine 1
github.com/hashicorp/nomad/client/client.go:557 +0x2069
The above is a panic of the host volume manager (HVM) when restarting a client
that doesn't have any static host volumes, but does have a dynamic host volume.
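A minimal illustration of the nil-map hazard and the default that avoids it (types are simplified stand-ins):

```go
package main

import "fmt"

type ClientHostVolumeConfig struct {
	Name string
	Path string
}

type Node struct {
	HostVolumes map[string]*ClientHostVolumeConfig
}

// updateVolumeMap adds a fingerprinted volume to the node. If the node was
// built with a nil map (no static host volumes configured), writing to it
// panics with "assignment to entry in nil map".
func updateVolumeMap(node *Node, name string, vol *ClientHostVolumeConfig) {
	node.HostVolumes[name] = vol
}

func main() {
	// The fix: always initialize the map in the default node configuration.
	node := &Node{HostVolumes: map[string]*ClientHostVolumeConfig{}}
	updateVolumeMap(node, "scratch", &ClientHostVolumeConfig{Name: "scratch", Path: "/opt/scratch"})
	fmt.Println(len(node.HostVolumes))
}
```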
The output of `GetDynamicHostVolumes` is a slice but that slice is constructed
from iterating over a map and isn't sorted. Sort the output in the test to
eliminate a test flake.
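The test-side fix is the usual sort-before-compare pattern; a small sketch with stand-in types:

```go
package main

import (
	"fmt"
	"sort"
)

type HostVolume struct{ Name string }

// getDynamicHostVolumes is a stand-in for a function that builds its result
// by ranging over a map, so the slice order changes from run to run.
func getDynamicHostVolumes(vols map[string]HostVolume) []HostVolume {
	out := make([]HostVolume, 0, len(vols))
	for _, v := range vols {
		out = append(out, v)
	}
	return out
}

func main() {
	got := getDynamicHostVolumes(map[string]HostVolume{
		"b": {Name: "b"}, "a": {Name: "a"}, "c": {Name: "c"},
	})
	// In the test, sort before comparing so map iteration order can't flake it.
	sort.Slice(got, func(i, j int) bool { return got[i].Name < got[j].Name })
	fmt.Println(got)
}
```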
When we register a volume without a plugin, we need to send a client RPC so that
the node fingerprint can be updated. The registered volume also needs to be
written to client state so that we can restore the fingerprint after a restart.
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
When the Nomad client restarts and restores allocations, the network namespace
for an allocation may exist but no longer be correctly configured. For example,
if the host is rebooted and the task was a Docker task using a pause container,
the network namespace may be recreated by the docker daemon.
When we restore an allocation, use the CNI "check" command to verify that any
existing network namespace matches the expected configuration. This requires CNI
plugins of at least version 1.2.0 to avoid a bug in older plugin versions that
would cause the check to fail.
If the check fails, destroy the network namespace and try to recreate it from
scratch once. If that fails in the second pass, fail the restore so that the
allocation can be recreated (rather than silently having networking fail).
This should fix the gap left by #24650 for Docker task drivers and any other
drivers with the `MustInitiateNetwork` capability.
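Sketching the restore flow described above (the interface and its methods are stand-ins for the real CNI calls, which require plugins >= 1.2.0 for a reliable check):

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// netConfigurator stands in for the CNI operations used during restore.
type netConfigurator interface {
	Check(ctx context.Context) error
	Teardown(ctx context.Context) error
	Setup(ctx context.Context) error
}

// restoreNetwork verifies an existing network namespace and rebuilds it once
// if the check fails; if that second pass also fails, the restore fails so
// the allocation can be recreated instead of running with broken networking.
func restoreNetwork(ctx context.Context, cni netConfigurator) error {
	if err := cni.Check(ctx); err == nil {
		return nil // existing namespace still matches the expected config
	}
	// Check failed: destroy the namespace and recreate it from scratch once.
	if err := cni.Teardown(ctx); err != nil {
		return fmt.Errorf("failing restore: could not tear down stale network namespace: %w", err)
	}
	if err := cni.Setup(ctx); err != nil {
		return fmt.Errorf("failing restore: could not recreate network namespace: %w", err)
	}
	return nil
}

type fakeCNI struct{ misconfigured bool }

func (f *fakeCNI) Check(context.Context) error {
	if f.misconfigured {
		return errors.New("namespace does not match expected config")
	}
	return nil
}
func (f *fakeCNI) Teardown(context.Context) error { return nil }
func (f *fakeCNI) Setup(context.Context) error    { f.misconfigured = false; return nil }

func main() {
	fmt.Println(restoreNetwork(context.Background(), &fakeCNI{misconfigured: true})) // <nil>
}
```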
Fixes: https://github.com/hashicorp/nomad/issues/24292
Ref: https://github.com/hashicorp/nomad/pull/24650
A node can have only one volume with a given name. The scheduler prevents
duplicates, but it can only do so after the server knows about the volume;
this change prevents duplicates from multiple concurrent creates issued
faster than the fingerprint/heartbeat interval.
Users may still modify an existing volume, but only if they set the `id` in
the volume spec and re-issue `nomad volume create`.
If a *static* volume is added to the config with a name already used by a
dynamic volume, the dynamic volume takes precedence, but a warning is logged.
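A rough sketch of the static-vs-dynamic precedence described above (types and the merge function are invented for the example):

```go
package main

import "fmt"

type VolumeSource int

const (
	SourceStatic VolumeSource = iota
	SourceDynamic
)

type hostVolume struct {
	Name   string
	Source VolumeSource
}

// mergeVolumes overlays static config volumes onto the dynamic set; when a
// name collides, the dynamic volume wins and a warning is logged.
func mergeVolumes(dynamic, static []hostVolume) map[string]hostVolume {
	out := map[string]hostVolume{}
	for _, v := range dynamic {
		out[v.Name] = v
	}
	for _, v := range static {
		if _, exists := out[v.Name]; exists {
			fmt.Printf("[WARN] static host volume %q shadowed by dynamic volume with the same name\n", v.Name)
			continue
		}
		out[v.Name] = v
	}
	return out
}

func main() {
	vols := mergeVolumes(
		[]hostVolume{{Name: "data", Source: SourceDynamic}},
		[]hostVolume{{Name: "data", Source: SourceStatic}, {Name: "logs", Source: SourceStatic}},
	)
	fmt.Println(len(vols)) // 2: "data" (dynamic) and "logs" (static)
}
```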
Instead of a plugin `version` subcommand that responds with a string
(established in #24497), respond to a `fingerprint` command with a data
structure that we may extend in the future (such as plugin capabilities,
like size constraint support?). In the immediate term, it's still just the
version: `{"version": "0.0.1"}`
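For illustration, parsing that payload could look like this; the struct mirrors the example response, and any future capability fields are hypothetical:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PluginFingerprint is the extensible response to the `fingerprint` command.
// Today it carries only the version; future fields (e.g. capabilities) can be
// added without changing the protocol.
type PluginFingerprint struct {
	Version string `json:"version"`
}

func main() {
	stdout := []byte(`{"version": "0.0.1"}`)
	var fp PluginFingerprint
	if err := json.Unmarshal(stdout, &fp); err != nil {
		fmt.Println("not a host volume plugin:", err)
		return
	}
	fmt.Println("plugin version:", fp.Version)
}
```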
In addition to leaving the door open for future expansion, I think it will
also avoid false positives detecting executables that just happen to
respond to a `version` command.
This also reverses the ordering of the fingerprint string parts
from `plugins.host_volume.version.mkdir` (which aligned with CNI)
to `plugins.host_volume.mkdir.version`, which makes more sense to me.
Store dynamic host volume creations in client state so they can be
"restored" on agent restart. Restore works by repeating the same Create
operation as the initial creation and expecting the plugin to be idempotent.
This is (potentially) especially important after host restarts,
which may have dropped mount points or similar.
Also ensure that the volume ID is UUID-shaped, so that user-provided input
like `id = "../../../"`, which is used as part of the target directory,
cannot find its way very far into the volume submission process.
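A hedged sketch of the ID shape check; using a regex here is an assumption, and the important property is just that path fragments like `../../../` are rejected before the ID is joined into a target path:

```go
package main

import (
	"fmt"
	"regexp"
)

// uuidRE matches the canonical 8-4-4-4-12 hex UUID shape.
var uuidRE = regexp.MustCompile(`^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$`)

// validVolumeID rejects user-provided IDs that could escape the volumes
// directory when joined into a target path.
func validVolumeID(id string) bool {
	return uuidRE.MatchString(id)
}

func main() {
	fmt.Println(validVolumeID("9f07a9a5-0d47-4972-9b62-3b1e9a9c5b3e")) // true
	fmt.Println(validVolumeID("../../../etc"))                         // false
}
```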
* mkdir: HostVolumePluginMkdir: just creates a directory
* example-host-volume: HostVolumePluginExternal:
a plugin script that runs mkfs and mounts a loopback device
Co-authored-by: Tim Gross <tgross@hashicorp.com>
This changeset implements the RPC handlers for Dynamic Host Volumes, including
the plumbing needed to forward requests to clients. The client-side
implementation is stubbed and will be done under a separate PR.
Ref: https://hashicorp.atlassian.net/browse/NET-11549