The group-level fields `stop_after_client_disconnect`,
`max_client_disconnect`, and `prevent_reschedule_on_lost` were deprecated in
Nomad 1.8 and replaced by fields in the `disconnect` block. This change
removes the logic related to those deprecated fields.
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
The legacy workflow for Vault whereby servers were configured
using a token to provide authentication to the Vault API has now
been removed. This change also removes the workflow where servers
were responsible for deriving Vault tokens for Nomad clients.
The deprecated Vault config options used by the Nomad agent have
all been removed except for "token" which is still in use by the
Vault Transit keyring implementation.
Job specification authors can no longer use the "vault.policies"
parameter and should instead use "vault.role" when not using the
default workload identity.
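As a minimal sketch of the migration using the Go `api` package (this assumes
the `Vault.Role` field as the replacement, and the role name is a placeholder):

```go
package main

import "github.com/hashicorp/nomad/api"

func main() {
	task := api.NewTask("web", "docker")
	// vault.policies is gone; reference a Vault role instead when not using
	// the default workload identity ("nomad-workloads" is a placeholder name).
	task.Vault = &api.Vault{Role: "nomad-workloads"}
}
```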
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
The `-type` option for `volume status` is a UX papercut because for many
clusters there will be only one sort of volume in use. Update the CLI so that
the default behavior is to query both CSI and dynamic host volumes (DHV).
The behavior differs subtly depending on whether the user provides an ID. If
the user doesn't provide an ID, we query both CSI and DHV and show both tables.
If the user provides an ID, we query DHV first and then CSI, and show only the
matching volume. Because DHV IDs are UUIDs, we can be sure there are no
collisions between the two. We only show errors if both queries return an error.
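A minimal sketch of that lookup order, with hypothetical query helpers standing
in for the real API calls:

```go
package main

import (
	"errors"
	"fmt"
)

// queryDHV and queryCSI are hypothetical stand-ins for the real API calls.
func queryDHV(id string) (string, error) { return "", errors.New("no DHV volume") }
func queryCSI(id string) (string, error) { return "example-csi-volume", nil }

// statusByID queries DHV first, then CSI, and returns an error only when
// both lookups fail.
func statusByID(id string) (string, error) {
	vol, dhvErr := queryDHV(id)
	if dhvErr == nil {
		return vol, nil
	}
	vol, csiErr := queryCSI(id)
	if csiErr == nil {
		return vol, nil
	}
	return "", errors.Join(dhvErr, csiErr)
}

func main() {
	vol, err := statusByID("c9ff8f0d-9713-465c-8b50-24cc7c0a2c7d")
	fmt.Println(vol, err)
}
```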
Fixes: https://hashicorp.atlassian.net/browse/NET-12214
Job authors need to be able to review what capabilities a dynamic host volume or
CSI volume has so that they can set the correct access mode and attachment mode
in their job. Add these to the CLI output of `volume status`.
Ref: https://hashicorp.atlassian.net/browse/NET-12063
Our vocabulary around scheduler behaviors outside of the `reschedule` and
`migrate` blocks leaves room for confusion around whether the reschedule tracker
should be propagated between allocations. There are effectively five different
behaviors we need to cover:
* restart: when the tasks of an allocation fail and we try to restart the tasks
in place.
* reschedule: when the `restart` block runs out of attempts (or the allocation
fails before tasks even start), and we need to move
the allocation to another node to try again.
* migrate: when the user has asked to drain a node and we need to move the
allocations. These are not failures, so we don't want to propagate the
reschedule tracker.
* replacement: when a node is lost, we don't count that against the `reschedule`
tracker for the allocations on the node (it's not the allocation's "fault",
after all). We don't want to run the `migrate` machinery here either, as we
can't contact the down node. To the scheduler, this is effectively the same as
if we bumped the `group.count`.
* replacement for `disconnect.replace = true`: this is a replacement, but the
replacement is intended to be temporary, so we propagate the reschedule tracker.
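As a sketch (not Nomad's actual code, and with illustrative names) of which of
these behaviors propagate the reschedule tracker to the next allocation:

```go
package main

import "fmt"

type allocEvent int

const (
	eventTaskFailedInPlace allocEvent = iota // restart: retry tasks on the same node
	eventRestartsExhausted                   // reschedule: move to another node after failure
	eventNodeDrain                           // migrate: operator-initiated, not a failure
	eventNodeLost                            // replacement: node down, like bumping group.count
	eventDisconnectReplace                   // temporary replacement while node is disconnected
)

func propagatesRescheduleTracker(e allocEvent) bool {
	switch e {
	case eventRestartsExhausted:
		return true // a genuine failure, so keep counting attempts
	case eventDisconnectReplace:
		return true // the replacement is temporary, so carry the tracker along
	default:
		return false // in-place restarts, drains, and lost nodes don't count
	}
}

func main() {
	fmt.Println(propagatesRescheduleTracker(eventNodeDrain)) // false
}
```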
Add a section to the `reschedule`, `migrate`, and `disconnect` blocks explaining
when each item applies. Update the use of the word "reschedule" in several
places where "replacement" is correct, and vice-versa.
Fixes: https://github.com/hashicorp/nomad/issues/24918
Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
In #20165 we fixed a bug where a partially configured `client.template` retry
block would set any unset fields to nil instead of their default values. But
this patch introduced a regression in the default values, so we were now
defaulting to unlimited retries if the retry block was unset. Restore the
correct behavior and add better test coverage in both the config parsing and
the template configuration code.
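A sketch of the intended defaulting semantics, with hypothetical names (the
real values live in the client and consul-template runner configs):

```go
package main

import "fmt"

// RetryConfig mimics a config block whose fields are pointers so that
// "unset" can be distinguished from "explicitly zero".
type RetryConfig struct {
	Attempts *int // 0 means unlimited retries
}

const defaultRetryAttempts = 12 // hypothetical default for illustration

// finalize fills unset fields with defaults instead of leaving them nil, so
// an unset or partially-set retry block no longer implies unlimited retries.
func (r *RetryConfig) finalize() {
	if r.Attempts == nil {
		d := defaultRetryAttempts
		r.Attempts = &d
	}
}

func main() {
	var r RetryConfig // entirely unset block
	r.finalize()
	fmt.Println(*r.Attempts) // 12, not unlimited
}
```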
Ref: https://github.com/hashicorp/nomad/pull/20165
Ref: https://github.com/hashicorp/nomad/issues/23305#issuecomment-2643731565
* Adds Actions to job status command output
* Status documentation updated to show actions, and `formatJobActions` no longer cares about pipe delimiting
The `volume delete` command doesn't allow using a prefix for the volume ID for
either CSI or dynamic host volumes. Use a prefix search and wildcard namespace
as we do for other CLI commands.
Ref: https://hashicorp.atlassian.net/browse/NET-12057
If you create a volume via `volume create/register` and want to update it later,
you need to change the volume spec to add the ID that was returned. This isn't a
very nice UX, so let's add an `-id` argument that allows you to update existing
volumes that have that ID.
Ref: https://hashicorp.atlassian.net/browse/NET-12083
* Upgrade to using hashicorp/go-metrics@v0.5.4
This also requires bumping the dependencies for:
* memberlist
* serf
* raft
* raft-boltdb
* (and indirectly hashicorp/mdns due to the memberlist or serf update)
Unlike some other HashiCorp products, Nomad's root module is currently expected to be consumed by others. This means it needs to be treated more like our libraries and upgraded to hashicorp/go-metrics by utilizing its compat packages. This allows those importing the root module to control the metrics module used via build tags.
* quota spec: if `region_limit.storage.host_volumes` is set, do not require that `variables` also be set, and vice versa.
* subtract from quota usage on volume delete
* stub CE quota subtraction method
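A sketch of the relaxed validation, with illustrative types and field names:

```go
package main

import (
	"errors"
	"fmt"
)

// QuotaStorage mirrors the shape of region_limit.storage; the field names
// and units here are illustrative.
type QuotaStorage struct {
	HostVolumesMB int // region_limit.storage.host_volumes
	VariablesMB   int // region_limit.storage.variables
}

// validate no longer requires that both limits be set together; a zero
// value simply means no limit was configured for that resource.
func (s *QuotaStorage) validate() error {
	if s == nil {
		return nil
	}
	if s.HostVolumesMB < 0 || s.VariablesMB < 0 {
		return errors.New("storage limits must be non-negative")
	}
	return nil
}

func main() {
	fmt.Println((&QuotaStorage{HostVolumesMB: 1000}).validate()) // <nil>
}
```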
When a client restarts but can't restore a volume (e.g. the plugin is now
missing), it's removed from the node fingerprint. So we won't allow future
scheduling of the volume, but we were not updating the volume state field to
report this reasoning to operators. Make debugging easier and the state field
more meaningful by setting the value to "unavailable".
Also, remove the unused "deleted" field. We did not implement soft deletes and
aren't planning on it for Nomad 1.10.0.
Ref: https://hashicorp.atlassian.net/browse/NET-11551
When we implemented CSI, the types of the fields for access mode and attachment
mode on volume requests were defined with a prefix "CSI". This gets confusing
now that we have dynamic host volumes using the same fields. Fortunately the
original was a typedef on string, and the Go API in the `api` package just uses
strings directly, so we can change the name of the type without breaking
backwards compatibility for the msgpack wire format.
Update the names to `VolumeAccessMode` and `VolumeAttachmentMode`. Keep the
CSI- and DHV-specific constant names for the values of these fields (they
aren't currently 1:1), so that we can easily differentiate in a given bit of
code which values are valid.
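A sketch of why the rename is wire-compatible (the constant values shown are
illustrative):

```go
package main

import "fmt"

// VolumeAccessMode replaces the old CSI-prefixed type name. Because it is a
// typedef on string, msgpack encodes it exactly as it did before the rename.
type VolumeAccessMode string

// The CSI- and host-volume-specific constant names are kept so code can
// signal which values are valid in a given context.
const (
	CSIVolumeAccessModeMultiNodeReader   VolumeAccessMode = "multi-node-reader-only"
	HostVolumeAccessModeSingleNodeWriter VolumeAccessMode = "single-node-writer"
)

func main() {
	fmt.Println(CSIVolumeAccessModeMultiNodeReader, HostVolumeAccessModeSingleNodeWriter)
}
```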
Ref: https://github.com/hashicorp/nomad/pull/24881#discussion_r1920702890
The `volume init` command creates example volume specifications, but one of the
example values for `capability.access_mode` is invalid. Correct the example to
match the validation logic.
The Nomad agent used a log filter to ensure logs were written at the expected
level. Since the adoption of hclog this is no longer required, as hclog acts as
the gatekeeper and filter for logging. All log writers accept messages from
hclog, which has already done the filtering.
The agent syslog write handler was unable to handle JSON log lines correctly,
meaning that when using the JSON log format, all syslog entries showed at the
NOTICE level.
This change adds a new handler to the Nomad agent which can parse JSON log
lines and correctly determine the intended log level.
The change also removes the use of a filter from the default log
format handler. This is not needed as the logs are fed into the
syslog handler via hclog, which is responsible for level
filtering.
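A sketch of the level extraction, assuming hclog's JSON field names
(`@level`, `@message`):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseJSONLevel pulls the level out of an hclog-style JSON log line so a
// syslog handler can map it to the right priority instead of a flat NOTICE.
func parseJSONLevel(line []byte) (string, bool) {
	var entry struct {
		Level string `json:"@level"`
	}
	if err := json.Unmarshal(line, &entry); err != nil || entry.Level == "" {
		return "", false
	}
	return entry.Level, true
}

func main() {
	level, ok := parseJSONLevel([]byte(`{"@level":"debug","@message":"starting"}`))
	fmt.Println(level, ok) // debug true
}
```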
We can reduce the amount of volume specification configuration many users will
need by setting a default capability on a dynamic host volume if none is
set. The default capability will allow using the volume in read/write mode on
its node, with no further restrictions except those that might be set in the
jobspec.
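A sketch of the defaulting, with illustrative types and values:

```go
package main

import "fmt"

// HostVolumeCapability is an illustrative stand-in for the real struct.
type HostVolumeCapability struct {
	AttachmentMode string
	AccessMode     string
}

// withDefaultCapability fills in a read/write-on-its-node capability when
// the volume specification sets none (the values shown are assumptions).
func withDefaultCapability(caps []HostVolumeCapability) []HostVolumeCapability {
	if len(caps) > 0 {
		return caps
	}
	return []HostVolumeCapability{{
		AttachmentMode: "file-system",
		AccessMode:     "single-node-writer",
	}}
}

func main() {
	fmt.Println(withDefaultCapability(nil))
}
```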
In anticipation of having quotas for dynamic host volumes, we want the user
experience of the storage limits to feel integrated with the other resource
limits. This is currently prevented by reusing the `Resources` type instead of
having a specific type for `QuotaResources`.
Update the quota limit/usage types to use a `QuotaResources` that includes a new
storage resources quota block. The wire format for the two types is compatible
such that we can migrate the existing variables limit in the FSM.
Also fix improper parallelism in the quota init test: we change the working
directory to avoid file write conflicts, but this breaks when multiple tests
are executed in the same process.
Ref: https://github.com/hashicorp/nomad-enterprise/pull/2096
When we register a volume without a plugin, we need to send a client RPC so that
the node fingerprint can be updated. The registered volume also needs to be
written to client state so that we can restore the fingerprint after a restart.
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
Adds an additional check in the Keyring.Delete RPC to make sure we're not
trying to delete a key that's been used to encrypt a variable. It also adds a
`-force` flag for the CLI/API to sidestep that check.
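A sketch of the in-use check, with illustrative types:

```go
package main

import "fmt"

// Variable is an illustrative stand-in that records which key encrypted it.
type Variable struct {
	Path  string
	KeyID string
}

// checkKeyUnused refuses to delete a key that still encrypts a variable,
// unless force is set (mirroring the new -force flag).
func checkKeyUnused(vars []Variable, keyID string, force bool) error {
	if force {
		return nil
	}
	for _, v := range vars {
		if v.KeyID == keyID {
			return fmt.Errorf("key %s in use by variable %s", keyID, v.Path)
		}
	}
	return nil
}

func main() {
	vars := []Variable{{Path: "nomad/jobs/web", KeyID: "k1"}}
	fmt.Println(checkKeyUnused(vars, "k1", false)) // error
	fmt.Println(checkKeyUnused(vars, "k1", true))  // <nil>
}
```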
Adds a `-type` flag to the `volume init` command that generates an example
volume specification with only those fields relevant to dynamic host
volumes. This changeset also moves the string literals into uses of `go:embed`.
Ref: https://github.com/hashicorp/nomad/pull/24479
The List Volumes API was originally written for CSI but assumed we'd have future
volume types, dispatched on a query parameter. Dynamic host volumes use this,
but the resulting code has host volume concerns commingled with the CSI volumes
endpoint. Refactor this so that we have a top-level `GET /v1/volumes` route that's
shared between CSI and DHV, and have it dispatch to the appropriate handler in
the type-specific endpoints.
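A standalone sketch of the dispatch (the handler names and the empty-type
fallback to CSI are assumptions):

```go
package main

import (
	"fmt"
	"net/http"
)

func csiVolumesList(w http.ResponseWriter, r *http.Request)  { fmt.Fprintln(w, "csi volumes") }
func hostVolumesList(w http.ResponseWriter, r *http.Request) { fmt.Fprintln(w, "host volumes") }

func main() {
	// GET /v1/volumes is shared; the ?type= query parameter selects the
	// type-specific handler.
	http.HandleFunc("/v1/volumes", func(w http.ResponseWriter, r *http.Request) {
		switch r.URL.Query().Get("type") {
		case "host":
			hostVolumesList(w, r)
		case "csi", "":
			csiVolumesList(w, r)
		default:
			http.Error(w, "unknown volume type", http.StatusBadRequest)
		}
	})
	http.ListenAndServe(":8080", nil)
}
```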
Ref: https://github.com/hashicorp/nomad/pull/24479
When we add a Sentinel scope for dynamic host volumes, having a default `-scope`
value for `sentinel apply` risks accidentally adding policies for volumes to the
job scope. This would immediately prevent any job from being submitted. Forcing
the administrator to pass a `-scope` will prevent accidental misuse.
Ref: https://github.com/hashicorp/nomad-enterprise/pull/2087
Ref: https://github.com/hashicorp/nomad/pull/24479
The create/register volume RPCs support a policy override flag for
soft-mandatory Sentinel policies, but the CLI and Go API were missing support
for it.
Also add support for Sentinel warnings to the Go API and CLI.
Ref: https://github.com/hashicorp/nomad/pull/24479
In #24528 we added monitoring to the CLI for dynamic host volume creation. But
when the volume's namespace is set by the volume specification instead of the
`-namespace` flag, the API client doesn't have the right namespace and gets a
404 when setting up the monitoring. The specification always overrides the
`-namespace` flag, so use that when available for all subsequent API calls.
Ref: https://github.com/hashicorp/nomad/pull/24479
Adds dynamic host volumes to argument autocomplete for the `volume status` and
`volume delete` commands. Adds flag autocompletion for those commands plus
`volume create`.
Ref: https://github.com/hashicorp/nomad/pull/24479
Most Nomad upsert RPCs accept a single object with the notable exception of
CSI. But in CSI we don't actually expose this to users except through the Go
API. It deeply complicates how we present errors to users, especially once
Sentinel policy enforcement enters the mix.
Refactor the `HostVolume.Create` and `HostVolume.Register` RPCs to take a single
volume instead of a slice of volumes.
Add a stub function for Enterprise policy enforcement. This requires splitting
out placement from the `createVolume` function so that we can ensure we've
completed placement before trying to enforce policy.
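Illustratively (type and field names here are assumptions), the request shape
changes like so:

```go
package main

import "fmt"

type HostVolume struct{ ID, Name string }

// Before: the RPC took a slice, though callers only ever sent one volume.
type hostVolumeCreateRequestOld struct {
	Volumes []*HostVolume
}

// After: a single volume, matching most other Nomad upsert RPCs and
// simplifying how errors (and Sentinel enforcement) are reported.
type hostVolumeCreateRequestNew struct {
	Volume *HostVolume
}

func main() {
	fmt.Println(hostVolumeCreateRequestNew{Volume: &HostVolume{Name: "example"}})
}
```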
Ref: https://github.com/hashicorp/nomad/pull/24479
When making a request to create a dynamic host volume, users can pass a node
pool and constraints instead of a specific node ID.
This changeset implements the node scheduling logic by instantiating a node
pool filter and a constraint checker borrowed from the scheduler package. Because
host volumes with the same name can't land on the same host, we don't need to
support `distinct_hosts`/`distinct_property`; this would be challenging anyway
without building out a much larger node iteration mechanism to keep track of
usage across multiple hosts.
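A sketch of the placement filtering, with illustrative types (the real
constraint checker comes from the scheduler package and supports far more than
equality):

```go
package main

import "fmt"

// Node and Constraint are illustrative stand-ins for the real structs.
type Node struct {
	ID       string
	NodePool string
	Attrs    map[string]string
}

type Constraint struct {
	Attr, Value string // only "=" semantics are sketched here
}

// filterNodes keeps nodes in the requested pool that satisfy every
// constraint, mirroring the filter-then-check flow described above.
func filterNodes(nodes []*Node, pool string, constraints []Constraint) []*Node {
	var out []*Node
nodeLoop:
	for _, n := range nodes {
		if pool != "" && n.NodePool != pool {
			continue
		}
		for _, c := range constraints {
			if n.Attrs[c.Attr] != c.Value {
				continue nodeLoop
			}
		}
		out = append(out, n)
	}
	return out
}

func main() {
	nodes := []*Node{
		{ID: "a", NodePool: "default", Attrs: map[string]string{"os": "linux"}},
		{ID: "b", NodePool: "gpu", Attrs: map[string]string{"os": "linux"}},
	}
	fmt.Println(len(filterNodes(nodes, "default", nil))) // 1
	fmt.Println(len(filterNodes(nodes, "gpu", []Constraint{{Attr: "os", Value: "linux"}}))) // 1
}
```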
Ref: https://github.com/hashicorp/nomad/pull/24479