nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-01 16:05:42 +03:00

Author	SHA1	Message	Date
James Rasell	df16c96a9f	cli: use same offset when following single or multiple alloc logs. (#18604 )	2023-10-03 08:43:14 +01:00
Piotr Kazmierczak	3d62438876	consul: consul taskrunner hook should only write tokens that belong to its task (#18635 ) Ref hashicorp/team-nomad#404	2023-10-02 19:49:02 +02:00
Piotr Kazmierczak	62a0768775	consul: make service and task identity names unique (#18634 ) Ref: hashicorp/team-nomad#404	2023-10-02 19:48:34 +02:00
Kevin Wang	e7b70adc2c	cli: improve `job` and `status` text (#18628 )	2023-10-02 10:31:57 -04:00
dependabot[bot]	ccafb94645	chore(deps): bump github.com/cyphar/filepath-securejoin (#18545 )	2023-10-02 08:25:35 +01:00
Luiz Aoqui	7267be719f	config: apply defaults to extra Consul and Vault (#18623 ) * config: apply defaults to extra Consul and Vault Apply the expected default values when loading additional Consul and Vault cluster configuration. Without these defaults some fields would be left empty. * config: retain pointer of multi Consul and Vault When calling `Copy()` the pointer reference from the `"default"` key of the `Consuls` and `Vaults` maps to the `Consul` and `Vault` field of `Config` was being lost. * test: ensure TestAgent has the right reference to the default Consul config	2023-09-29 17:15:20 -03:00
Michael Schurter	3f9bd17687	client: prevent watching stale alloc state (#18612 ) When waiting on a previous alloc we must query against the leader before switching to a stale query with index set. Also check to ensure the response is fresh before using it like #18269	2023-09-29 12:46:28 -07:00
Tim Gross	aaee3076c2	consul: allow `consul` block in task scope (#18597 ) To support Workload Identity with Consul for templates, we want templates to be able to use the WI created at the task scope (either implicitly or set by the user). But to allow different tasks within a group to be assigned to different clusters as we're doing for Vault, we need to be able to set the `consul` block with its `cluster` field at the task level to override the group.	2023-09-29 15:03:48 -04:00
Phil Renaud	8da40465af	fallback to get definition if submission 404s when restarting job in ui (#18621 )	2023-09-29 14:52:21 -04:00
Phil Renaud	badaecea66	Access Control CRUD: Make name fields for Policies and Roles required (#18605 )	2023-09-29 12:33:03 -04:00
Piotr Kazmierczak	5dab41881b	client: new consul_hook (#18557 ) This PR introduces a new allocrunner-level consul_hook which iterates over services and tasks, if their provider is consul, fetches consul tokens for all of them, stores them in AllocHookResources and in task secret dirs. Ref: hashicorp/team-nomad#404 --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2023-09-29 17:41:48 +02:00
Piotr Kazmierczak	0a75a42d94	WI: WIDMgr should expose default identity signatures (#18610 ) Since the identity_hook is meant to be the central place that makes signed identities available to other hooks, it should also expose the default identity that is signed by the plan applier. Ref: hashicorp/team-nomad#404	2023-09-29 15:17:59 +02:00
James Rasell	b44cef0e66	docs: make upgrade version detail clearer. (#18608 ) Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2023-09-29 08:31:14 +01:00
Luiz Aoqui	a4b29a29cb	vault: add `jwt_backend_path` agent config (#18606 ) Add agent configuration to allow cluster operators to define the path where the JWT auth method backend is mounted.	2023-09-28 18:02:30 -03:00
Tim Gross	5001bf4547	consul: use constant instead of "default" literal (#18611 ) Use the constant `structs.ConsulDefaultCluster` instead of the string literal "default", as we've done for Vault.	2023-09-28 16:50:21 -04:00
Juana De La Cuesta	e0407b4cf2	server: Add reporting configuration to nomad server (#18609 ) * func: add reporting config to server * func: add reporting manager for ce * func: add reporting to quick tests	2023-09-28 22:00:43 +02:00
Michael Schurter	e73026dd4c	client: prevent using stale allocs (#18601 ) Similar to #18269, it is possible that even if Node.GetClientAllocs retrieves fresh allocs that the subsequent Alloc.GetAllocs call retrieves stale allocs. While `diffAlloc(existing, updated)` properly ignores stale alloc updates, alloc deletions have no such check. So if a client retrieves an alloc created at index 123, and then a subsequent Alloc.GetAllocs call hits a new server which returns results at index 100, the client will stop the alloc created at 123 because it will be missing from the stale response. This change applies the same logic as #18269 and ensures only fresh responses are used. Glossary: * fresh - modified at an index > the query index * stale - modified at an index <= the query index	2023-09-28 11:42:57 -07:00
Phil Renaud	859087640a	[ui] Simplify times in task events (#18595 ) * Regexy time simplification in task events * Oops, dont assume these are all task restart messages * Update mirage to provide displayMessage instead of message * Have a few acceptance tests look for .displayMessage instead of .message for equality now	2023-09-27 17:01:34 -04:00
Luiz Aoqui	fed1992cea	vault: remove `use_identity` agent config (#18592 ) The initial intention behind the `vault.use_identity` configuration was to indicate to Nomad servers that they would need to sign a workload identities for allocs with a `vault` block. But in order to support identity renewal, #18262 and #18431 moved the token signing logic to the alloc runner since a new token needs to be signed prior to the TTL expiring. So #18343 implemented `use_identity` as a flag to indicate that the workload identity JWT flow should be used when deriving Vault tokens for tasks. But this configuration value is set on servers so it is not available to clients at the time of token derivation, making its meaning not clear: a job may end up using the identity-based flow even when `use_identity` is `false`. The only reliable signal available to clients at token derivation time is the presence of an `identity` block for Vault, and this is already configured with the `vault.default_identity` configuration block, making `vault.use_identity` redundant. This commit removes the `vault.use_identity` configuration and simplifies the logic on when an implicit Vault identity is injected into tasks.	2023-09-27 17:44:07 -03:00
Luiz Aoqui	868aba57bb	vault: update identity name to start with `vault_` (#18591 ) * vault: update identity name to start with `vault_` In the original proposal, workload identities used to derive Vault tokens were expected to be called just `vault`. But in order to support multiple Vault clusters it is necessary to associate identities with specific Vault cluster configuration. This commit implements a new proposal to have Vault identities named as `vault_<cluster>`.	2023-09-27 15:53:28 -03:00
Phil Renaud	ef7bccbd40	[ui] ACL Roles in the UI, plus Role, Policy and Token management (#17770 ) * Rename pages to include roles * Models and adapters * [ui] Any policy checks in the UI now check for roles' policies as well as token policies (#18346) * combinedPolicies as a concept * Classic decorator on role adapter * We added a new request for roles, so the test based on a specific order of requests got fickle fast * Mirage roles cluster scaffolded * Acceptance test for roles and policies on the login page * Update mirage mock for nodes fetch to account for role policies / empty token.policies * Roles-derived policies checks * [ui] Access Control with Roles and Tokens (#18413) * top level policies routes moved into access control * A few more routes and name cleanup * Delog and test fixes to account for new url prefix and document titles * Overview page * Tokens and Roles routes * Tokens helios table * Add a role * Hacky role page and deletion * New policy keyboard shortcut and roles breadcrumb nav * If you leave New Role but havent made any changes, remove the newly-created record from store * Roles index list and general role route crud * Roles index actually links to roles now * Helios button styles for new roles and policies * Handle when you try to create a new role without having any policies * Token editing generally * Create Token functionality * Cant delete self-token but management token editing and deleting is fine * Upgrading helios caused codemirror to explode, shimmed * Policies table fix * without bang-element condition, modifier would refire over and over * Token TTL or Time setting * time will take you on * Mirage hooks for create and list roles * Ensure policy names only use allow characters in mirage mocks * Mirage mocked roles and policies in the default cluster * log and lintfix * chromedriver to 2.1.2 * unused unit tests removed * Nice profile dropdown * With the HDS accordion, rename our internal component scss ref * design revisions after discussion * Tooltip on deleted-policy tokens * Two-step button peripheral isDeleting gcode removed * Never to null on token save * copywrite headers added and empty routefiles removed * acceptance test fixes for policies endpoint * Route for updating a token * Policies testfixes * Ember on-click-outside modifier upgraded with general ember-modifier upgrade * Test adjustments to account for new profile header dropdown * Test adjustments for tokens via policy pages * Removed an unused route * Access Control index page tests * a11y tests * Tokens index acceptance tests generally * Lintfix * Token edit page tests * Token editing tests * New token expiration tests * Roles Index tests * Role editing policies tests * A complete set of Access Control Roles tests * Policies test * Be more specific about which row to check for expiration time * Nil check on expirationTime equality * Management tokens shouldnt show No Roles/Policies, give them their own designation * Route guard on selftoken, conditional columns, and afterModel at parent to prevent orphaned policies on tokens/roles from stopping a new save * Policy unloading on delete and other todos plus autofocus conditionally re-enabled * Invalid policies non-links now a concept for Roles index * HDS style links to make job.variables.alert links look like links again * Mirage finding looks weird so making model async in hash even though redundant * Drop rsvp * RSVP wasnt the problem, cached lookups were * remove old todo comments * de-log	2023-09-27 14:53:09 -04:00
Luiz Aoqui	19241964a4	config: fix some issues with workload identity and multi Consul and Vault (#18590 ) * config: fix multi consul and vault config parse Capture the loop variable when parsing multiple Consul and Vault configuration blocks so the duration parse function uses the correct field when it's called later on. * client: build Vault client with right config When setting up the multiple Vault clients, the code was always loading the default configuration, resulting in all clients to be configured the same way. * config: fix WorkloadIdentityConfig.Copy() method Ensure `WorkloadIdentityConfig.Copy()` does not return the original pointer for the `TTL` field.	2023-09-27 14:41:11 -03:00
Juana De La Cuesta	124272c050	server: Add reporting option to agent (#18572 ) * func: add reporting option to agent * func: add test for merge and fix comments * Update config_ce.go * Update config_ce.go * Update config_ce.go * fix: add reporting config to default configuration and update to use must over require * Update command/agent/config_parse.go Co-authored-by: Luiz Aoqui <luiz@hashicorp.com> * Update nomad/structs/config/reporting.go Co-authored-by: Luiz Aoqui <luiz@hashicorp.com> * Update nomad/structs/config/reporting.go Co-authored-by: Luiz Aoqui <luiz@hashicorp.com> * style: rename license and reporting config * fix: use default function instead of empty struct --------- Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2023-09-27 00:11:32 +02:00
Daniel Bennett	9b74e11f06	csi: fix volume updating behavior (#18588 ) fix for part of: `c6dbba7cde` which allowed updating volumes while in use, because CSI expand may occur while in use. but it mistakenly stopped copying other important values that may be updated whether in use or not. this moves some of the in-use validation to happen during Merge(), before writing to state, leaving UpsertCSIVolume with only minimal final sanity-checking.	2023-09-26 14:58:17 -05:00
Daniel Bennett	fab968a748	csi: document volume expansion (#18573 ) and show Capacity in `volume status` command.	2023-09-26 14:49:15 -05:00
Tim Gross	02a5aab359	consul: provide workload's Consul token to service client (#18559 ) This is a work-in-progress changeset to provide workload-specific Consul tokens that are created by the `consul_hook` and attached to workload registration requests by the `group_service_hook` and `service_hook`. This requires unreleased updates to Consul's `api` package, so this changeset includes a temporary `replace` directive in the go.mod file.	2023-09-26 14:13:29 -04:00
Tim Gross	8b8a9f6497	WI: add test to verify we don't allow empty signatures for JWT (#18586 ) Encoded JWTs include an `alg` header key that tells the verifier which signature algorithm to use. Bafflingly, the JWT standard allows a value of `"none"` here which bypasses signature verification. In all shipped versions of Nomad, we explicitly configure verification to a specific algorithm and ignore the header value entirely to avoid this protocol flaw. But in #18123 we updated our JWT library to `go-jose`, which rightfully doesn't support `"none"` but this detail isn't encoded anywhere in our code base. Add a test that ensures we catch any regressions in the library.	2023-09-26 14:09:57 -04:00
Jose Merchan	20f6ec75ef	Update consul-connect.mdx (#18575 ) The hyperlink points to a non-existing URL. I suggest change it for this one (https://developer.hashicorp.com/consul/docs/install/ports) which at least listed the port 8503 (grpc tls)	2023-09-26 10:04:54 +01:00
Tim Gross	20eadc7b29	config: move Consul getter out of fingerprinter (#18556 )	2023-09-22 10:58:39 -04:00
Daniel Bennett	7bd5c6e84e	test: Refactor mock CSI manager (#18554 ) and MockCSIManager to support the call counting that csi_hook_test expects instead of implementing csimanager interfaces in two separate places: * client/allocrunner/csi_hook_test * client/csi_endpoint_test they can both use the same mocks defined in client/pluginmanager/csimanager/ alongside the actual implementations of them. also refactor TestCSINode_DetachVolume to use use it like Node_ExpandVolume so we can also test the happy path there	2023-09-21 16:03:53 -05:00
Charlie Voiselle	70fc8df787	[sentinel] Add existing job to enforceSubmitJob (#18553 ) * Add existing job to enforceSubmitJob (CE) * Add changelog	2023-09-21 14:12:51 -04:00
Juana De La Cuesta	72acaf6623	[17449] Introduces a locking mechanism over variables (#18207 ) It includes the work over the state store, the PRC server, the HTTP server, the go API package and the CLI's command. To read more on the actuall functionality, refer to the RFCs [NMD-178] Locking with Nomad Variables and [NMD-179] Leader election using locking mechanism for the Autoscaler.	2023-09-21 17:56:33 +02:00
Piotr Kazmierczak	86d2cdcf80	client: split identity_hook across allocrunner and taskrunner (#18431 ) This commit splits identity_hook between the allocrunner and taskrunner. The allocrunner-level part of the hook signs each task identity, and the taskrunner-level part picks it up and stores secrets for each task. The code revamps the WIDMgr, which is now split into 2 interfaces: IdentityManager which manages renewals of signatures and handles sending updates to subscribers via Watch method, and IdentitySigner which only does the signing. This work is necessary for having a unified Consul login workflow that comes with the new Consul integration. A new, allocrunner-level consul_hook will now be the only hook doing Consul authentication.	2023-09-21 17:31:27 +02:00
Phil Renaud	cf8dde0850	[ui] Color indicators for server/client status (#18318 ) * Color the status cell for servers and nodes * Testfix and changelog * Leader indicator moved post-word * Icon and badge treatment * Capitalizing test checks * HDS badges dont expose statusClass like we used to, so stop checking for it	2023-09-20 17:05:04 -04:00
Tim Gross	d7bd47d60f	config: remove `consul.template_identity` in lieu of `task_identity` (#18540 ) The original thinking for Workload Identity integration with Consul and Vault was that we'd allow `template` blocks to specify their own identity. But because the login to Consul/Vault to get tokens happens at the task level, this would involve making the `template` block a new WID watcher on its own rather than using the Consul and Vault hooks we're building at the group/task level. So it doesn't make sense to have separate identities for individual `template` blocks rather than at the level of tasks. Update the agent configuration to rename the `template_identity` to the more accurate `task_identity`, which will be used for any non-service hooks (just `template` today). Update the implicit identities job mutation hook to create the identity we'll need as well.	2023-09-20 15:43:08 -04:00
Tim Gross	fdc6c2151d	vault: select Vault API client by cluster name (#18533 ) Nomad Enterprise will support configuring multiple Vault clients. Instead of having a single Vault client field in the Nomad client, we'll have a function that callers can parameterize by the Vault cluster name that returns the correctly configured Vault API client wrapper.	2023-09-19 14:35:01 -04:00
Tim Gross	fcb9c4a39c	job endpoint: implicit constraints for multi-Vault/Consul (#18528 ) Update the implicit constraint mutating hook to support multiple Vault and Consul clusters in Nomad Enterprise. This requires moving the Vault/Consul mutating hooks earlier in the list as well, because that'll ensure we've canonicalized properly for multiple clusters.	2023-09-19 12:19:44 -04:00
Daniel Bennett	4895d708b4	csi: implement NodeExpandVolume (#18522 ) following ControllerExpandVolume in `c6dbba7cde`, which expands the disk at e.g. a cloud vendor, the controller plugin may say that we also need to issue NodeExpandVolume for the node plugin to make the new disk space available to task(s) that have claims on the volume by e.g. expanding the filesystem on the node. csi spec: https://github.com/container-storage-interface/spec/blob/c918b7f/spec.md#nodeexpandvolume	2023-09-18 10:30:15 -05:00
dependabot[bot]	d564d7811b	chore(website/content): update content-conformance version (#17482 )	2023-09-18 11:08:51 -04:00
Seth Hoenig	591394fb62	drivers: plumb hardware topology via grpc into drivers (#18504 ) * drivers: plumb hardware topology via grpc into drivers This PR swaps out the temporary use of detecting system hardware manually in each driver for using the Client's detected topology by plumbing the data over gRPC. This ensures that Client configuration is taken to account consistently in all references to system topology. * cr: use enum instead of bool for core grade * cr: fix test slit tables to be possible	2023-09-18 08:58:07 -05:00
Tim Gross	b105e41265	job endpoint: reorder check for disabled job registrations (#18523 ) When job registrations are disabled, there's no reason to do the potentially expensive job mutation and admission hooks. Move the ACL resolution and this check before those hooks.	2023-09-18 09:15:02 -04:00
Tim Gross	5bd8b89c19	helper: reduce size of buffer used by template connections (#18524 ) In #12458 we added an in-memory connection buffer so that template runners that want access to the Nomad API for Service Registration and Variables can communicate with Nomad without having to create a real HTTP client. The size of this buffer (1 MiB) was taken directly from its usage in Vault, and each connection makes 2 such buffers (send and receive). Because each template runner has its own connection, when there are large numbers of allocations this adds up to significant memory usage. The largest Nomad Variable payload is 64KiB, and a small amount of metadata. Service Registration responses are much smaller, and we don't include check results in them (as Consul does), so the size is relatively bounded. We should be able to safely reduce the size of the buffer by a factor of 10 or more without forcing the template runner to make multiple read calls over the buffer. Fixes: #18508	2023-09-18 09:12:09 -04:00
Tim Gross	ad4436ffff	job endpoint hooks to enforce access to vault/consul clusters (CE) (#18521 ) In Nomad Enterprise, namespace rules can control access to Vault and Consul clusters. Add job endpoint mutating and validating hooks for both Vault and Consul so that ENT can enforce these namespace rules. This changeset includes the stub behaviors for CE. Ref: https://github.com/hashicorp/nomad-enterprise/pull/1234	2023-09-15 13:58:37 -04:00
Shantanu Gadgil	f37f84182d	docs: example of multiple crons (#18511 )	2023-09-15 10:10:56 -04:00
Gerard Nguyen	1339599185	cli: Add prune flag for nomad server force-leave command (#18463 ) This feature will help operator to remove a failed/left node from Serf layer immediately without waiting for 24 hours for the node to be reaped * Update CLI with prune flag * Update API /v1/agent/force-leave with prune query string parameter * Update CLI and API doc * Add unit test	2023-09-15 08:45:11 -04:00
Shantanu Gadgil	d2dd64f2c4	point to hashicorp's cronexpr (#18510 ) point to hashicorp's cronexpr	2023-09-15 09:23:58 +01:00
Luiz Aoqui	5f951d506a	docs: update Vault config for workload identity (#18503 ) Update documentation for the agent configuration `vault` block for workload identity support.	2023-09-14 19:38:36 -03:00
Daniel Bennett	c6dbba7cde	csi: implement ControllerExpandVolume (#18359 ) the first half of volume expansion, this allows a user to update requested capacity ("capacity_min" and "capacity_max") in a volume specification file, and re-issue either Register or Create volume commands (or api calls). the requested capacity will now be "reconciled" with the current real capacity of the volume, issuing a ControllerExpandVolume RPC call to a running controller plugin, if requested "capacity_min" is higher than the current capacity on the volume in state. csi spec: https://github.com/container-storage-interface/spec/blob/c918b7f/spec.md#controllerexpandvolume note: this does not yet cover NodeExpandVolume	2023-09-14 14:13:04 -05:00
wrli20	0329393a28	docs: fix link to alicloud autoscaler plugin (#18495 )	2023-09-14 09:23:58 -04:00
stswidwinski	bd519dcbf4	Fix for https://github.com/hashicorp/nomad/issues/18493 (#18494 ) Co-authored-by: James Rasell <jrasell@users.noreply.github.com>	2023-09-14 13:35:15 +01:00

1 2 3 4 5 ...

25110 Commits