nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-02 00:15:43 +03:00

Author	SHA1	Message	Date
Jorge Marey	5f78940911	Allow setting a token name template on auth methods (#19135 ) Co-authored-by: James Rasell <jrasell@hashicorp.com>	2023-11-28 12:26:21 +00:00
codenoid	557b4942d0	api: fix panic in Allocation.Stub() when Job is nil (#19115 )	2023-11-17 08:55:46 -05:00
Seth Hoenig	3ba364e42f	deps: update some dependencies (#19002 ) * deps: update shoenig/test to 1.7.0 * deps: update go-set/v2 to v2.1.0 * deps: update shoenig/go-landlock to v1.2.0	2023-11-07 07:34:40 -06:00
Michael Schurter	e49ca3c431	identity: Implement `change_mode` (#18943 ) * identity: support change_mode and change_signal wip - just jobspec portion * test struct * cleanup some insignificant boogs * actually implement change mode * docs tweaks * add changelog * test identity.change_mode operations * use more words in changelog * job endpoint tests * address comments from code review --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2023-11-01 09:41:11 -05:00
Luiz Aoqui	d7edbd44b7	api: handle redirect during websocket upgrade (#18903 ) When attempting a WebSocket connection upgrade the client may receive a redirect request from the server, in which case the request should be reattempted using the new address present in the `Location` header.	2023-10-31 17:12:11 -04:00
Luiz Aoqui	3ddf1ecf1d	actions: minor bug fixes and improvements (#18904 )	2023-10-31 17:06:02 -04:00
Michael Schurter	66fbc0f67e	identity: default to RS256 for new workload ids (#18882 ) OIDC mandates the support of the RS256 signing algorithm so in order to maximize workload identity's usefulness this change switches from using the EdDSA signing algorithm to RS256. Old keys will continue to use EdDSA but new keys will use RS256. The EdDSA generation code was left in place because it's fast and cheap and I'm not going to lie I hope we get to use it again. Test Updates Most of our Variables and Keyring tests had a subtle assumption in them that the keyring would be initialized by the time the test server had elected a leader. ed25519 key generation is so fast that the fact that it was happening asynchronously with server startup didn't seem to cause problems. Sadly rsa key generation is so slow that basically all of these tests failed. I added a new `testutil.WaitForKeyring` helper to replace `testutil.WaitForLeader` in cases where the keyring must be initialized before the test may continue. However this is mostly used in the `nomad/` package. In the `api` and `command/agent` packages I decided to switch their helpers to wait for keyring initialization by default. This will slow down tests a bit, but allow those packages to not be as concerned with subtle server readiness details. On my machine rsa key generation takes 63ms, so hopefully the difference isn't significant on CI runners. TODO - Docs and changelog entries. - Upgrades - right now upgrades won't get RS256 keys until their root key rotates either manually or after ~30 days. - Observability - I'm not sure there's a way for operators to see if they're using EdDSA or RS256 unless they inspect a key. The JWKS endpoint can be inspected to see if EdDSA will be used for new identities, but it doesn't technically define which key is active. If upgrades can be fixed to automatically rotate keys, we probably don't need to worry about this. 
Requiem for ed25519 When workload identities were first implemented we did not immediately consider OIDC compliance. Consul, Vault, and many other third parties support JWT auth methods without full OIDC compliance. For the machine<-->machine use cases workload identity is intended to fulfill, OIDC seemed like a bigger risk than asset. EdDSA/ed25519 is the signing algorithm we chose for workload identity JWTs because of all these lovely properties: 1. Deterministic keys that can be derived from our preexisting root keys. This was perhaps the biggest factor since we already had a root encryption key around from which we could derive a signing key. 2. Wonderfully compact: 64 byte private key, 32 byte public key, 64 byte signatures. Just glorious. 3. No parameters. No choices of encodings. It's all well-defined by [RFC 8032](https://datatracker.ietf.org/doc/html/rfc8032). 4. Fastest performing signing algorithm! We don't even care that much about the performance of our chosen algorithm, but what a free bonus! 5. Arguably one of the most secure signing algorithms widely available. Not just from a cryptanalysis perspective, but from an API and usage perspective too. Life was good with ed25519, but sadly it could not last. [IDPs](https://en.wikipedia.org/wiki/Identity_provider), such as AWS's IAM OIDC Provider, love OIDC. They have OIDC implemented for humans, so why not reuse that OIDC support for machines as well? Since OIDC mandates RS256, many implementations don't bother implementing other signing algorithms (or at least not advertising their support). 
A quick survey of OIDC Discovery endpoints revealed only 2 out of 10 OIDC providers advertised support for anything other than RS256: - [PayPal](https://www.paypalobjects.com/.well-known/openid-configuration) supports HS256 - [Yahoo](https://api.login.yahoo.com/.well-known/openid-configuration) supports ES256 RS256 only: - [GitHub](https://token.actions.githubusercontent.com/.well-known/openid-configuration) - [GitLab](https://gitlab.com/.well-known/openid-configuration) - [Google](https://accounts.google.com/.well-known/openid-configuration) - [Intuit](https://developer.api.intuit.com/.well-known/openid_configuration) - [Microsoft](https://login.microsoftonline.com/fabrikamb2c.onmicrosoft.com/v2.0/.well-known/openid-configuration) - [SalesForce](https://login.salesforce.com/.well-known/openid-configuration) - [SimpleLogin (acquired by ProtonMail)](https://app.simplelogin.io/.well-known/openid-configuration/) - [TFC](https://app.terraform.io/.well-known/openid-configuration)	2023-10-31 11:25:20 -07:00
Phil Renaud	8902afe651	Nomad Actions (#18794 ) * Scaffolding actions (#18639) * Task-level actions for job submissions and retrieval * FIXME: Temporary workaround to get ember dev server to pass exec through to 4646 * Update api/tasks.go Co-authored-by: Tim Gross <tgross@hashicorp.com> * Update command/agent/job_endpoint.go Co-authored-by: Tim Gross <tgross@hashicorp.com> * Diff and copy implementations * Action structs get their own file, diff updates to behave like our other diffs * Test to observe actions changes in a version update * Tests migrated into structs/diff_test and modified with PR comments in mind * APIActionToSTructsAction now returns a new value * de-comment some plain parts, remove unused action lookup * unused param in action converter --------- Co-authored-by: Tim Gross <tgross@hashicorp.com> * New endpoint: job/:id/actions (#18690) * unused param in action converter * backing out of parse_job level and moved toward new endpoint level * Adds taskName and taskGroupName to actions at job level * Unmodified job mock actions tests * actionless job test * actionless job test * Multi group multi task actions test * HTTP method check for GET, cleaner errors in job_endpoint_test * decomment * Actions aggregated at job model level (#18733) * Removal of temporary fix to proxy to 4646 * Run Action websocket endpoint (#18760) * Working demo for review purposes * removal of cors passthru for websockets * Remove job_endpoint-specific ws handlers and aimed at existing alloc exec handlers instead * PR comments addressed, no need for taskGroup pass, better group and task lookups from alloc * early return in action validate and removed jobid from req args per PR comments * todo removal, we're checking later in the rpc * boolean style change on tty * Action CLI command (#18778) * Action command init and stuck-notes * Conditional reqpath to aim at Job action endpoint * De-logged * General CLI command cleanup, observe namespace, pass action as string, get random alloc w group adherence * tab and varname cleanup * Remove action param from Allocations().Exec calls * changelog * dont nil-check acl --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2023-10-20 13:05:55 -04:00
Seth Hoenig	83720740f5	core: plumbing to support numa aware scheduling (#18681 ) * core: plumbing to support numa aware scheduling * core: apply node resources compatibility upon fsm restore Handle the case where an upgraded server dequeues an evaluation before a client triggers a new fingerprint - which would be needed to cause the compatibility fix to run. By running the compat fix on restore the server will immediately have the compatible pseudo topology to use. * lint: learn how to spell pseudo	2023-10-19 15:09:30 -05:00
modrake	51ffe4208e	workaround and fixes for MPL and copywrite bot (#18775 )	2023-10-17 08:02:13 +01:00
Tim Gross	cbd7248248	auth: use `ACLsDisabledACL` when ACLs are disabled (#18754 ) The RPC handlers expect to see `nil` ACL objects whenever ACLs are disabled. By using `nil` as a sentinel value, we have the risk of nil pointer exceptions and improper handling of `nil` when returned from our various auth methods that can lead to privilege escalation bugs. This is the final patch in a series to eliminate the use of `nil` ACLs as a sentinel value for when ACLs are disabled. This patch adds a new virtual ACL policy field for when ACLs are disabled and updates our authentication logic to use it. Included: * Extends auth package tests to demonstrate that nil ACLs are treated as failed auth and disabled ACLs succeed auth. * Adds a new `AllowDebug` ACL check for the weird special casing we have for pprof debugging when ACLs are disabled. * Removes the remaining unexported methods (and repeated tests) from the `nomad/acl.go` file. * Update the semgrep rules to detect improper nil ACL checking and remove the old invalid ACL checks. * Update the contributing guide for RPC authentication. Ref: https://github.com/hashicorp/nomad-enterprise/pull/1218 Ref: https://github.com/hashicorp/nomad/pull/18703 Ref: https://github.com/hashicorp/nomad/pull/18715 Ref: https://github.com/hashicorp/nomad/pull/16799 Ref: https://github.com/hashicorp/nomad/pull/18730 Ref: https://github.com/hashicorp/nomad/pull/18744	2023-10-16 09:30:24 -04:00
Tim Gross	b39632fa6f	testing: fix configuration for retry tests (#18731 ) The retry tests in the `api` package set up a client but don't use `NewClient`, so the address never gets parsed into a `url.URL` and that's causing some test failures.	2023-10-11 14:06:31 -04:00
Charlie Voiselle	7266d267b0	Add unix domain socket support to API (#16872 ) - Expose internal HTTP client's Do() via Raw - Use URL parser to identify scheme - Align more with curl output - Add changelog - Fix test failure; add tests for socket envvars - Apply review feedback for tests - Consolidate address parsing - Address feedback from code reviews Co-authored-by: Tim Gross <tgross@hashicorp.com>	2023-10-11 11:04:12 -04:00
Charlie Voiselle	8a93ff3d2d	[server] Directed leadership transfer CLI and API (#17383 ) * Add directed leadership transfer func * Add leadership transfer RPC endpoint * Add ACL tests for leadership-transfer endpoint * Add HTTP API route and implementation * Add to Go API client * Implement CLI command * Add documentation * Add changelog Co-authored-by: Tim Gross <tgross@hashicorp.com>	2023-10-04 12:20:27 -04:00
Tim Gross	aaee3076c2	consul: allow `consul` block in task scope (#18597 ) To support Workload Identity with Consul for templates, we want templates to be able to use the WI created at the task scope (either implicitly or set by the user). But to allow different tasks within a group to be assigned to different clusters as we're doing for Vault, we need to be able to set the `consul` block with its `cluster` field at the task level to override the group.	2023-09-29 15:03:48 -04:00
Juana De La Cuesta	72acaf6623	[17449] Introduces a locking mechanism over variables (#18207 ) It includes the work over the state store, the RPC server, the HTTP server, the Go API package and the CLI's command. To read more on the actual functionality, refer to the RFCs [NMD-178] Locking with Nomad Variables and [NMD-179] Leader election using locking mechanism for the Autoscaler.	2023-09-21 17:56:33 +02:00
Gerard Nguyen	1339599185	cli: Add prune flag for nomad server force-leave command (#18463 ) This feature helps operators remove a failed/left node from the Serf layer immediately, without waiting 24 hours for the node to be reaped * Update CLI with prune flag * Update API /v1/agent/force-leave with prune query string parameter * Update CLI and API doc * Add unit test	2023-09-15 08:45:11 -04:00
Pavel Aminov	5ddada2973	Adding node_pool to job key validation (#18366 )	2023-09-13 11:52:04 -03:00
James Rasell	d923fc554d	consul/connect: add new fields to Consul Connect upstream block (#18430 ) Co-authored-by: Horacio Monsalvo <horacio.monsalvo@southworks.com>	2023-09-11 16:02:52 +01:00
Michael Schurter	ef24e40b39	identity: support jwt expiration and rotation (#18262 ) Implements expirations and renewals for alternate workload identity tokens.	2023-09-08 14:50:34 -07:00
Tim Gross	3ee6c31241	ACLs: allow/deny/default config for Consul/Vault clusters by namespace (#18425 ) In Nomad Enterprise when multiple Vault/Consul clusters are configured, cluster admins can control access to clusters for jobs via namespace ACLs, similar to how we've done so for node pools. This changeset updates the ACL configuration structs, but doesn't wire them up.	2023-09-08 11:37:20 -04:00
Tim Gross	7cdd592809	jobspec: support `cluster` field for Vault block (#18408 ) This field supports the upcoming ENT-only multiple Vault clusters feature. The job validation and mutation hooks will come in a separate PR. Ref: https://github.com/hashicorp/team-nomad/issues/404	2023-09-07 10:15:28 -04:00
Tim Gross	7863d7bcbb	jobspec: support `cluster` field for Consul and Service blocks (#18409 ) This field supports the upcoming ENT-only multiple Consul clusters feature. The job validation and mutation hooks will come in a separate PR. Ref: https://github.com/hashicorp/team-nomad/issues/404	2023-09-07 09:48:49 -04:00
Dao Thanh Tung	6ba600cbf1	Add unit test for api/deployments.go (#18380 )	2023-09-05 07:44:54 +01:00
Luiz Aoqui	7466496608	config: fix identity config for Consul service (#18363 ) Rename the agent configuration for workload identity to `WorkloadIdentityConfig` to make its use more explicit and remove the `ServiceName` field since it is never expected to be defined in a configuration file. Also update the job mutation to inject a service identity following these rules: 1. Don't inject identity if `consul.use_identity` is false. 2. Don't inject identity if `consul.service_identity` is not specified. 3. Don't inject identity if service provider is not `consul`. 4. Set name and service name if the service specifies an identity. 5. Inject `consul.service_identity` if service does not specify an identity.	2023-08-31 11:22:48 -03:00
Seth Hoenig	f5b0da1d55	all: swap exp packages for maps, slices (#18311 )	2023-08-23 15:42:13 -05:00
Андрей Неустроев	3e61b3a37d	Add multiple times in periodic jobs (#17858 )	2023-08-22 15:42:31 -04:00
Lance Haig	0b9cf4e7b7	Deprecate the Original Bootstrap Token Code (#17792 )	2023-08-22 08:06:15 +01:00
Piotr Kazmierczak	9fa39eb829	jobspec: add nomad_service field and identity block (#18239 ) This PR introduces updates to the jobspec required for workload identity support for services. --------- Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2023-08-21 20:07:47 +02:00
Luiz Aoqui	196213c451	jobspec: add `role` to `vault` (#18257 )	2023-08-18 15:29:02 -04:00
Michael Schurter	0e22fc1a0b	identity: add support for multiple identities + audiences (#18123 ) Allows for multiple `identity{}` blocks for tasks along with user-specified audiences. This is a building block to allow workload identities to be used with Consul, Vault and 3rd party JWT based auth methods. Expiration is still unimplemented and is necessary for JWTs to be used securely, so that's up next. --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2023-08-15 09:11:53 -07:00
dependabot[bot]	3c7a44daea	build(deps): bump github.com/shoenig/test from 0.6.6 to 0.6.7 in /api (#18191 ) Bumps [github.com/shoenig/test](https://github.com/shoenig/test) from 0.6.6 to 0.6.7. - [Release notes](https://github.com/shoenig/test/releases) - [Commits](https://github.com/shoenig/test/compare/v0.6.6...v0.6.7) --- updated-dependencies: - dependency-name: github.com/shoenig/test dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-08-14 09:21:29 -05:00
hashicorp-copywrite[bot]	89e24d7405	Adding explicit MPL license for sub-package This directory and its subdirectories (packages) contain files licensed with the MPLv2 `LICENSE` file in this directory and are intentionally licensed separately from the BSL `LICENSE` file at the root of this repository.	2023-08-10 17:27:01 -05:00
Charlie Voiselle	585b0533c0	[dep] bump golang.org/x/exp (#18102 ) There are some refactorings that have to be made in the getter and state where the api changed in `slices` * Bump golang.org/x/exp * Bump golang.org/x/exp in api * Update job_endpoint_test * [feedback] unexport sort function	2023-08-01 11:50:17 -04:00
Gerard Nguyen	9e98d694a6	feature: Add new field render_templates on restart block (#18054 ) This feature is necessary when a user wants to explicitly re-render all templates on task restart. E.g. to fetch all new secrets from Vault, even if the lease on the existing secrets has not expired.	2023-07-28 11:53:32 -07:00
dependabot[bot]	e8683e3f49	build(deps): bump github.com/hashicorp/cronexpr in /api (#17787 )	2023-07-10 11:23:00 +01:00
deverton-godaddy	e75ae1de96	[api] Add NetworkStatus to allocation response (#17280 ) Service discovery or mesh network systems consuming the Nomad event stream or API need to know the CNI assigned IP for the allocation. This data is returned by the underlying Nomad API but isn't mapped in the response struct.	2023-07-04 19:35:38 -04:00
sejalapeno	05c84d64d2	Update allocations.go (#17726 ) * Update allocations.go updated missing client status "unknown" #17688 * changelog * Update .changelog/17726.txt adding relevant desc. Co-authored-by: Seth Hoenig <shoenig@duck.com> --------- Co-authored-by: Seth Hoenig <shoenig@duck.com>	2023-06-26 13:33:29 -05:00
Luiz Aoqui	276c69bffd	api: prevent panic on job plan (#17689 ) Check for a nil job ID to prevent a panic when calling Jobs().Plan().	2023-06-23 16:20:52 -04:00
grembo	6f04b91912	Add `disable_file` parameter to job's `vault` stanza (#13343 ) This complements the `env` parameter, so that the operator can author tasks that don't share their Vault token with the workload when using `image` filesystem isolation. As a result, more powerful tokens can be used in a job definition, allowing it to use template stanzas to issue all kinds of secrets (database secrets, Vault tokens with very specific policies, etc.), without sharing that issuing power with the task itself. This is accomplished by creating a directory called `private` within the task's working directory, which shares many properties of the `secrets` directory (tmpfs where possible, not accessible by `nomad alloc fs` or Nomad's web UI), but isn't mounted into/bound to the container. If the `disable_file` parameter is set to `false` (its default), the Vault token is also written to the NOMAD_SECRETS_DIR, so the default behavior is backwards compatible. Even if the operator never changes the default, they will still benefit from the improved behavior of Nomad never reading the token back in from that - potentially altered - location.	2023-06-23 15:15:04 -04:00
Luiz Aoqui	6c64847e1b	np: scheduler configuration updates (#17575 ) * jobspec: rename node pool scheduler_configuration In HCL specifications we usually call configuration blocks `config` instead of `configuration`. * np: add memory oversubscription config * np: make scheduler config ENT	2023-06-19 11:41:46 -04:00
Luiz Aoqui	4f7c38b2a7	node pools: namespace integration (#17562 ) Add structs and fields to support the Nomad Pools Governance Enterprise feature of controlling node pool access via namespaces. Nomad Enterprise allows users to specify a default node pool to be used by jobs that don't specify one. In order to accomplish this, it's necessary to distinguish between a job that explicitly uses the `default` node pool and one that did not specify any. If the `default` node pool is set during job canonicalization it's impossible to do this, so this commit allows a job to have an empty node pool value during registration but sets to `default` at the admission controller mutator. In order to guarantee state consistency the state store validates that the job node pool is set and exists before inserting it.	2023-06-16 16:30:22 -04:00
Tim Gross	9a6078a2ae	node pools: implement support in scheduler (#17443 ) Implement scheduler support for node pool: * When a scheduler is invoked, we get a set of the ready nodes in the DCs that are allowed for that job. Extend the filter to include the node pool. * Ensure that changes to a job's node pool are picked up as destructive allocation updates. * Add `NodesInPool` as a metric to all reporting done by the scheduler. * Add the node-in-pool filter to the `Node.Register` RPC so that we don't generate spurious evals for nodes in the wrong pool.	2023-06-07 10:39:03 -04:00
Luiz Aoqui	354d741c95	node pool: implement `nomad node pool nodes` CLI (#17444 )	2023-06-07 10:37:27 -04:00
Tim Gross	385dbfb8d1	node pools: implement HTTP API to list jobs in pool (#17431 ) Implements the HTTP API associated with the `NodePool.ListJobs` RPC, including the `api` package for the public API and documentation. Update the `NodePool.ListJobs` RPC to fix the missing handling of the special "all" pool.	2023-06-06 11:40:13 -04:00
Luiz Aoqui	637ddf516e	node pools: add event stream support (#17412 )	2023-06-06 10:14:47 -04:00
Luiz Aoqui	81f0b359dd	node pools: register a node in a node pool (#17405 )	2023-06-02 17:50:50 -04:00
Luiz Aoqui	c09ca1e765	node pools: implement CLI (#17388 )	2023-06-02 15:49:57 -04:00
Samantha	7ef1905333	check: Add support for Consul field tls_server_name (#17334 )	2023-06-02 10:19:12 -04:00
Tim Gross	2d059bbf22	node pools: add `node_pool` field to job spec (#17379 ) This changeset only adds the `node_pool` field to the jobspec, and ensures that it gets picked up correctly as a change. Without the rest of the implementation landed yet, the field will be ignored.	2023-06-01 16:08:55 -04:00

1216 Commits