nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-03 00:45:43 +03:00

Author	SHA1	Message	Date
Tim Gross	a1ede9765c	docs: warn about UID overlap between workload and Envoy tproxy (#24291 ) When using transparent proxy mode with the `connect` block, the UID of the workload cannot be the same as the UID of the Envoy sidecar (currently 101 in the default Envoy container image). Fixes: https://github.com/hashicorp/nomad/issues/23508	2024-10-24 08:45:44 -04:00
R.B. Boyer	4e8f596311	docs: update broken consul acl token links (#24287 )	2024-10-23 13:34:21 -04:00
Tim Gross	10358cc911	docs: warn about Consul auth method locality (#24275 ) * docs: warn about Consul auth method locality The locality of Consul tokens we mint via Workload Identity is governed by the Consul auth method configuration. By default tokens are local to the Consul datacenter, which typically maps 1:1 with a Nomad region. Cluster administrators who need cross-datacenter tokens can get them by setting the locality to global, at the risk of placement problems if the primary DC isn't available. Ref: https://github.com/hashicorp/consul/issues/21863 Fixes: https://github.com/hashicorp/nomad/issues/23505	2024-10-23 11:44:03 -04:00
Tim Gross	7381f8419b	docs: clarify requirements for Consul token policies and TTLs (#24167 ) As of #24166, Nomad agents will use their own token to deregister services and checks from Consul. This returns the deregistration path to the pre-Workload Identity workflow. Expand the documentation to make clear why certain ACL policies are required for clients. Additionally, we did not explicitly call out that auth methods should not set an expiration on Consul tokens. Nomad does not have a facility to refresh these tokens if they expire. Even if Nomad could, there's no way to re-inject them into Envoy sidecars for Consul Service Mesh without recreating the task anyways, which is what happens today. Warn users that they should not set an expiration. Closes: https://github.com/hashicorp/nomad/issues/20185 (wontfix) Ref: https://hashicorp.atlassian.net/browse/NET-10262	2024-10-11 11:59:21 -04:00
Piotr Kazmierczak	c1362c03df	docs: minimal Consul policy for Nomad agents needs node:write (#23800 )	2024-08-13 17:53:21 +02:00
Aimee Ukasick	021692eccf	docs: refactor CNI plugin content (#23707 ) - Pulled common content from multiple pages into new partials - Refactored install/index to be OS-based so I could add linux-distro-based instructions to install-consul-cni-plugins.mdx partial. The tab groups on the install/index page do match and change focus as expected. - Moved CNI overview-type content to networking/index - Refactored networking/cni to include install CNI plugins and configuration content (from install/index). - Moved CNI plugins explanation in bridge mode configuration section into bullet points. They had been #### headings, which aren't rendered in the R page TOC. I tried to simplify and format the bullet point content to be easier to scan. Ref: https://hashicorp.atlassian.net/browse/CE-661 Fixes: https://github.com/hashicorp/nomad/issues/23229 Fixes: https://github.com/hashicorp/nomad/issues/23583	2024-08-06 14:47:46 -04:00
Tim Gross	d5ca07a247	docs: notices of upcoming deprecations and backports (#23683 ) Add a section to the docs describing planned upcoming deprecations and removals. Also added some missing upgrade guide sections missed during the last release.	2024-07-25 10:20:18 -04:00
Tim Gross	2f4353412d	keyring: support prepublishing keys (#23577 ) When a root key is rotated, the servers immediately start signing Workload Identities with the new active key. But workloads may be using those WI tokens to sign into external services, which may not have had time to fetch the new public key and which might try to fetch new keys as needed. Add support for prepublishing keys. Prepublished keys will be visible in the JWKS endpoint but will not be used for signing or encryption until their `PublishTime`. Update the periodic key rotation to prepublish keys at half the `root_key_rotation_threshold` window, and promote prepublished keys to active after the `PublishTime`. This changeset also fixes two bugs in periodic root key rotation and garbage collection, both of which can't be safely fixed without implementing prepublishing: * Periodic root key rotation would never happen because the default `root_key_rotation_threshold` of 720h exceeds the 72h maximum window of the FSM time table. We now compare the `CreateTime` against the wall clock time instead of the time table. (We expect to remove the time table in future work, ref https://github.com/hashicorp/nomad/issues/16359) * Root key garbage collection could GC keys that were used to sign identities. We now wait until `root_key_rotation_threshold` + `root_key_gc_threshold` before GC'ing a key. * When rekeying a root key, the core job did not mark the key as inactive after the rekey was complete. Ref: https://hashicorp.atlassian.net/browse/NET-10398 Ref: https://hashicorp.atlassian.net/browse/NET-10280 Fixes: https://github.com/hashicorp/nomad/issues/19669 Fixes: https://github.com/hashicorp/nomad/issues/23528 Fixes: https://github.com/hashicorp/nomad/issues/19368	2024-07-19 13:29:41 -04:00
Tim Gross	df67e74615	Consul: add preflight checks for Envoy bootstrap (#23381 ) Nomad creates Consul ACL tokens and service registrations to support Consul service mesh workloads, before bootstrapping the Envoy proxy. Nomad always talks to the local Consul agent and never directly to the Consul servers. But the local Consul agent talks to the Consul servers in stale consistency mode to reduce load on the servers. This can result in the Nomad client making the Envoy bootstrap request with tokens or services that have not yet replicated to the follower that the local client is connected to. This request gets a 404 on the ACL token and that negative entry gets cached, preventing any retries from succeeding. To work around this, we'll use a method described by our friends over on `consul-k8s` where after creating the objects in Consul we try to read them from the local agent in stale consistency mode (which prevents a failed read from being cached). This cannot completely eliminate this source of error because it's possible that Consul cluster replication is unhealthy at the time we need it, but this should make Envoy bootstrap significantly more robust. This changeset adds preflight checks for the objects we create in Consul: * We add a preflight check for ACL tokens after we login via Workload Identity and in the function we use to derive tokens in the legacy workflow. We do this check early because we also want to use this token for registering group services in the allocrunner hooks. * We add a preflight check for services right before we bootstrap Envoy in the taskrunner hook, so that we have time for our service client to batch updates to the local Consul agent in addition to the local agent sync. We've made the timeouts configurable via node metadata rather than the usual static configuration because for most cases, users should not need to touch or even know these values are configurable; the configuration is mostly available for testing. Fixes: https://github.com/hashicorp/nomad/issues/9307 Fixes: https://github.com/hashicorp/nomad/issues/10451 Fixes: https://github.com/hashicorp/nomad/issues/20516 Ref: https://github.com/hashicorp/consul-k8s/pull/887 Ref: https://hashicorp.atlassian.net/browse/NET-10051 Ref: https://hashicorp.atlassian.net/browse/NET-9273 Follow-up: https://hashicorp.atlassian.net/browse/NET-10138	2024-06-27 10:15:37 -04:00
David Yu	92af6280e3	Update service-mesh.mdx	2024-06-13 20:09:53 -07:00
David Yu	94bb91ab80	docs - release notes updates (#23312 ) Also updated Consul compatibility matrix	2024-06-13 13:46:42 -04:00
Michael Schurter	a3b1810bdb	doc: specify ca cert needs to be shared (#20620 ) Specify that the Vault JWT auth method must be configured to trust Nomad's CA certificate when mTLS is enabled.	2024-05-17 14:49:48 -07:00
Tim Gross	1739f94e84	docs: fix a broken link on the Consul index page (#20387 )	2024-04-12 15:31:48 -04:00
Tim Gross	43281f6038	docs: provide guidance on using Consul DNS (#20369 ) Add a standalone section to the Consul integration docs showing how to configure both the Consul agent and the workload to take advantage of Consul DNS. Include a reference to the new transparent proxy feature as well. Fixes: https://github.com/hashicorp/nomad/issues/18305	2024-04-12 14:38:04 -04:00
Tim Gross	9340c77b12	docs: remove extra indents in tproxy HCL examples	2024-04-10 10:21:32 -04:00
Tim Gross	e2e561da88	tproxy: documentation improvements	2024-04-10 08:55:50 -04:00
Tim Gross	bb062deadc	docs: update service mesh integration docs for transparent proxy (#20251 ) Update the service mesh integration docs to explain how Consul needs to be configured for transparent proxy. Update the walkthrough to assume that `transparent_proxy` mode is the best approach, and move the manually-configured `upstreams` to a separate section for users who don't want to use Consul DNS. Ref: https://github.com/hashicorp/nomad/pull/20175 Ref: https://github.com/hashicorp/nomad/pull/20241	2024-04-04 17:01:07 -04:00
Tim Gross	9c2286014f	docs: update Consul compatibility matrix (#20242 ) Versions of Nomad and Consul that were known not to be compatible are no longer supported in general. Update the compatibility matrix for Consul to match.	2024-03-27 16:11:14 -04:00
Luiz Aoqui	e1e80f383e	vault: add new `nomad setup vault -check` command (#19720 ) The new `nomad setup vault -check` command can be used to retrieve information about the changes required before a cluster is migrated from the deprecated legacy authentication flow with Vault to use only workload identities.	2024-01-12 15:48:30 -05:00
Luiz Aoqui	b2aa6ffd05	docs: fix Consul ACL requirements (#19721 ) Even with the new workload identity based flow, the Nomad servers still need the `acl = "write"` permission in order to revoke service identity tokens.	2024-01-11 15:52:23 -05:00
Tim Gross	0935f443dc	vault: support allowing tokens to expire without refresh (#19691 ) Some users with batch workloads or short-lived prestart tasks want to derive a Vault token, use it, and then allow it to expire without requiring a constant refresh. Add the `vault.allow_token_expiration` field, which works only with the Workload Identity workflow and not the legacy workflow. When set to true, this disables the client's renewal loop in the `vault_hook`. When Vault revokes the token lease, the token will no longer be valid. The client will also now automatically detect if the Vault auth configuration does not allow renewals and will disable the renewal loop automatically. Note this should only be used when a secret is requested from Vault once at the start of a task or in a short-lived prestart task. Long-running tasks should never set `allow_token_expiration=true` if they obtain Vault secrets via `template` blocks, as the Vault token will expire and the template runner will continue to make failing requests to Vault until the `vault_retry` attempts are exhausted. Fixes: https://github.com/hashicorp/nomad/issues/8690	2024-01-10 14:49:02 -05:00
Luiz Aoqui	5267eec3ad	vault: fix token revocation during workflow migration (#19689 ) When transitioning from the legacy token-based workflow to the new JWT workflow for Vault, the previous code would instantiate a no-op Vault client if the server configuration had a `default_identity` block. This no-op client returned an error when some of its operations were called, such as `LookupToken` and `RevokeTokens`. The original intention was that, in the new JWT workflow, none of these methods should be called, so returning an error could help surface potential bugs. But the `RevokeTokens` and `MarkForRevocation` methods _are_ called even in the JWT flow. When a leadership transition happens, the new server looks for unused Vault accessors from state and tries to revoke them. Similarly, the `RevokeTokens` method is called every time the `Node.UpdateStatus` and `Node.UpdateAlloc` RPCs are made by clients, as the Nomad server tries to find unused Vault tokens for the node/alloc. Since the new JWT flow does not require Nomad servers to contact Vault, calling `RevokeTokens` and `MarkForRevocation` is not able to complete without a Vault token, so this commit changes the logic to use the no-op Vault client when no token is configured. It also updates the client itself to not error if these methods are called, but to rather just log so operators can be made aware that there are Vault tokens created by Nomad that have not been force-expired. When migrating an existing cluster to the new workload identity based flow, Nomad operators must first upgrade the Nomad version without removing any of the existing Vault configuration. Removing it can prevent Nomad servers from managing and cleaning up existing Vault tokens during a leadership transition and node or alloc updates. Operators must also resubmit all jobs with a `vault` block so they are updated with an `identity` for Vault. Skipping this step may cause allocations to fail if their Vault token expires (if, for example, the Nomad client stops running for TTL/2) or if they are rescheduled, since the new client will try to follow the legacy flow which will fail if the Nomad server configuration for Vault has already been updated to remove the Vault address and token.	2024-01-10 13:28:46 -05:00
Luiz Aoqui	a8d1447550	docs: update Consul and Vault integration (#19424 )	2023-12-14 15:14:55 -05:00
Tim Gross	37df614da6	docs: fix recommended binding rules for Consul integration (#19299 ) Fixes some errors in the documentation for the Consul integration, based on tests locally without using the `nomad setup consul` command and updating the docs to match. * Consul CE doesn't support the `-namespace-rule-bind-namespace` option. * The binding rule for services should not include the Nomad namespace in the `bind-name` parameter (the service is registered in the appropriate Consul namespace). * The role for tasks should include the suffix "-tasks" in the name to match the binding rule we create. * Fix the Consul bound audiences to be a list of strings. * Fix some quoting issues in the commands.	2023-12-04 11:56:03 -05:00
Piotr Kazmierczak	e57dcdf106	docs: adjust claim mappings for Consul auth method (#19244 )	2023-11-30 20:01:18 +01:00
Piotr Kazmierczak	d699b82df6	docs: update consul-integration to include ns changes (#19239 ) Co-authored-by: Tim Gross <tgross@hashicorp.com>	2023-11-30 16:37:48 +01:00
Piotr Kazmierczak	26b778bb0c	docs: correction to Consul integration TLS note (#19207 )	2023-11-28 19:22:02 +01:00
Piotr Kazmierczak	248b2ba5cd	WI: use single auth method for Consul by default (#19169 ) This simplifies the default setup of Nomad workloads WI-based authentication for Consul by using a single auth method with 2 binding rules. Users can still specify separate auth methods for services and tasks.	2023-11-28 12:22:27 +01:00
Piotr Kazmierczak	3b701ee0cf	docs: additional note about JWKS endpoints and CA certs (#19144 )	2023-11-27 17:34:44 +01:00
Tim Gross	2bff6d2a6a	docs: fix `token_period` in example Vault role for WI (#18939 ) Vault tokens requested for WI are "periodic" Vault tokens (ones that get periodically renewed). The field we should be setting for the renewal window is `token_period`.	2023-10-31 16:33:03 -04:00
Michael Schurter	9afc70ef5a	Fix Vault docs to use HCL instead of JSON (#18938 )	2023-10-31 13:25:20 -07:00
Tim Gross	47f2118f40	docs: Vault Workload Identity integration (#18704 ) Documentation updates to support the new Vault integration with Nomad Workload Identity. Included: * Added a large section to the Vault integration docs to explain how to set up auth methods, roles, and policies (by hand, assuming we don't ship a `nomad setup-vault` tool for now), and how to safely migrate from the existing workflow to the new one. * Shuffled around some of the existing text so that the legacy authentication method text is in its own section. * Added a compatibility matrix to the Vault integration page.	2023-10-26 10:33:52 -04:00
Piotr Kazmierczak	7f62dec473	consul WI: rename default auth method for services (#18867 ) It should be called nomad-services instead of nomad-workloads.	2023-10-26 09:43:33 +02:00
Kerim Satirli	5e1bbf90fc	docs: update all URLs to `developer.hashicorp.com` (#16247 )	2023-10-24 11:00:11 -04:00
Tim Gross	8a311255a2	docs: Consul Workload Identity integration (#18685 ) Documentation updates to support the new Consul integration with Nomad Workload Identity. Included: * Added a large section to the Consul integration docs to explain how to set up auth methods and binding rules (by hand, assuming we don't ship a `nomad setup-consul` tool for now), and how to safely migrate from the existing workflow to the new one. * Move `consul` block out of `group` and onto its own page now that we have it available at the `task` scope, and expanded examples of its use. * Added the `service_identity` and `task_identity` blocks to the Nomad agent configuration, and provided a recommended default. * Added the `identity` block to the `service` block page. * Added a rough compatibility matrix to the Consul integration page.	2023-10-23 09:17:22 -04:00
Jose Merchan	20f6ec75ef	Update consul-connect.mdx (#18575 ) The hyperlink points to a non-existent URL. I suggest changing it to this one (https://developer.hashicorp.com/consul/docs/install/ports), which at least lists port 8503 (gRPC TLS).	2023-09-26 10:04:54 +01:00
Iwan Aucamp	f122d291d2	docs: fix a sentence in vault-integration.mdx (#18296 )	2023-08-23 11:24:23 +01:00
Seth Hoenig	a7a2a3ce56	docs: move CNI reference plugins installation to CNI overview page (#17068 ) * docs: move CNI reference plugins installation to CNI overview page This PR moves the instruction steps for installing the CNI reference plugins from the Consul Mesh integration page to the general Networking CNI page. These plugins are required for bridge networking, not just Consul Mesh, so it makes sense to have them on the general CNI page. Closes #17038 * docs: fix a link to post install steps	2023-05-04 11:32:06 -05:00
Daniel Bennett	fad28e4265	docs: how to troubleshoot consul connect envoy (#15908 ) * largely a doc-ification of this commit message: `d47678074b` this doesn't spell out all the possible failure modes, but should be a good starting point for folks. * connect: add doc link to envoy bootstrap error * add Unwrap() to RecoverableError mainly for easier testing	2023-02-02 14:20:26 -06:00
Daniel Bennett	9f583f57f5	Change `job init` default to `example.nomad.hcl` and recommend in docs (#15997 ) recommend .nomad.hcl for job files instead of .nomad (without .hcl) * nomad job init -> example.nomad.hcl * update docs	2023-02-02 11:47:47 -06:00
Piotr Kazmierczak	949a6f60c7	renamed stanza to block for consistency with other projects (#15941 )	2023-01-30 15:48:43 +01:00
Ashlee M Boyer	3444ece549	docs: Migrate link formats (#15779 ) * Adding check-legacy-links-format workflow * Adding test-link-rewrites workflow * chore: updates link checker workflow hash * Migrating links to new format Co-authored-by: Kendall Strautman <kendallstrautman@gmail.com>	2023-01-25 09:31:14 -08:00
Seth Hoenig	c3017da6af	consul: add client configuration for grpc_ca_file (#15701 ) * [no ci] first pass at plumbing grpc_ca_file * consul: add support for grpc_ca_file for tls grpc connections in consul 1.14+ This PR adds client config to Nomad for specifying consul.grpc_ca_file These changes combined with https://github.com/hashicorp/consul/pull/15913 should finally enable Nomad users to upgrade to Consul 1.14+ and use tls grpc connections. * consul: add cl entry for grpc_ca_file * docs: mention grpc_tls changes due to Consul 1.14	2023-01-11 09:34:28 -06:00
James Rasell	847c2cc528	client: accommodate Consul 1.14.0 gRPC and agent self changes. (#15309 ) * client: accommodate Consul 1.14.0 gRPC and agent self changes. Consul 1.14.0 changed the way in which gRPC listeners are configured, particularly when using TLS. Prior to the change, a single listener was responsible for handling plain-text and encrypted gRPC requests. In 1.14.0 and beyond, separate listeners will be used for each, defaulting to 8502 and 8503 for plain-text and TLS respectively. The change means that Nomad’s Consul Connect integration would not work when integrated with Consul clusters using TLS and running 1.14.0 or greater. The Nomad Consul fingerprinter identifies the gRPC port Consul has exposed using the "DebugConfig.GRPCPort" value from Consul’s “/v1/agent/self” endpoint. In Consul 1.14.0 and greater, this only represents the plain-text gRPC port which is likely to be disabled in clusters running TLS. In order to fix this issue, Nomad now takes into account the Consul version and configured scheme to optionally use the “DebugConfig.GRPCTLSPort” value from Consul’s agent self return. The “consul_grpc_socket” allocrunner hook has also been updated so that the fingerprinted gRPC port attribute is passed in. This provides a better fallback method when the operator does not configure the “consul.grpc_address” option. * docs: modify Consul Connect entries to detail 1.14.0 changes. * changelog: add entry for #15309 * fixup: tidy tests and clean version match from review feedback. * fixup: use strings tolower func.	2022-11-21 09:19:09 -06:00
Bryce Kalow	f49b3a95dd	website: fixes redirected links (#14918 )	2022-10-18 10:31:52 -05:00
Bryce Kalow	67d39725b1	website: content updates for developer (#14473 ) Co-authored-by: Geoffrey Grosenbach <26+topfunky@users.noreply.github.com> Co-authored-by: Anthony <russo555@gmail.com> Co-authored-by: Ashlee Boyer <ashlee.boyer@hashicorp.com> Co-authored-by: Ashlee M Boyer <43934258+ashleemboyer@users.noreply.github.com> Co-authored-by: HashiBot <62622282+hashibot-web@users.noreply.github.com> Co-authored-by: Kevin Wang <kwangsan@gmail.com>	2022-09-16 10:38:39 -05:00
Michelle Noorali	b9e084a4b7	doc: explain permissions for Vault sys/capabilities-self	2022-07-06 10:01:30 -04:00
Derek Strickland	e78a5908b9	docker: update images to reference hashicorpdev Docker organization (#12903 ) docker: update images to reference hashicorpdev dockerhub organization generate job_init.bindata_assetfs.go Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2022-06-08 15:06:00 -04:00
Anthony	5b80907a5d	docs: added note about vault -period flag (#13185 )	2022-05-31 14:26:03 -07:00
Luiz Aoqui	0abe5a6c79	vault: revert support for entity aliases (#12723 ) After a more detailed analysis of this feature, the approach taken in PR #12449 was found to be not ideal due to poor UX (users are responsible for setting the entity alias they would like to use) and issues around jobs potentially masquerading as another Vault entity.	2022-04-22 10:46:34 -04:00

63 Commits