Commit Graph

143 Commits

Author SHA1 Message Date
Seth Hoenig
8ae1a0e356 docs: add docs around dynamic workload users (#20477) 2024-04-23 07:57:40 -05:00
astudentofblake
7b7ed12326 func: Allow custom paths to be added the the getter landlock (#20349)
* func: Allow custom paths to be added the the getter landlock

Fixes: 20315

* fix: slices imports
fix: more meaningful examples
fix: improve documentation
fix: quote error output
2024-04-11 15:17:33 -05:00
Tim Gross
8298d39e78 Connect transparent proxy support
Add support for Consul Connect transparent proxies

Fixes: https://github.com/hashicorp/nomad/issues/10628
2024-04-10 11:00:18 -04:00
Tim Gross
e2e561da88 tproxy: documentation improvements 2024-04-10 08:55:50 -04:00
Tim Gross
8eaf176868 client: fix IPv6 parsing for client.servers block (#20324)
When the `client.servers` block is parsed, we split the port from the
address. This does not correctly handle IPv6 addresses when they are in URL
format (wrapped in brackets), which we require to disambiguate the port and
address.

Fix the parser to correctly split out the port and handle a missing port value
for IPv6. Update the documentation to make the URL format requirement clear.

Fixes: https://github.com/hashicorp/nomad/issues/20310
2024-04-08 15:06:27 -04:00
James Rasell
fd5a42a6ca docs: clarify data dir default parameters and default creation. (#20268) 2024-04-04 09:20:47 +01:00
James Rasell
facc3e8013 agent: allow configuration of in-memory telemetry sink. (#20166)
This change adds configuration options for setting the in-memory
telemetry sink collection and retention durations. This sink backs
the metrics JSON API and previously had hard-coded default values.

The new options are particularly useful when running development or
debug environments, where metrics collection is desired at a fast
and granular rate.
2024-03-25 15:00:18 +00:00
Tim Gross
d3ddb0aa49 docs: make it clear that federation features require ACLs (#20196)
Our documentation has a hidden assumption that users know that federation
replication requires ACLs to be enabled and bootstrapped. Add notes at some of
the places users are likely to look for it.

A separate follow-up PR to the federation tutorial should point to the ACL
multi-region tutorial as well.

Fixes: https://github.com/hashicorp/nomad/issues/20128
2024-03-22 15:15:00 -04:00
Juana De La Cuesta
56bf253474 Add docs for disconnected block (#20147)
Expand the job settings to include the disconnect block and set as deprecated the fields that will be replaced by it.
2024-03-20 10:08:16 +01:00
Tim Gross
dc39c20e66 docs: make recommendation for collection interval vs scrape interval (#20056)
Metrics tools that "pull" metrics, such as Prometheus, have a configurable
interval for how frequently they scrape metrics. This should be greater or equal
to the Nomad `telemetry.collection_interval` to avoid re-scraping metrics that
cannot have been updated in that interval.

Fixes: https://github.com/hashicorp/nomad/issues/20055
2024-03-19 08:56:29 -04:00
Tim Gross
695bb7ffcf docs: improve wording around autoconfiguration via Consul (#20139)
Fixes: https://github.com/hashicorp/nomad/issues/20132
2024-03-15 08:44:58 -04:00
Tim Gross
c1b5850473 docs: add warning not to enable Consul tls.grpc.verify_incoming (#19970)
Consul does not support incoming TLS verification of Envoy. This failure results
in hard-to-understand errors like `SSLV3_ALERT_BAD_CERTIFICATE` in the Envoy
allocation logs. Leave a warning about this to users.

Closes: https://github.com/hashicorp/nomad/issues/19772
Closes: https://github.com/hashicorp/nomad/issues/16854
Ref: https://github.com/hashicorp/consul/issues/13088
2024-02-14 08:56:35 -05:00
Seth Hoenig
37c497628c docs: describe cloud environments in fingerprint denylist (#19952)
This PR changes the example of the client config option "fingerprint.denylist"
to include all the cloud environment fingerprinters. Each one contains a
2 second HTTP timeout to a metadata endpoint that does not exist if you are not
in that particular cloud. When run in serial on startup, this results in
an 8 second wait where nothing useful is happening.

Closes #16727
2024-02-12 09:57:29 -06:00
Tom Davies
5a11a28cac docs: updates link to Consul WLI migration docs (#19748) 2024-01-17 09:57:02 -05:00
Tim Gross
e7ca2b51ad vault: ignore allow_unauthenticated config if identity is set (#19585)
When the server's `vault` block has a default identity, we don't check the
user's Vault token (and in fact, we warn them on job submit if they've provided
one). But the validation hook still checks for a token if
`allow_unauthenticated` is set to true. This is a misconfiguration but there's
no reason for Nomad not to do the expected thing here.

Fixes: https://github.com/hashicorp/nomad/issues/19565
2024-01-02 16:46:34 -05:00
Mike Nomitch
31f4296826 Adds support for failures before warning to Consul service checks (#19336)
Adds support for failures before warning and failures before critical
to the automatically created Nomad client and server services in Consul
2023-12-14 11:33:31 -08:00
James Rasell
94b8b7769a docs: add reporting config block documentation. (#19470) 2023-12-14 15:11:29 +00:00
Charlie Voiselle
d2fc7cc0c4 [docs] Note reboot to update bridge_network_hairpin_mode (#19304) 2023-12-12 19:49:15 -05:00
Luiz Aoqui
99d72b7154 docs: fix placement of Consul auth method configs (#19404)
The auth method names are used by Nomad clients, not servers.
2023-12-11 09:16:57 -05:00
Tim Gross
1e51379e56 docs: clarify behavior and recommendations for mTLS vs TLS for HTTP (#19282)
Some of our documentation on `tls` configuration could be more clear as to
whether we're referring to mTLS or TLS. Also, when ACLs are enabled it's fine to
have `verify_https_client=false` (the default). Make it clear that this is an
acceptably secure configuration and that it's in fact recommended in order to
avoid pain of distributing client certs to user browsers.
2023-12-04 15:03:43 -05:00
James Rasell
d041ddc4ee docs: fix up HCL formatting on agent config examples. (#19254) 2023-12-04 08:44:00 +00:00
Luiz Aoqui
125dd4af38 docs: small updates to agent consul (#19285) 2023-12-01 16:40:06 -05:00
Seth Hoenig
b83c1e14c1 docs: fix documentation of client.reserved.cores (#19266) 2023-12-01 13:06:55 -06:00
Tim Gross
2ba459c73a docs: split consul config params into client vs server sections (#19258)
Some sections of the `consul` configuration are relevant only for clients or
servers. We updated our Vault docs to split these parameters out into their own
sections for clarity. Match that for the Consul docs.
2023-12-01 13:37:39 -05:00
Piotr Kazmierczak
248b2ba5cd WI: use single auth method for Consul by default (#19169)
This simplifies the default setup of Nomad workloads WI-based
authentication for Consul by using a single auth method with 2 binding rules.

Users can still specify separate auth methods for services and tasks.
2023-11-28 12:22:27 +01:00
Luiz Aoqui
5ff6cce3ab vault: update default JWT auth method path (#19188)
Update default auth method path to be `jwt-nomad` to avoid potential
conflicts when Vault's `jwt` default is already being used for something
else.
2023-11-27 17:48:12 -05:00
Adriano Caloiaro
f66eb83fc0 Add go-netaddrs support to retry_join (#18745) 2023-11-15 10:07:18 -05:00
James Rasell
4ec27a97d1 docs: clarify ACL agent config TTL params apply to auth methods. (#18949) 2023-11-01 13:45:13 +00:00
Tim Gross
9463d7f88a docs: add note about consul.service_identity ignoring fields (#18900)
The WI we get for Consul services is saved to the client state DB like all other
WIs, but the resulting JWT is never exposed to the task secrets directory
because (a) it's only intended for use with Consul service configuration,
and (b) for group services it could be ambiguous which task to expose it to.

Add a note to the `consul.service_identity` docs that these fields are ignored.
2023-10-30 09:19:15 -04:00
Piotr Kazmierczak
7f62dec473 consul WI: rename default auth method for services (#18867)
It should be called nomad-services instead of nomad-workloads.
2023-10-26 09:43:33 +02:00
Tim Gross
8a311255a2 docs: Consul Workload Identity integration (#18685)
Documentation updates to support the new Consul integration with Nomad Workload
Identity. Included:

* Added a large section to the Consul integration docs to explain how to set up
  auth methods and binding rules (by hand, assuming we don't ship a `nomad
  setup-consul` tool for now), and how to safely migrate from the existing
  workflow to the new one.
* Move `consul` block out of `group` and onto its own page now that we have it
  available at the `task` scope, and expanded examples of its use.
* Added the `service_identity` and `task_identity` blocks to the Nomad agent
  configuration, and provided a recommended default.
* Added the `identity` block to the `service` block page.
* Added a rough compatibility matrix to the Consul integration page.
2023-10-23 09:17:22 -04:00
Michael Schurter
a806363f6d OpenID Configuration Discovery Endpoint (#18691)
Added the [OIDC Discovery](https://openid.net/specs/openid-connect-discovery-1_0.html) `/.well-known/openid-configuration` endpoint to Nomad, but it is only enabled if the `server.oidc_issuer` parameter is set. Documented the parameter, but without a tutorial trying to actually _use_ this will be very hard.

I intentionally did *not* use https://github.com/hashicorp/cap for the OIDC configuration struct because it's built to be a *compliant* OIDC provider. Nomad is *not* trying to be compliant initially because compliance to the spec does not guarantee it will actually satisfy the requirements of third parties. I want to avoid the problem where in an attempt to be standards compliant we ship configuration parameters that lock us in to a certain behavior that we end up regretting. I want to add parameters and behaviors as there's a demonstrable need.

Users always have the escape hatch of providing their own OIDC configuration endpoint. Nomad just needs to know the Issuer so that the JWTs match the OIDC configuration. There's no reason the actual OIDC configuration JSON couldn't live in S3 and get served directly from there. Unlike JWKS the OIDC configuration should be static, or at least change very rarely.

This PR is just the endpoint extracted from #18535. The `RS256` algorithm still needs to be added in hopes of supporting third parties such as [AWS IAM OIDC Provider](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_create_oidc.html).

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2023-10-20 17:11:41 -07:00
James Rasell
1ffdd576bb agent: add config option to enable file and line log detail. (#18768) 2023-10-16 15:59:16 +01:00
Luiz Aoqui
a4b29a29cb vault: add jwt_backend_path agent config (#18606)
Add agent configuration to allow cluster operators to define the path
where the JWT auth method backend is mounted.
2023-09-28 18:02:30 -03:00
Luiz Aoqui
fed1992cea vault: remove use_identity agent config (#18592)
The initial intention behind the `vault.use_identity` configuration was
to indicate to Nomad servers that they would need to sign a workload
identities for allocs with a `vault` block.

But in order to support identity renewal, #18262 and #18431 moved the
token signing logic to the alloc runner since a new token needs to be
signed prior to the TTL expiring.

So #18343 implemented `use_identity` as a flag to indicate that the
workload identity JWT flow should be used when deriving Vault tokens for
tasks.

But this configuration value is set on servers so it is not available to
clients at the time of token derivation, making its meaning not clear: a
job may end up using the identity-based flow even when `use_identity` is
`false`.

The only reliable signal available to clients at token derivation time
is the presence of an `identity` block for Vault, and this is already
configured with the `vault.default_identity` configuration block, making
`vault.use_identity` redundant.

This commit removes the `vault.use_identity` configuration and
simplifies the logic on when an implicit Vault identity is injected into
tasks.
2023-09-27 17:44:07 -03:00
Luiz Aoqui
868aba57bb vault: update identity name to start with vault_ (#18591)
* vault: update identity name to start with `vault_`

In the original proposal, workload identities used to derive Vault
tokens were expected to be called just `vault`. But in order to support
multiple Vault clusters it is necessary to associate identities with
specific Vault cluster configuration.

This commit implements a new proposal to have Vault identities named as
`vault_<cluster>`.
2023-09-27 15:53:28 -03:00
Luiz Aoqui
5f951d506a docs: update Vault config for workload identity (#18503)
Update documentation for the agent configuration `vault` block for
workload identity support.
2023-09-14 19:38:36 -03:00
Tim Gross
77ca0bb8af docs: support multiple Vault and Consul clusters (ENT-only) (#18432)
This changeset is the documentation for supporting multiple Vault and Consul
clusters in Nomad Enterprise. It includes documentation changes for the agent
configuration (#18255), the namespace specification (#18425), and the vault,
consul, and service blocks of the jobspec (#18409).
2023-09-12 09:33:14 -04:00
Luiz Aoqui
e69e3c6677 docs: expand on where node_class may be used (#18288) 2023-08-23 15:59:43 -04:00
Esteban Barrios
65d562b760 config: add configurable content security policy (#18085) 2023-08-14 14:23:03 -04:00
Devashish Taneja
472693d642 server: add config to tune job versions retention. #17635 (#17939) 2023-08-07 14:47:40 -04:00
louievandyke
0d343f269a docs: updating to specify mTLS rpc endpoints (#17963) 2023-07-19 14:16:35 -04:00
James Rasell
079f5d4d8d docs: detail Consul ACL token env var config option. (#17859) 2023-07-10 14:26:18 +01:00
Luiz Aoqui
5db9e64cdd node pool: node pool upsert on multiregion node register (#17503)
When registering a node with a new node pool in a non-authoritative
region we can't create the node pool because this new pool will not be
replicated to other regions.

This commit modifies the node registration logic to only allow automatic
node pool creation in the authoritative region.

In non-authoritative regions, the client is registered, but the node
pool is not created. The client is kept in the `initialing` status until
its node pool is created in the authoritative region and replicated to
the client's region.
2023-06-13 11:28:28 -04:00
Luiz Aoqui
81f0b359dd node pools: register a node in a node pool (#17405) 2023-06-02 17:50:50 -04:00
Chris van Meer
c7c8bd4ce2 Updates to the UI block (#16328)
1. On the Consul address, following the recommendation for the HTTPS
   API on port 8501.
2. Add the hint to use HEX values for the colors.
2023-04-18 18:28:17 -07:00
Tim Gross
c3002db815 client: allow drain_on_shutdown configuration (#16827)
Adds a new configuration to clients to optionally allow them to drain their
workloads on shutdown. The client sends the `Node.UpdateDrain` RPC targeting
itself and then monitors the drain state as seen by the server until the drain
is complete or the deadline expires. If it loses connection with the server, it
will monitor local client status instead to ensure allocations are stopped
before exiting.
2023-04-14 15:35:32 -04:00
Seth Hoenig
2c44cbb001 api: enable support for setting original job source (#16763)
* api: enable support for setting original source alongside job

This PR adds support for setting job source material along with
the registration of a job.

This includes a new HTTP endpoint and a new RPC endpoint for
making queries for the original source of a job. The
HTTP endpoint is /v1/job/<id>/submission?version=<version> and
the RPC method is Job.GetJobSubmission.

The job source (if submitted, and doing so is always optional), is
stored in the job_submission memdb table, separately from the
actual job. This way we do not incur overhead of reading the large
string field throughout normal job operations.

The server config now includes job_max_source_size for configuring
the maximum size the job source may be, before the server simply
drops the source material. This should help prevent Bad Things from
happening when huge jobs are submitted. If the value is set to 0,
all job source material will be dropped.

* api: avoid writing var content to disk for parsing

* api: move submission validation into RPC layer

* api: return an error if updating a job submission without namespace or job id

* api: be exact about the job index we associate a submission with (modify)

* api: reword api docs scheduling

* api: prune all but the last 6 job submissions

* api: protect against nil job submission in job validation

* api: set max job source size in test server

* api: fixups from pr
2023-04-11 08:45:08 -05:00
Tim Gross
be0678bd84 docs: fix template retry attempts default documentation (#16667)
The configuration docs for `client.template.vault_retry`, `consul_retry`, and
`nomad_retry` incorrectly document the default number of attempts to be
unlimited (0). When we added these config blocks, we defaulted the fields to
`nil` for backwards compatibility, which causes them to fall back to the default
consul-template configuration values.
2023-03-28 08:27:06 -04:00
James Rasell
36825576b4 docs: fix-up legacy link in client config page. (#16678) 2023-03-28 09:32:34 +01:00