Commit Graph

4667 Commits

Author SHA1 Message Date
Luiz Aoqui
cce72cddfd docs: add Autoscaler query_window_offset config (#19942) 2024-02-20 17:01:30 -05:00
Michael Schurter
b3a4c80f8c docs: fix s/envoy-bootstrap-error redirect (#20015)
And cleanup whitespace
2024-02-20 12:26:26 -08:00
Tim Gross
45b2c34532 cni: add DNS set by CNI plugins to task configuration (#20007)
CNI plugins may set DNS configuration, but this isn't threaded through to the
task configuration so that we can write it to the `/etc/resolv.conf` file as
needed. Add the `AllocNetworkStatus` to the alloc hook resources so they're
accessible from the taskrunner. Any DNS entries provided by the user will
override these values.

Fixes: https://github.com/hashicorp/nomad/issues/11102
2024-02-20 10:17:27 -05:00
Tim Gross
c1b5850473 docs: add warning not to enable Consul tls.grpc.verify_incoming (#19970)
Consul does not support incoming TLS verification of Envoy. This failure results
in hard-to-understand errors like `SSLV3_ALERT_BAD_CERTIFICATE` in the Envoy
allocation logs. Leave a warning about this to users.

Closes: https://github.com/hashicorp/nomad/issues/19772
Closes: https://github.com/hashicorp/nomad/issues/16854
Ref: https://github.com/hashicorp/consul/issues/13088
2024-02-14 08:56:35 -05:00
Julien Castets
61941d8204 docs: autoscaler doc for max_scale_up and max_scale_down of target-value strategy (#19945)
See https://github.com/hashicorp/nomad-autoscaler/pull/848
2024-02-13 07:38:39 +00:00
Seth Hoenig
37c497628c docs: describe cloud environments in fingerprint denylist (#19952)
This PR changes the example of the client config option "fingerprint.denylist"
to include all the cloud environment fingerprinters. Each one contains a
2 second HTTP timeout to a metadata endpoint that does not exist if you are not
in that particular cloud. When run in serial on startup, this results in
an 8 second wait where nothing useful is happening.

Closes #16727
2024-02-12 09:57:29 -06:00
Phil Renaud
41c783aec2 Noting action name restrictions, and correcting those of auth methods and roles (#19905) 2024-02-08 12:01:22 -05:00
Luiz Aoqui
2a348ba714 docs: expand impact of verify_https_client=false (#19916)
When Nomad is configured with `verify_https_client=false` endpoints that
do not require an ACL token can be accessed without any other type of
authentication. Expand the docs to mention this effect.
2024-02-08 10:55:40 -05:00
Luiz Aoqui
7391a59695 docs: add note about stub list filtering (#19902)
When filtering list results, the filter expression is applied to the
full object, not the stub. This is useful because it allows users to
filter the list on fields not present in the object stub. But it can
also be confusing because some fields have different names, or only
exist in the stub, so the filter expression needs to reference fields
not present in returned data.

Filtering on the stub would reduce the confusion, but it would also
restrict users to only be able to filter on the fields in the stub,
which, by definition, are just a subset of the original fields.

Documenting this behaviour can help users understand unexpected errors
and results.
2024-02-07 16:41:07 -05:00
Kiara Grouwstra
1e04fc4613 Libraries & SDKs: add nix-nomad (#19808) 2024-02-06 20:47:23 -05:00
Luiz Aoqui
7daa854491 docs: remove duplicate entry for upstreams.config (#19877) 2024-02-06 20:44:02 -05:00
Luiz Aoqui
5825cefe51 docs: remove Docker cpuset_cpus config (#19882)
Nomad 1.7 refactored how CPU cores are assigned to tasks, making the
Docker-specific `cpuset_cpus` configuration no longer used.
2024-02-06 10:51:16 -05:00
Juana De La Cuesta
120c3ca3c9 Add granular control of SELinux labels for host mounts (#19839)
Add new configuration option on task's volume_mounts, to give a fine grained control over SELinux "z" label

* Update website/content/docs/job-specification/volume_mount.mdx

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>

* fix: typo

* func: make volume mount verification happen even on  mounts with no volume

---------

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2024-02-05 10:05:33 +01:00
Heat Hamilton
0b29a7d727 Update dependencies to match Next v14 in Dev Portal; updated husky workflow to v9; updated nvmrc to v18 2024-01-30 13:43:36 -05:00
Michael Schurter
a283a41613 docs: mention wildcards in namespace api docs (#19809)
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2024-01-24 11:52:28 -08:00
Luiz Aoqui
b7fa4447bd docs: autoscaler config for blocking query timeout (#19777) 2024-01-22 13:08:10 -05:00
Adrian Todorov
044eb0e048 docs: warnings about template dependencies, HCL2 clarifications (#19779) 2024-01-19 14:07:15 -05:00
Luiz Aoqui
fce30f342c docs: add lock_namespace autoscaler config (#19769)
Document the `high_availability.lock_namespace` configuration of the
Nomad Autoscaler.
2024-01-18 11:52:14 -05:00
Vijesh
3b4afea974 docs: note script checks don't support some Consul options (#19770)
Script checks don't support Consul's `success_before_passing`, `failures_before_critical`, or `failures_before_warning` because they're run by Nomad and not by Consul
2024-01-18 08:38:57 -05:00
Piotr Kazmierczak
8f99ba6b2c docs: add missing JWT auth method API documentation (#19757) 2024-01-17 16:03:08 +01:00
Tom Davies
5a11a28cac docs: updates link to Consul WLI migration docs (#19748) 2024-01-17 09:57:02 -05:00
Luiz Aoqui
e1e80f383e vault: add new nomad setup vault -check commmand (#19720)
The new `nomad setup vault -check` commmand can be used to retrieve
information about the changes required before a cluster is migrated from
the deprecated legacy authentication flow with Vault to use only
workload identities.
2024-01-12 15:48:30 -05:00
Luiz Aoqui
b2aa6ffd05 docs: fix Consul ACL requirements (#19721)
Even with the new workload identitiy based flow the Nomad servers still
need the `acl = "write"` permission in order to revoke service identity
tokens.
2024-01-11 15:52:23 -05:00
Seth Hoenig
9410c519ff drivers/raw_exec: remove plumbing for ineffective no_cgroups configuration (#19599)
* drivers/raw_exec: remove plumbing for ineffective no_cgroups configuration

* fix tests
2024-01-11 08:20:15 -06:00
CJ
c9cd8480fa docs: considerations for Stateful Workloads (#19077)
Co-authored-by: Adrian Todorov <adrian.todorov@hashicorp.com>
2024-01-10 16:06:45 -05:00
Tim Gross
0935f443dc vault: support allowing tokens to expire without refresh (#19691)
Some users with batch workloads or short-lived prestart tasks want to derive a
Vaul token, use it, and then allow it to expire without requiring a constant
refresh. Add the `vault.allow_token_expiration` field, which works only with the
Workload Identity workflow and not the legacy workflow.

When set to true, this disables the client's renewal loop in the
`vault_hook`. When Vault revokes the token lease, the token will no longer be
valid. The client will also now automatically detect if the Vault auth
configuration does not allow renewals and will disable the renewal loop
automatically.

Note this should only be used when a secret is requested from Vault once at the
start of a task or in a short-lived prestart task. Long-running tasks should
never set `allow_token_expiration=true` if they obtain Vault secrets via
`template` blocks, as the Vault token will expire and the template runner will
continue to make failing requests to Vault until the `vault_retry` attempts are
exhausted.

Fixes: https://github.com/hashicorp/nomad/issues/8690
2024-01-10 14:49:02 -05:00
Luiz Aoqui
5267eec3ad vault: fix token revocation during workflow migration (#19689)
When transitioning from the legacy token-based workflow to the new JWT
workflow for Vault the previous code would instantiate a no-op Vault if
the server configuration had a `default_identity` block.

This no-op client returned an error for some of its operations were
called, such as `LookupToken` and `RevokeTokens`. The original intention
was that, in the new JWT workflow, none of these methods should be
called, so returning an error could help surface potential bugs.

But the `RevokeTokens` and `MarkForRevocation` methods _are_ called even
in the JWT flow. When a leadership transition happens, the new server
looks for unused Vault accessors from state and tries to revoke them.
Similarly, the `RevokeTokens` method is called every time the
`Node.UpdataStatus` and `Node.UpdateAlloc` RPCs are made by clients, as
the Nomad server tries to find unused Vault tokens for the node/alloc.

Since the new JWT flow does not require Nomad servers to contact Vault,
calling `RevokeTokens` and `MarkForRevocation` is not able to complete
without a Vault token, so this commit changes the logic to use the no-op
Vault client when no token is configured. It also updates the client
itself to not error if these methods are called, but to rather just log
so operators can be made aware that there are Vault tokens created by
Nomad that have not been force-expired.

When migrating an existing cluster to the new workload identity based
flow, Nomad operators must first upgrade the Nomad version without
removing any of the existing Vault configuration. Doing so can prevent
Nomad servers from managing and cleaning-up existing Vault tokens during
a leadership transition and node or alloc updates.

Operators must also resubmit all jobs with a `vault` block so they are
updated with an `identity` for Vault. Skipping this step may cause
allocations to fail if their Vault token expires (if, for example, the
Nomad client stops running for TTL/2) or if they are rescheduled, since
the new client will try to follow the legacy flow which will fail if the
Nomad server configuration for Vault has already been updated to remove
the Vault address and token.
2024-01-10 13:28:46 -05:00
Tim Gross
d3e5cae1eb consul: support admin partitions (#19665)
Add support for Consul Enterprise admin partitions. We added fingerprinting in
https://github.com/hashicorp/nomad/pull/19485. This PR adds a `consul.partition`
field. The expectation is that most users will create a mapping of Nomad node
pool to Consul admin partition. But we'll also create an implicit constraint for
the fingerprinted value.

Fixes: https://github.com/hashicorp/nomad/issues/13139
2024-01-10 10:41:29 -05:00
Daniel Peinhopf
9eb357020d Docs: Alternative IIS Task Driver (#19411) 2024-01-10 14:14:30 +00:00
Seth Hoenig
cb7d078c1d drivers/raw_exec: enable configuring raw_exec task to have no memory limit (#19670)
* drivers/raw_exec: enable configuring raw_exec task to have no memory limit

This PR makes it possible to configure a raw_exec task to not have an
upper memory limit, which is how the driver would behave pre-1.7.

This is done by setting memory_max = -1. The cluster (or node pool) must
have memory oversubscription enabled.

* cl: add cl
2024-01-09 14:57:13 -06:00
Egor Mikhailov
18f49e015f auth: add new optional OIDCDisableUserInfo setting for OIDC auth provider (#19566)
Add new optional `OIDCDisableUserInfo` setting for OIDC auth provider which
disables a request to the identity provider to get OIDC UserInfo.

This option is helpful when your identity provider doesn't send any additional
claims from the UserInfo endpoint, such as Microsoft AD FS OIDC Provider:

> The AD FS UserInfo endpoint always returns the subject claim as specified in the
> OpenID standards. AD FS doesn't support additional claims requested via the
> UserInfo endpoint

Fixes #19318
2024-01-09 13:41:46 -05:00
Tim Gross
c875f3e49a docs: expand docs on implicit ACL capabilities grants (#19681)
An audit of Nomad's ACLs resulted in some confusion around whether the
`NamespaceValidator` method is conjunctive ("add", as implied by the docs) or
disjunctive ("or", as it is by design). Clarify the ACL documentation as
follows:

* Call out where fine-grained capabilities imply grants to other
  capabilities (for example, that `csi-read-volume` grants `csi-list-volume`).
* Fix an incorrectly documented ACL requirement for the CSI List External
  Volumes API.
* Clarify how ACLs are expected to work for the two search API endpoints, such
  that you need list/read access to the objects in the search context.
2024-01-09 13:25:05 -05:00
Tim Gross
a399f16a31 docs: describe cgroup controller requirements (#19493)
Nomad can only use cgroups to control resource requirements if all the cgroups
controllers are actually enabled. Add this to our requirements documentation as
well as the impacted `exec` and `java` task drivers.
2024-01-08 10:01:14 -05:00
am-ak
7dc82f233f [DOCS] Update docker.mdx (#19657)
Removed info regarding development of Nomad
2024-01-08 14:32:57 +00:00
Shantanu Gadgil
6bbd3b0cec reschedule is at group level (#19653)
Co-authored-by: James Rasell <jrasell@hashicorp.com>
2024-01-08 10:54:52 +00:00
Seth Hoenig
4b3ee77d6b docs: update raw_exec driver docs and 1.7 upgrade notes (#19598) 2024-01-04 08:26:46 -06:00
Seth Hoenig
ccfb13a72d e2e: add test for raw_exec memory_max configuration (#19596)
* e2e: add test for raw_exec memory_max configuration

* docs: note raw_exec supports memory_max in resources documentation
2024-01-04 08:25:56 -06:00
Tim Gross
e7ca2b51ad vault: ignore allow_unauthenticated config if identity is set (#19585)
When the server's `vault` block has a default identity, we don't check the
user's Vault token (and in fact, we warn them on job submit if they've provided
one). But the validation hook still checks for a token if
`allow_unauthenticated` is set to true. This is a misconfiguration but there's
no reason for Nomad not to do the expected thing here.

Fixes: https://github.com/hashicorp/nomad/issues/19565
2024-01-02 16:46:34 -05:00
Luiz Aoqui
cd8a03431c docs: add scale_in_protection to AWS Autoscaler (#19546)
Document new `scale_in_protection` configuration of the AWS ASG
Autoscaler target plugin.
2024-01-02 14:48:56 -05:00
Luiz Aoqui
0bef6f05a2 docs: add note about * namespace on autoscaling (#19547)
Explain the behaviour when the wildcard namespace value `*` is used to
configure the Nomad Autoscaler agent.
2024-01-02 14:48:20 -05:00
Luiz Aoqui
7eecca65ec docs: add autoscaler AWS retry_attempts config (#19549)
Document the Nomad Autoscaler AWS target plugin config `retry_attempts`.
2024-01-02 14:08:10 -05:00
Luiz Aoqui
56b1bf3240 docs: add policy_id and target_name metric labels (#19551) 2024-01-02 14:06:37 -05:00
Luiz Aoqui
1694e69b77 docs: clarify the behaviour of lower_bound and upper_bound (#19552) 2024-01-02 14:06:07 -05:00
Luiz Aoqui
09731442e4 docs: add node_pool autoscaler node selector (#19548)
Document the `node_pool` node selector configuration.
2024-01-02 10:19:58 -05:00
James Rasell
76ba3e10e7 docs: add Nomad Autoscaler HA configuration details. (#19010)
Co-authored-by: Piotr Kazmierczak <470696+pkazmierczak@users.noreply.github.com>
2023-12-27 08:00:07 +00:00
Mike Nomitch
dd15bdff9c Adds vault role to JWT claims if specified in jobspec (#19535) 2023-12-20 15:51:34 -08:00
Piotr Kazmierczak
84115d732d docs: correct Nomad Autoscaler example link in HA vars documentation (#19537) 2023-12-20 16:26:35 +01:00
Luiz Aoqui
95766aaa1b docs: add Submission parameter to job update (#19516) 2023-12-19 10:09:16 -05:00
Etienne Bruines
f18d5c7c32 docs: fix migration to workload identity links (#19508)
Fixes #19507
2023-12-18 21:27:38 -05:00
Tim Gross
14200a800f docs: note replacement of - characters in meta env vars (#19501)
The keys of `meta` fields have all characters outside of `[A-Za-z0-9_.]`
replaced by underscores when we create `NOMAD_META` environment variables. Make
sure this replacement is documented.

Fixes: https://github.com/hashicorp/nomad/issues/15359
2023-12-15 15:48:23 -05:00