Commit Graph

26085 Commits

Author SHA1 Message Date
Aimee Ukasick
3d06eef65d Docs: CE-705 Highlight that user must backup keyring separately 2024-08-26 11:25:26 -05:00
Daniel Bennett
a6e29057d6 networking: refactor some iptables for testability (#23856) 2024-08-23 10:05:46 -05:00
Sujata Roy
36522ec632 Merge pull request #23850 from hashicorp/Nomad-NET-9394
command/debug: capture more logs by default
2024-08-22 10:43:28 -07:00
Michael Schurter
3572fd58cf docs: add cl for #23850 2024-08-22 09:19:05 -07:00
Michael Schurter
8b0a88e2f7 docs: update defaults for operator debug 2024-08-22 09:17:03 -07:00
Seth Hoenig
8b093a6a5d scheduler: support for device - aware numa scheduling (#1760) (#23837)
(CE backport of ENT 59433d56c7215c0b8bf33764f41b57d9bd30160f (without ent files))

* scheduler: enhance numa aware scheduling with support for devices

* cr: add comments
2024-08-20 07:53:04 -05:00
Phil Renaud
5fcec1f8cc [ui] Show "Scaled Down" as a valid job status when task groups' counts are set to zero (#23829)
* Scaled Down as a status

* Scaled Down as a steady-state job panel status as well

* Test for badge status and changelog
2024-08-19 13:45:19 -04:00
Seth Hoenig
4aeb279534 e2e: fix module name of an artifact we download (#23843)
Because this will definitely never change again, for sure, trust me.
2024-08-19 10:25:35 -05:00
Phil Renaud
fbd8d62955 Check for target on click to prevent double-opening cmd+clicked links on jobs index (#23832) 2024-08-19 10:20:24 -04:00
Florian Apolloner
d6be784e2d namespaces: add allowed network modes to capabilities. (#23813) 2024-08-16 09:47:19 -04:00
Piotr Kazmierczak
0bc9796d3b client: log an error message if total detected cpu is zero (#23827) 2024-08-15 18:31:27 +02:00
Piotr Kazmierczak
f8e7905e24 docs: dmidecode manual installation as post-install step (#23823)
Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
2024-08-15 17:14:16 +02:00
Tim Gross
682c8c0c81 cgroupslib: allow initial controller check with delegated cgroups v2 (#23803)
During Nomad client initialization with cgroups v2, we assert that the required
cgroup controllers are available in the root `cgroup.subtree_control` file by
idempotently writing to the file. But if Nomad is running with delegated
cgroups, this will fail file permissions checks even if the subtree control file
already has the controllers we need.

Update the initialization to first check if the controllers are missing before
attempting to write to them. This allows cgroup delegation so long as the
cluster administrator has pre-created a Nomad owned cgroups tree and set the
`Delegate` option in a systemd override. If not, initialization fails in the
existing way.

Although this is one small step along the way to supporting a rootless Nomad
client, running Nomad as non-root is still unsupported. I've intentionally not
documented setting up cgroup delegation in this PR, as this PR is insufficient
by itself to have a secure and properly-working rootless Nomad client.

Ref: https://github.com/hashicorp/nomad/issues/18211
Ref: https://github.com/hashicorp/nomad/issues/13669
2024-08-14 16:58:21 -04:00
Tim Gross
6aa503f2bb docker: disable cpuset management for non-root clients (#23804)
Nomad clients manage a cpuset cgroup for each task to reserve or share CPU
cores. But Docker owns its own cgroups, and attempting to set a parent cgroup
that Nomad manages runs into conflicts with how runc manages cgroups via
systemd. Therefore Nomad must run as root in order for cpuset management to ever
be compatible with Docker.

However, some users running in unsupported configurations felt that the changes
we made in Nomad 1.7.0 to ensure Nomad was running correctly represented a
regression. This changeset disables cpuset management for non-root Nomad
clients. When running Nomad as non-root, the driver will not longer reconcile
cpusets with Nomad and `resources.cores` will behave incorrectly (but the driver
will still run).

Although this is one small step along the way to supporting a rootless Nomad
client, running Nomad as non-root is still unsupported. This PR is insufficient
by itself to have a secure and properly-working rootless Nomad client.

Ref: https://github.com/hashicorp/nomad/issues/18211
Ref: https://github.com/hashicorp/nomad/issues/13669
Ref: https://hashicorp.atlassian.net/browse/NET-10652
Ref: https://github.com/opencontainers/runc/blob/main/docs/systemd.md
2024-08-14 16:44:13 -04:00
Martijn Vegter
aded4b3500 docs: remove remaining references to network_speed config (#23792) 2024-08-14 14:14:38 -04:00
Martina Santangelo
73ce56ba27 networking: refactor building nomad bridge config (#23772) 2024-08-14 12:43:31 -05:00
Seth Hoenig
db0642099e build: update golangci-lint to 1.60.1 (#23807)
* build: update golangci-lint to 1.60.1

* ci: update golangci-lint to v1.60.1

Helps with go1.23 compatability. Introduces some breaking changes / newly
enforced linter patterns so those are fixed as well.
2024-08-14 10:09:31 -05:00
Seth Hoenig
f89335e01b build: update to go1.22.6 (#23805) 2024-08-14 09:20:14 -05:00
Seth Hoenig
0bcfd9a266 build: apt update before apt install (#23806) 2024-08-14 08:58:15 -05:00
Piotr Kazmierczak
c1362c03df docs: minimal Consul policy for Nomad agents needs node:write (#23800) 2024-08-13 17:53:21 +02:00
Piotr Kazmierczak
b34a6fe10b Merge pull request #23797 from hashicorp/post-1.8.3-release
Post 1.8.3 release
2024-08-13 14:36:01 +02:00
Piotr Kazmierczak
0ab7e2219a assets rebuild 2024-08-13 12:48:08 +02:00
Piotr Kazmierczak
c021857659 Merge release 1.8.3 files 2024-08-13 12:23:40 +02:00
hc-github-team-nomad-core
7c29e7cb7b Prepare for next release 2024-08-13 12:21:21 +02:00
hc-github-team-nomad-core
8489dadf57 Generate files for 1.8.3 release 2024-08-13 12:21:21 +02:00
Tim Gross
b7419bc940 api: only set url field in config if previously unset (#23785)
In #16872 we added support for configuring the API client with a unix domain
socket. In order to set the host correctly, we parse the address before mutating
the Address field in the configuration. But this prevents the configuration from
being reused across multiple clients, as the next time we parse the address it
will no longer be pointing to the socket. This breaks consumers like the
autoscaler, which reuse the API config between plugins.

Update the `NewClient` constructor to only override the `url` field if it hasn't
already been parsed. Include a test demonstrating safe reuse with a unix domain
socket.

Ref: https://github.com/hashicorp/nomad-autoscaler/issues/944
Ref: https://github.com/hashicorp/nomad-autoscaler/pull/945
2024-08-09 13:28:04 -04:00
Tim Gross
920f4702d6 testing: fix skip comment on RequireWindows helper (#23776) 2024-08-09 09:07:25 -04:00
Tim Gross
ef116b12d5 metrics: add client.tasks state metrics (#23773)
Although we have `client.allocations` metrics to track allocation states on a
client, having separate metrics for `client.tasks` will allow operators to
identify that there are individual tasks in an unexpected state in an otherwise
healthy allocation.

Fixes: https://github.com/hashicorp/nomad/issues/23770
2024-08-09 09:02:17 -04:00
VPanteleev-S7
2e5d6192a7 docs: illustrate how to use the obtained token (#19557)
Currently, the page doesn't explain how to do things as the logged in user.
2024-08-08 15:35:26 -04:00
Kartik Prajapati
3a3e63e2e1 cli: add role update functionality to acl token update (#18532) 2024-08-08 15:33:36 -04:00
Farbod Ahmadian
bb4c4fbd49 cli: show warning when creating token if policy doesn't exist (#16437) 2024-08-08 11:04:55 -04:00
Tim Gross
9543e740af docker: fix delimiter for selinux label for read-only volumes (#23750)
The Docker driver's `volume` field to specify bind-mounts takes a list of
strings that consist of three `:`-delimited fields: source, destination, and
options. We append the SELinux label from the plugin configuration as the third
field. But when the user has already specified the volume is read-only with
`:ro`, we're incorrectly appending the SELinux label with another `:` instead of
the required `,`.

Combine the options into a single field value before appending them to the bind
mounts configuration. Updated the tests to split out Windows behavior (which
doesn't accept options) and to ensure the test task has the expected environment
for bind mounts.

Fixes: https://github.com/hashicorp/nomad/issues/23690
2024-08-08 09:08:01 -04:00
johncooler
3214c2bd62 docs: remove duplicate config option (#23768) 2024-08-08 08:25:25 -04:00
Tim Gross
bcb0ee3031 identity: prevent missing task identity from panicking server (#23763)
Tasks have a default identity created during canonicalization. If this default
identity is somehow missing, we'll hit panics when trying to create and sign the
claims in the plan applier. Fallback to the default identity if it's missing
from the task.

This changeset will need a different implementation in 1.7.x+ent backports, as
the constructor for identities was refactored significantly in #23708. The panic
cannot occur in 1.6.x+ent.

Fixes: https://github.com/hashicorp/nomad/issues/23758
2024-08-07 15:26:20 -04:00
Daniel Bennett
d131c41943 cni: network.cni job updates should replace allocs (#23764)
a change to the network{cni{}} block means that
the user wants the network config to change,
and that only happens during initial alloc setup,
so we need to replace the alloc(s) to get
fresh network(s) to reconfigure from scratch.

e.g. a job plan diff like this

```
+/- Task Group: "g" (1 in-place update)
  + Network {
    + CNIConfig {
      + a: "ayy"
      }
```

should instead be

```
+/- Task Group: "g" (1 create/destroy update)
  + Network {
    + CNIConfig {
      + a: "ayy"
      }
```
2024-08-07 12:13:11 -05:00
Aimee Ukasick
20511fa64d docs: Clarify namespace rules matching criteria. (#23752)
Clarify how Nomad evaluates policy rules.

Fixes: #20118
Jira: https://hashicorp.atlassian.net/browse/CE-695

Related tutorial PR: https://github.com/hashicorp/tutorials/pull/2205
2024-08-07 09:28:38 -04:00
Tim Gross
4a5921cb16 acl: disallow leading / on variable paths (#23757)
The path for a Variable never begins with a leading `/`, because it's stripped
off in the API before it ever gets to the state store. The CLI and UI allow the
leading `/` for convenience, but this can be misleading when it comes to writing
ACL policies. An ACL policy with a path starting with a leading `/` will never
match.

Update the ACL policy parser so that we prevent an incorrect variable path in
the policy.

Fixes: https://github.com/hashicorp/nomad/issues/23730
2024-08-07 09:26:18 -04:00
Juana De La Cuesta
218fa82d02 Merge pull request #23756 from hashicorp/backport-versions
build: revert change to active versions
2024-08-07 08:59:14 +02:00
Tim Gross
8a774702c7 build: revert change to active versions
In #23720 we removed the 1.7.x and 1.6.x release branches from the "active"
versions for backports. But we forgot that we needed these backports to remain
active for documentation backports because of versioned docs.
2024-08-06 16:33:01 -04:00
Aimee Ukasick
021692eccf docs: refactor CNI plugin content (#23707)
- Pulled common content from multiple pages into new partials
- Refactored install/index to be OS-based so I could add linux-distro-based instructions to install-consul-cni-plugins.mdx partial. The tab groups on the install/index page do match and change focus as expected.
- Moved CNI overview-type content to networking/index
- Refactored networking/cni to include install CNI plugins and configuration content (from install/index).
- Moved CNI plugins explanation in bridge mode configuration section into bullet points. They had been #### headings, which aren't rendered in the R page TOC. I tried to simplify and format the bullet point content to be easier to scan.

Ref: https://hashicorp.atlassian.net/browse/CE-661
Fixes: https://github.com/hashicorp/nomad/issues/23229
Fixes: https://github.com/hashicorp/nomad/issues/23583
2024-08-06 14:47:46 -04:00
Deniz Onur Duzgun
0f7b8698ec security: fix write symlink escape on the same allocdir path (#23738)
Resolves symlink escape when unarchiving by removing existing paths within the same allocation directory which can occur by writing a header that points to a symlink that lives outside of the sandbox environment. This exploit requires first compromising the Nomad client agent at the source allocation.

Ref: https://hashicorp.atlassian.net/browse/NET-10607
Ref: https://github.com/hashicorp/nomad-enterprise/pull/1725
2024-08-05 16:23:27 -04:00
Tim Gross
b25f1b66ce resources: allow job authors to configure size of secrets tmpfs (#23696)
On supported platforms, the secrets directory is a 1MiB tmpfs. But some tasks
need larger space for downloading large secrets. This is especially the case for
tasks using `templates`, which need extra room to write a temporary file to the
secrets directory that gets renamed to the old file atomically.

This changeset allows increasing the size of the tmpfs in the `resources`
block. Because this is a memory resource, we need to include it in the memory we
allocate for scheduling purposes. The task is already prevented from using more
memory in the tmpfs than the `resources.memory` field allows, but can bypass
that limit by writing to the tmpfs via `template` or `artifact` blocks.

Therefore, we need to account for the size of the tmpfs in the allocation
resources. Simply adding it to the memory needed when we create the allocation
allows it to be accounted for in all downstream consumers, and then we'll
subtract that amount from the memory resources just before configuring the task
driver.

For backwards compatibility, the default value of 1MiB is "free" and ignored by
the scheduler. Otherwise we'd be increasing the allocated resources for every
existing alloc, which could cause problems across upgrades. If a user explicitly
sets `resources.secrets = 1` it will no longer be free.

Fixes: https://github.com/hashicorp/nomad/issues/2481
Ref: https://hashicorp.atlassian.net/browse/NET-10070
2024-08-05 16:06:58 -04:00
Tim Gross
e684636aed cli: add option to return original HCL in job inspect command (#23699)
In 1.6.0 we shipped the ability to review the original HCL in the web UI, but
didn't follow-up with an equivalent in the command line. Add a `-hcl` flag to
the `job inspect` command.

Closes: https://github.com/hashicorp/nomad/issues/6778
2024-08-05 15:35:18 -04:00
Tim Gross
bc50eebebd workload identity: add support for extra claims config for Vault (#23675)
Although we encourage users to use Vault roles, sometimes they're going to want
to assign policies based on entity and pre-create entities and aliases based on
claims. This allows them to use single default role (or at least small number of
them) that has a templated policy, but have an escape hatch from that.

When defining Vault entities the `user_claim` must be unique. When writing Vault
binding rules for use with Nomad workload identities the binding rule won't be
able to create a 1:1 mapping because the selector language allows accessing only
a single field. The `nomad_job_id` claim isn't sufficient to uniquely identify a
job because of namespaces. It's possible to create a JWT auth role with
`bound_claims` to avoid this becoming a security problem, but this doesn't allow
for correct accounting of user claims.

Add support for an `extra_claims` block on the server's `default_identity`
blocks for Vault. This allows a cluster administrator to add a custom claim on
all allocations. The values for these claims are interpolatable with a limited
subset of fields, similar to how we interpolate the task environment.

Fixes: https://github.com/hashicorp/nomad/issues/23510
Ref: https://hashicorp.atlassian.net/browse/NET-10372
Ref: https://hashicorp.atlassian.net/browse/NET-10387
2024-08-05 15:01:54 -04:00
Aimee Ukasick
cbacdb2041 DOCS: CE-659 chroot limitations for isolated fork/exec driver (#23739) 2024-08-05 14:35:54 -04:00
Tim Gross
0b9defe6e4 refactor identity claims constructor to builder (#23708)
Upcoming work to add extensibility to identity claims for Vault (ref #23510)
will require exposing server configuration and more objects from state to the
process of creating an `IdentityClaims` struct.

Depending on how we inject these parameters into the constructor, we end up
creating circular dependencies or a lot more logic in the setup in the plan
applier and alloc endpoint. There are three contexts where we call
`NewIdentityClaims`: the plan applier (where we only care about the default
identity), signing task identities, and signing service identities. Each needs
different parameters. So we'll refactor the constructor as a builder with
methods that the caller can decide to use (or not) depending on context. I've
pulled this work out of #23675 to make it easier to review separately.

Ref: https://github.com/hashicorp/nomad/issues/23510
Ref: https://github.com/hashicorp/nomad/pull/23675
Ref: https://hashicorp.atlassian.net/browse/NET-10372
Ref: https://hashicorp.atlassian.net/browse/NET-10387
2024-08-05 14:31:56 -04:00
Tim Gross
9ff7437b06 docs: document client.alloc_mounts_dir configuration (#23733)
In Nomad 1.8.0 we introduced the `alloc_mounts_dir` to support unveil filesystem
isolation, but we didn't document the configuration value.
2024-08-05 11:59:47 -04:00
dependabot[bot]
2c8ee29ade chore(deps): bump github.com/moby/term (#23587)
Bumps [github.com/moby/term](https://github.com/moby/term) from 0.0.0-20210619224110-3f7ff695adc6 to 0.5.0.
- [Commits](https://github.com/moby/term/commits/v0.5.0)

---
updated-dependencies:
- dependency-name: github.com/moby/term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-05 11:47:23 -04:00
dependabot[bot]
1ba16f11ec chore(deps): bump github.com/containernetworking/cni from 1.1.2 to 1.2.3 (#23701)
Bumps [github.com/containernetworking/cni](https://github.com/containernetworking/cni) from 1.1.2 to 1.2.3.
- [Release notes](https://github.com/containernetworking/cni/releases)
- [Commits](https://github.com/containernetworking/cni/compare/v1.1.2...v1.2.3)

---
updated-dependencies:
- dependency-name: github.com/containernetworking/cni
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-05 11:45:57 -04:00
dependabot[bot]
8e6ccf38ff chore(deps): bump github.com/docker/docker (#23731)
Bumps [github.com/docker/docker](https://github.com/docker/docker) from 27.0.2+incompatible to 27.1.1+incompatible.
- [Release notes](https://github.com/docker/docker/releases)
- [Commits](https://github.com/docker/docker/compare/v27.0.2...v27.1.1)

---
updated-dependencies:
- dependency-name: github.com/docker/docker
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-05 11:41:54 -04:00