Commit Graph

391 Commits

Author SHA1 Message Date
Deniz Onur Duzgun
1cc99cc1b4 bug: resolve type conversion alerts (#20553) 2024-05-15 13:22:10 -04:00
James Rasell
5041460043 core: do not create evaluations within batch deregister endpoint. (#20510)
The batch deregister RPC endpoint is only used by the internal
garbage collection process, it is not exposed via the HTTP API or
used anywhere else.

The GC process ensures that a job can only be removed from state
if all related evaluations and allocations are in a state that
means they can also be removed from state. This means that we do
not need to create evaluations when jobs are being deregistered
via this endpoint.
2024-05-07 07:39:13 +01:00
James Rasell
3f866a7e82 test: regenerate test TLS certificates. (#20511) 2024-05-02 13:58:32 +01:00
Daniel Bennett
ca1860ae76 state: enable more reverse sorting (#20410)
* mainly jobs endpoint
* update call sites
* add new sort helpers
* put sorting in a separate file
2024-04-16 15:10:11 -05:00
Seth Hoenig
ae6c4c8e3f deps: purge use of old x/exp packages (#20373) 2024-04-12 08:29:00 -05:00
Tim Gross
76009d89af tproxy: networking hook changes (#20183)
When `transparent_proxy` block is present and the network mode is `bridge`, use
a different CNI configuration that includes the `consul-cni` plugin. Before
invoking the CNI plugins, create a Consul SDK `iptables.Config` struct for the
allocation. This includes:

* Use all the `transparent_proxy` block fields
* The reserved ports are added to the inbound exclusion list so the alloc is
  reachable from outside the mesh
* The `expose` blocks and `check` blocks with `expose=true` are added to the
  inbound exclusion list so health checks work.

The `iptables.Config` is then passed as a CNI argument to the `consul-cni`
plugin.

Ref: https://github.com/hashicorp/nomad/issues/10628
2024-04-04 17:01:07 -04:00
Seth Hoenig
bd2a809135 subproc: lazy lookup nomad binary in self call (#20231) 2024-03-26 12:33:06 -05:00
Seth Hoenig
77889a16fb exec2: more tweaks to driver harness (#20221)
Also add an explicit exit code to subproc package for when a child
process is instructed to run an unrunnable command (i.e. cannot be
found or is not executable) - with the 127 return code folks using bash
are familiar with
2024-03-26 08:02:41 -05:00
Michael Schurter
23e4b7c9d2 Upgrade go-msgpack to v2 (#20173)
Replaces #18812

Upgraded with:
```
find . -name '*.go' -exec sed -i s/"github.com\/hashicorp\/go-msgpack\/codec"/"github.com\/hashicorp\/go-msgpack\/v2\/codec/" '{}' ';'
find . -name '*.go' -exec sed -i s/"github.com\/hashicorp\/net-rpc-msgpackrpc"/"github.com\/hashicorp\/net-rpc-msgpackrpc\/v2/" '{}' ';'
go get
go get -v -u github.com/hashicorp/raft-boltdb/v2
go get -v github.com/hashicorp/serf@5d32001edfaa18d1c010af65db707cdb38141e80
```

see https://github.com/hashicorp/go-msgpack/releases/tag/v2.1.0
for details
2024-03-21 11:44:23 -07:00
Seth Hoenig
286dce7a2a exec2: add a client.users configuration block (#20093)
* exec: add a client.users configuration block

For now just add min/max dynamic user values; soon we can also absorb
the "user.denylist" and "user.checked_drivers" options from the
deprecated client.options map.

* give the no-op pool implementation a better name

* use explicit error types to make referencing them cleaner in tests

* use import alias to not shadow package name
2024-03-08 16:02:32 -06:00
Seth Hoenig
67554b8f91 exec2: implement dynamic workload users taskrunner hook (#20069)
* exec2: implement dynamic workload users taskrunner hook

This PR impelements a TR hook for allocating dynamic workload users from
a pool managed by the Nomad client. This adds a new task driver Capability,
DynamicWorkloadUsers - which a task driver must indicate in order to make
use of this feature.

The client config plumbing is coming in a followup PR - in the RFC we
realized having a client.users block would be nice to have, with some
additional unrelated options being moved from the deprecated client.options
config.

* learn to spell
2024-03-06 09:34:27 -06:00
Seth Hoenig
57bd39061b exec2: implement a dynamic users pool (#20065)
* exec2: implement a dynamic users pool

This PR adds an implementation of a Pool from which dynamic users can
be allocated on behalf of tasks making use of an upcoming feature of
Nomad client (dynamic users).

A task hook and client plumbing, etc. will be in follow up PRs.

* no need for randomness assertion
2024-03-05 07:35:20 -06:00
James Rasell
4b46ff8ce0 test: fix test datarace within helper broker. (#19974) 2024-02-15 08:54:56 +00:00
Tim Gross
df86503349 template: sandbox template rendering
The Nomad client renders templates in the same privileged process used for most
other client operations. During internal testing, we discovered that a malicious
task can create a symlink that can cause template rendering to read and write to
arbitrary files outside the allocation sandbox. Because the Nomad agent can be
restarted without restarting tasks, we can't simply check that the path is safe
at the time we write without encountering a time-of-check/time-of-use race.

To protect Nomad client hosts from this attack, we'll now read and write
templates in a subprocess:

* On Linux/Unix, this subprocess is sandboxed via chroot to the allocation
  directory. This requires that Nomad is running as a privileged process. A
  non-root Nomad agent will warn that it cannot sandbox the template renderer.

* On Windows, this process is sandboxed via a Windows AppContainer which has
  been granted access to only to the allocation directory. This does not require
  special privileges on Windows. (Creating symlinks in the first place can be
  prevented by running workloads as non-Administrator or
  non-ContainerAdministrator users.)

Both sandboxes cause encountered symlinks to be evaluated in the context of the
sandbox, which will result in a "file not found" or "access denied" error,
depending on the platform. This change will also require an update to
Consul-Template to allow callers to inject a custom `ReaderFunc` and
`RenderFunc`.

This design is intended as a workaround to allow us to fix this bug without
creating backwards compatibility issues for running tasks. A future version of
Nomad may introduce a read-only mount specifically for templates and artifacts
so that tasks cannot write into the same location that the Nomad agent is.

Fixes: https://github.com/hashicorp/nomad/issues/19888
Fixes: CVE-2024-1329
2024-02-08 10:40:24 -05:00
Tim Gross
0d3cd1427f migration: check symlink sources during archive unpack
During allocation directory migration, the client was not checking that any
symlinks in the archive aren't pointing to somewhere outside the allocation
directory. While task driver sandboxing will protect against processes inside
the task from reading/writing thru the symlink, this doesn't protect against the
client itself from performing unintended operations outside the sandbox.

This changeset includes two changes:

* Update the archive unpacking to check the source of symlinks and require that
  they fall within the sandbox.
* Fix a bug in the symlink check where it was using `filepath.Rel` which doesn't
  work for paths in the sibling directories of the sandbox directory. This bug
  doesn't appear to be exploitable but caused errors in testing.

Fixes: https://github.com/hashicorp/nomad/issues/19887
2024-02-08 10:40:24 -05:00
Luiz Aoqui
ce710d49fd cli: fix tls ca create command with -domain (#19892)
The current implementation of the `nomad tls ca create` command
ovierrides the value of the `-domain` flag with `"nomad"` if no
additional customization is provided.

This results in a certificate for the wrong domain or an error if the
`-name-constraint` flag is also used.

THe logic for `IsCustom()` also seemed reversed. If all custom fields
are empty then the certificate is _not_ customized, so `IsCustom()`
should return false.
2024-02-07 16:40:51 -05:00
Seth Hoenig
b50b81e488 users: refactor method for getting UID from username (#19840)
This PR refactors a helper function for getting the UID associated with
a given username to also return the GID and home directory. Also adds
unit tests on the known values of root and nobody user on Ubuntu Linux.
2024-01-29 13:56:30 -06:00
Luiz Aoqui
41277f823f license: fix some imports of BUSL-1.1 in MPL-2.0 (#19832)
Some packages licensed under MPL-2.0 were incorrectly importing code
from packages licensed under BUSL-1.1.

Not all imports are fixed here as they will require additional work to
untangle them. To help track progress this commit adds a Semgrep rule
that detects incorrect BUSL-1.1 imports in MPL-2.0 packages.
2024-01-29 12:04:12 -05:00
James Rasell
ff2d0d6453 cli: Fix dummy FSM create to ensure snapshot state command works. (#19630)
The Nomad state store function was recently updated to validate
certain parameters, fixing a panic condition. This change meant
dummy FSM used for the snapshot state command was always failing
this validation and the command no longer worked.

This change adds the required parameter to pass validation and
therefore makes the CLI command functional again.
2024-01-05 16:00:24 +00:00
Marvin Chin
be8575a8a2 Fix server shutdown not waiting for worker run completion (#19560)
* Move group into a separate helper module for reuse

* Add shutdownCh to worker

The shutdown channel is used to signal that worker has stopped.

* Make server shutdown block on workers' shutdownCh

* Fix waiting for eval broker state change blocking indefinitely

There was a race condition in the GenericNotifier between the
Run and WaitForChange functions, where WaitForChange blocks
trying to write to a full unsubscribeCh, but the Run function never
reads from the unsubscribeCh as it has already stopped.

This commit fixes it by unblocking if the notifier has been stopped.

* Bound the amount of time server shutdown waits on worker completion

* Fix lostcancel linter error

* Fix worker test using unexpected worker constructor

* Add changelog

---------

Co-authored-by: Marvin Chin <marvinchin@users.noreply.github.com>
2024-01-05 08:45:07 -06:00
Luiz Aoqui
e0cea41e37 client: deprecate loading plugins without config (#19189)
Nomad load all plugins from `plugin_dir` regardless if it is listed in
the agent configuration file. This can cause unexpected binaries to be
executed.

This commit begins the deprecation process of this behaviour. The Nomad
agent will emit a warning log for every plugin binary found without a
corresponding agent configuration block.

---------

Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2023-11-27 21:36:42 -05:00
Tim Gross
9d075c44b2 config: remove old Vault/Consul config blocks from parser (#18997)
Remove the now-unused original configuration blocks for Consul and Vault from
the agent configuration parsing. When the agent needs to refer to a Consul or
Vault block it will always be for a specific cluster for the task/service (or
the default cluster for the agent's own use).

This is third of three changesets for this work.

Fixes: https://github.com/hashicorp/nomad/issues/18947
Ref: https://github.com/hashicorp/nomad/pull/18991
Ref: https://github.com/hashicorp/nomad/pull/18994
2023-11-08 09:30:08 -05:00
Kerim Satirli
5e1bbf90fc docs: update all URLs to developer.hashicorp.com (#16247) 2023-10-24 11:00:11 -04:00
Seth Hoenig
e3c8700ded deps: upgrade to go-set/v2 (#18638)
No functional changes, just cleaning up deprecated usages that are
removed in v2 and replace one call of .Slice with .ForEach to avoid
making the intermediate copy.
2023-10-05 11:56:17 -05:00
Tim Gross
5bd8b89c19 helper: reduce size of buffer used by template connections (#18524)
In #12458 we added an in-memory connection buffer so that template runners that
want access to the Nomad API for Service Registration and Variables can
communicate with Nomad without having to create a real HTTP client. The size of
this buffer (1 MiB) was taken directly from its usage in Vault, and each
connection makes 2 such buffers (send and receive). Because each template runner
has its own connection, when there are large numbers of allocations this adds up
to significant memory usage.

The largest Nomad Variable payload is 64KiB, and a small amount of
metadata. Service Registration responses are much smaller, and we don't include
check results in them (as Consul does), so the size is relatively bounded. We
should be able to safely reduce the size of the buffer by a factor of 10 or more
without forcing the template runner to make multiple read calls over the buffer.

Fixes: #18508
2023-09-18 09:12:09 -04:00
Tim Gross
756a22f4d5 lint: fix a missing gofmt -s (#18480) 2023-09-13 10:22:19 -04:00
Shantanu Gadgil
12580c345a bubble up the error message from go-getter (#18444) 2023-09-13 09:36:39 -04:00
Michael Schurter
ef24e40b39 identity: support jwt expiration and rotation (#18262)
Implements expirations and renewals for alternate workload identity tokens.
2023-09-08 14:50:34 -07:00
James Rasell
776a26bce7 raft: remove use of deprecated Leader func. (#18352)
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2023-09-01 10:01:34 +01:00
Seth Hoenig
f5b0da1d55 all: swap exp packages for maps, slices (#18311) 2023-08-23 15:42:13 -05:00
James Rasell
6108f5c4c3 admin: rename _oss files to _ce (#18209) 2023-08-18 07:47:24 +01:00
Tim Gross
f00bff09f1 fix multiple overflow errors in exponential backoff (#18200)
We use capped exponential backoff in several places in the code when handling
failures. The code we've copy-and-pasted all over has a check to see if the
backoff is greater than the limit, but this check happens after the bitshift and
we always increment the number of attempts. This causes an overflow with a
fairly small number of failures (ex. at one place I tested it occurs after only
24 iterations), resulting in a negative backoff which then never recovers. The
backoff becomes a tight loop consuming resources and/or DoS'ing a Nomad RPC
handler or an external API such as Vault. Note this doesn't occur in places
where we cap the number of iterations so the loop breaks (usually to return an
error), so long as the number of iterations is reasonable.

Introduce a helper with a check on the cap before the bitshift to avoid overflow in all 
places this can occur.

Fixes: #18199
Co-authored-by: stswidwinski <stan.swidwinski@gmail.com>
2023-08-15 14:38:18 -04:00
Michael Schurter
0e22fc1a0b identity: add support for multiple identities + audiences (#18123)
Allows for multiple `identity{}` blocks for tasks along with user-specified audiences. This is a building block to allow workload identities to be used with Consul, Vault and 3rd party JWT based auth methods.

Expiration is still unimplemented and is necessary for JWTs to be used securely, so that's up next.

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2023-08-15 09:11:53 -07:00
Seth Hoenig
d9341f0664 update go1.21 (#18184)
* build: update to go1.21

* go: eliminate helpers in favor of min/max

* build: run go mod tidy

* build: swap depguard for semgrep

* command: fixup broken tls error check on go1.21
2023-08-14 08:43:27 -05:00
hashicorp-copywrite[bot]
2d35e32ec9 Update copyright file headers to BUSL-1.1 2023-08-10 17:27:15 -05:00
Seth Hoenig
a4cc76bd3e numa: enable numa topology detection (#18146)
* client: refactor cgroups management in client

* client: fingerprint numa topology

* client: plumb numa and cgroups changes to drivers

* client: cleanup task resource accounting

* client: numa client and config plumbing

* lib: add a stack implementation

* tools: remove ec2info tool

* plugins: fixup testing for cgroups / numa changes

* build: update makefile and package tests and cl
2023-08-10 17:05:30 -05:00
Michael Schurter
d14362ec19 core: add jwks rpc and http api (#18035)
Add JWKS endpoint to HTTP API for exposing the root public signing keys used for signing workload identity JWTs.

Part 1 of N components as part of making workload identities consumable by third party services such as Consul and Vault. Identity attenuation (audience) and expiration (+renewal) are necessary to securely use workload identities with 3rd parties, so this merge does not yet document this endpoint.

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2023-07-27 11:27:17 -07:00
Lance Haig
03cde51720 Rename Function to reflect correct outcome. (#17948) 2023-07-24 10:43:51 +01:00
Patric Stout
e190eae395 Use config "cpu_total_compute" (if set) for all CPU statistics (#17628)
Before this commit, it was only used for fingerprinting, but not
for CPU stats on nodes or tasks. This meant that if the
auto-detection failed, setting the cpu_total_compute didn't resolved
the issue.

This issue was most noticeable on ARM64, as there auto-detection
always failed.
2023-07-19 13:30:47 -05:00
Lance Haig
1541358ef3 Add the ability to customise the details of the CA (#17309)
Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-07-11 08:53:09 +01:00
Tim Gross
0ba7d0036b CSI: persist previous mounts on client to restore during restart (#17840)
When claiming a CSI volume, we need to ensure the CSI node plugin is running
before we send any CSI RPCs. This extends even to the controller publish RPC
because it requires the storage provider's "external node ID" for the
client. This primarily impacts client restarts but also is a problem if the node
plugin exits (and fingerprints) while the allocation that needs a CSI volume
claim is being placed.

Unfortunately there's no mapping of volume to plugin ID available in the
jobspec, so we don't have enough information to wait on plugins until we either
get the volume from the server or retrieve the plugin ID from data we've
persisted on the client.

If we always require getting the volume from the server before making the claim,
a client restart for disconnected clients will cause all the allocations that
need CSI volumes to fail. Even while connected, checking in with the server to
verify the volume's plugin before trying to make a claim RPC is inherently racy,
so we'll leave that case as-is and it will fail the claim if the node plugin
needed to support a newly-placed allocation is flapping such that the node
fingerprint is changing.

This changeset persists a minimum subset of data about the volume and its plugin
in the client state DB, and retrieves that data during the CSI hook's prerun to
avoid re-claiming and remounting the volume unnecessarily.

This changeset also updates the RPC handler to use the external node ID from the
claim whenever it is available.

Fixes: #13028
2023-07-10 13:20:15 -04:00
Luiz Aoqui
f4c7182873 node pools: apply node pool scheduler configuration (#17598) 2023-06-21 20:31:50 -04:00
Lance Haig
7e93f150b5 cli: tls certs not created with correct SANs (#16959)
The `nomad tls cert` command did not create certificates with the correct SANs for
them to work with non default domain and region names. This changset updates the
code to support non default domains and regions in the certificates.
2023-05-22 09:31:56 -04:00
Michael Schurter
fea29c57a4 remove unused helper/fields package (#17197)
Has been unused since we switched task drivers to plugins and used
hclschema for config in #4936 (v0.9.0-beta1)
2023-05-15 12:10:11 -07:00
Luiz Aoqui
1e679ccba2 node pool: initial base work (#17163)
Implementation of the base work for the new node pools feature. It includes a new `NodePool` struct and its corresponding state store table.

Upon start the state store is populated with two built-in node pools that cannot be modified nor deleted:

  * `all` is a node pool that always includes all nodes in the cluster.
  * `default` is the node pool where nodes that don't specify a node pool in their configuration are placed.
2023-05-15 10:49:08 -04:00
James Rasell
f7a8cf55f8 test: fix flakey broker notifier test. (#16994) 2023-05-10 13:40:25 +01:00
Seth Hoenig
c89f6a538e connect: remove unusable path for fallback envoy image names (#17044)
This PR does some cleanup of an old code path for versions of Consul that
did not support reporting the supported versions of Envoy in its API. Those
versions are no longer supported for years at this point, and the fallback
version of envoy hasn't been supported by any version of Consul for almost
as long. Remove this code path that is no longer useful.
2023-05-02 09:48:44 -05:00
Seth Hoenig
7744caed48 connect: use explicit docker.io prefix in default envoy image names (#17045)
This PR modifies references to the envoyproxy/envoy docker image to
explicitly include the docker.io prefix. This does not affect existing
users, but makes things easier for Podman users, who otherwise need to
specify the full name because Podman does not default to docker.io
2023-05-02 09:27:48 -05:00
Luiz Aoqui
ee5a08dbb2 Revert "hashicorp/go-msgpack v2 (#16810)" (#17047)
This reverts commit 8a98520d56.
2023-05-01 17:18:34 -04:00
Ian Fijolek
8a98520d56 hashicorp/go-msgpack v2 (#16810)
* Upgrade from hashicorp/go-msgpack v1.1.5 to v2.1.0

Fixes #16808

* Update hashicorp/net-rpc-msgpackrpc to v2 to match go-msgpack

* deps: use go-msgpack v2.0.0

go-msgpack v2.1.0 includes some code changes that we will need to
investigate furthere to assess its impact on Nomad, so keeping this
dependency on v2.0.0 for now since it's no-op.

---------

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2023-04-17 17:02:05 -04:00