When accessing a region running a version of Nomad without node pools, an
error was thrown because the request was handled by the nodes endpoint, which
fails because it assumes `pools` is a node ID.
When a request is made to an RPC service that doesn't exist (for example, a
cross-region request from a newer version of Nomad to an older version that
doesn't implement the endpoint), the application should return an empty list
as well.
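As a rough sketch of that behaviour (the helper names and error-string matching below are assumptions for illustration, not Nomad's actual code), the forwarding path can translate an "endpoint not found" RPC failure into an empty result:

```go
package api

import "strings"

// isUnknownRPC reports whether an RPC error looks like the remote region
// simply does not implement the requested endpoint (for example, an older
// Nomad version). Matching on the error string is an assumption made for
// this illustration, not Nomad's actual helper.
func isUnknownRPC(err error) bool {
	if err == nil {
		return false
	}
	msg := err.Error()
	return strings.Contains(msg, "rpc: can't find service") ||
		strings.Contains(msg, "rpc: can't find method")
}

// listNodePools returns an empty list instead of an error when the remote
// region predates the node pools endpoint.
func listNodePools(call func() ([]string, error)) ([]string, error) {
	pools, err := call()
	if err != nil {
		if isUnknownRPC(err) {
			return []string{}, nil
		}
		return nil, err
	}
	return pools, nil
}
```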
The upgrade path to Nomad 1.6.0 requires canonicalizing the namespace in
order to set the default scheduler configuration values.
The previous implementation only canonicalized on namespace upsert
operations, which works for recently created namespaces since those Raft
transactions are reapplied on upgrade. But for older namespaces restored from
a snapshot, the code path did not canonicalize them, leaving the scheduler
configuration set to `nil`.
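A minimal sketch of the shape of the fix, using simplified stand-in types rather than Nomad's actual structs: canonicalization has to run on the snapshot-restore path as well, so namespaces that only exist in an old snapshot also get their defaults filled in.

```go
package state

// SchedulerConfiguration and Namespace are simplified stand-ins for the real
// structs; the field names and the "binpack" default are assumptions for
// illustration only.
type SchedulerConfiguration struct {
	SchedulerAlgorithm string
}

type Namespace struct {
	Name            string
	SchedulerConfig *SchedulerConfiguration
}

// Canonicalize fills in defaults for fields that older snapshots leave unset.
func (n *Namespace) Canonicalize() {
	if n.SchedulerConfig == nil {
		n.SchedulerConfig = &SchedulerConfiguration{
			SchedulerAlgorithm: "binpack", // hypothetical default
		}
	}
}

// restoreNamespace mimics the snapshot-restore path: calling Canonicalize
// here, and not only on upsert, is what closes the upgrade gap described
// above for namespaces that only exist in an old snapshot.
func restoreNamespace(ns *Namespace) *Namespace {
	ns.Canonicalize()
	return ns
}
```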
The CSI specification says that we "SHOULD" send no more than one in-flight
request per *volume* at a time, with an allowance for losing state
(e.g. leadership transitions) which the plugins "SHOULD" handle gracefully. We
mostly successfully serialize node and controller RPCs for the same volume,
except when Nomad clients are lost. (See also
https://github.com/container-storage-interface/spec/issues/512)
These concurrency requirements in the spec fall short because Storage Provider
APIs aren't necessarily safe to call concurrently on the same host even for
_different_ volumes. For example, concurrently attaching AWS EBS volumes to an
EC2 instance results in a race for device names, which results in failure to
attach (because the device name is taken already and the API call fails) and
confused results when releasing claims. So in practice many CSI plugins rely on
k8s-specific sidecars for serializing storage provider API calls globally. As a
result, we have to be much more conservative about concurrency in Nomad than the
spec allows.
This changeset includes four major changes to fix this:
* Add a serializer method to the CSI volume RPC handler. When the RPC handler
makes a destructive CSI Controller RPC, we send the RPC through this serializer
so that only one RPC is sent at a time; any other RPCs in flight will block
(see the sketch after this list).
* Ensure that requests go to the same controller plugin instance whenever
possible by sorting the plugin instances and picking the lowest client ID.
* Ensure that requests go to _healthy_ plugin instances only.
* Ensure that requests for controllers can go to a controller on any _live_
node, not just ones eligible for scheduling (which CSI controllers don't care
about)
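A minimal sketch of the serializer idea from the first bullet, using a hypothetical type and method name rather than Nomad's actual RPC handler: destructive controller RPCs funnel through a single lock so only one is in flight at a time and the rest block until it completes.

```go
package volumes

import (
	"context"
	"sync"
)

// controllerSerializer is an illustrative stand-in for the serializer
// described above; the real RPC handler in Nomad differs, this only
// demonstrates the idea.
type controllerSerializer struct {
	mu sync.Mutex
}

// serializedControllerRPC runs fn while holding the lock, so destructive
// controller RPCs are sent one at a time and any other callers block until
// the in-flight RPC completes.
func (s *controllerSerializer) serializedControllerRPC(ctx context.Context, fn func(context.Context) error) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	// Honor cancellation that happened while we were waiting for the lock.
	if err := ctx.Err(); err != nil {
		return err
	}
	return fn(ctx)
}
```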
Fixes: #15415
* Boot the user off the job if it gets deleted
* de-yoink
* watching the job watcher
* Unload record so history.back has to refire a (failing) request
* Acceptance tests for boot-out and notification
* e2e: add tests for using private registry with podman driver
This PR adds e2e tests that stand up a private Docker registry and have a
podman task run a container from an image in that private registry.
Tests
- user:password set in task config
- auth_soft_fail works for public images when auth is set in driver
- credentials helper is set in driver auth config
- config auth.json file is set in driver auth config
* packer: use nomad-driver-podman v0.5.0
* e2e: eliminate unnecessary chmod
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
* cr: no need to install nomad twice
* cl: no need to install docker twice
---------
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
Before this commit, the `cpu_total_compute` option was only used for
fingerprinting, but not for CPU stats on nodes or tasks. This meant that if
auto-detection failed, setting `cpu_total_compute` didn't resolve the issue.
This was most noticeable on ARM64, where auto-detection always failed.
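A minimal sketch of the intended behaviour, with hypothetical names standing in for Nomad's real fingerprinting and stats code: the configured `cpu_total_compute` override should win everywhere total compute is needed, not only during fingerprinting.

```go
package main

import "fmt"

// clientConfig is a hypothetical stand-in for the client configuration;
// CPUTotalCompute mirrors the cpu_total_compute option described above.
type clientConfig struct {
	CPUTotalCompute uint64
}

// detectTotalCompute stands in for auto-detection, which can fail
// (returning 0) on some platforms such as ARM64.
func detectTotalCompute() uint64 {
	return 0 // pretend detection failed
}

// totalCompute is used for both fingerprinting and node/task CPU stats,
// so the configured override takes effect in all of those places.
func totalCompute(cfg *clientConfig) uint64 {
	if cfg.CPUTotalCompute > 0 {
		return cfg.CPUTotalCompute
	}
	return detectTotalCompute()
}

func main() {
	cfg := &clientConfig{CPUTotalCompute: 4000}
	fmt.Println(totalCompute(cfg)) // 4000, even though detection failed
}
```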
ACL permissions for the search endpoints are done in three passes. The
first (the `sufficientSearchPerms` method) is for performance and coarsely
rejects requests based on the passed-in context parameter if the user has no
permissions to any object in that context. The second (the
`filteredSearchContexts` method) filters out contexts based on whether the user
has permissions either to the requested namespace or again by context (to catch
the "all" context). Finally, when iterating over the objects available, we do
the usual filtering in the iterator.
Internal testing found several bugs in this filtering:
* CSI plugins can be searched by any authenticated user.
* Variables can be searched if the user has `job:read` permissions to the
variable's namespace instead of `variable:list`.
* Variables cannot be searched by wildcard namespace.
This is an information leak of the plugin names and variable paths, which we
don't consider privileged information but intend to protect anyway.
This changeset fixes these bugs by ensuring CSI plugins are filtered in the
first- and second-pass ACL filters, and changes variables to check
`variable:list` in the second-pass filter unless the wildcard namespace is
passed (at which point we fall back to filtering in the iterator).
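A simplified sketch of the second-pass filter described above, with hypothetical helper names rather than Nomad's actual ACL API: variables require `variable:list` in the requested namespace unless the wildcard namespace is requested, in which case the decision is deferred to the per-object iterator.

```go
package search

// aclChecker is a hypothetical interface standing in for Nomad's ACL object;
// the method names here are illustrative only.
type aclChecker interface {
	AllowVariableList(namespace string) bool
	AllowPluginRead() bool
}

// filterSearchContexts keeps only the contexts the caller is allowed to
// search in the requested namespace. The wildcard namespace defers variable
// filtering to the per-object iterator, as described above.
func filterSearchContexts(acl aclChecker, namespace string, contexts []string) []string {
	allowed := make([]string, 0, len(contexts))
	for _, ctx := range contexts {
		switch ctx {
		case "variables":
			if namespace == "*" || acl.AllowVariableList(namespace) {
				allowed = append(allowed, ctx)
			}
		case "plugins":
			// CSI plugins must be filtered here too, not exposed to every
			// authenticated user.
			if acl.AllowPluginRead() {
				allowed = append(allowed, ctx)
			}
		default:
			allowed = append(allowed, ctx)
		}
	}
	return allowed
}
```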
Fixes: CVE-2023-3300
Fixes: #17906
An ACL policy with a block that has no label generates unexpected results.
For example, a policy such as this:
```
namespace {
  policy = "read"
}
```
is applied to a namespace called `policy` instead of the documented
behaviour of applying it to the `default` namespace.
This happens because of the way HCL1 decodes blocks. Since it doesn't know
whether a block is expected to have a label, it applies the `key` tag to the
content of the block; in the example above, the first key is `policy`, so it
sets that as the `namespace` block label.
Since this happens internally in the HCL decoder, it's not possible to detect
the problem externally.
Fixing the problem inside the decoder is challenging because the JSON and HCL
parsers generate different ASTs, which makes it impossible to differentiate a
JSON tree from an invalid HCL tree within the decoder.
The fix in this commit consists of manually parsing the policy after decoding
to clear labels that were not set in the file. This allows the validation
rules to consistently catch and return any errors, regardless of whether the
invalid policy is written in HCL or JSON.
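A rough sketch of that approach, simplified from the real change and using hypothetical type names: after decoding, re-parse the source with the HCL1 parser and clear any label the decoder filled in for a block that has no label in the source.

```go
package acl

import (
	"github.com/hashicorp/hcl"
	"github.com/hashicorp/hcl/hcl/ast"
)

// NamespacePolicy is a hypothetical stand-in for the decoded policy block.
type NamespacePolicy struct {
	Name   string `hcl:",key"`
	Policy string
}

// clearUnsetLabels re-parses the raw policy and blanks out any namespace
// label the decoder invented from the block contents rather than from an
// actual label in the source, so later validation can reject the policy.
// This only sketches the HCL side; the real fix also has to cope with the
// different AST produced for JSON input.
func clearUnsetLabels(raw string, namespaces []*NamespacePolicy) error {
	root, err := hcl.Parse(raw)
	if err != nil {
		return err
	}
	list, ok := root.Node.(*ast.ObjectList)
	if !ok {
		return nil
	}
	// Filter strips the matched "namespace" key, so a labelled block keeps
	// exactly one key (the label) and an unlabelled block keeps none.
	for i, item := range list.Filter("namespace").Items {
		if len(item.Keys) == 0 && i < len(namespaces) {
			namespaces[i].Name = ""
		}
	}
	return nil
}
```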
This adds a quick smoke test of our binaries to verify we haven't exceeded
the maximum GLIBC version (2.17) during linking, which would break our ability
to execute on EL7 machines.
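As a hedged sketch of what such a smoke test can look like (the real check likely lives in a CI script; this standalone Go version is only illustrative), scan the dynamic symbol table for GLIBC symbol versions newer than 2.17:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"regexp"
	"strconv"
)

// maxMinor is the newest GLIBC 2.x symbol version we allow; EL7 ships 2.17.
const maxMinor = 17

// checkBinary shells out to objdump and fails if the binary references any
// GLIBC_2.x symbol version newer than 2.17. It is a sketch of the kind of
// smoke test described above, not the script actually used in CI.
func checkBinary(path string) error {
	out, err := exec.Command("objdump", "-T", path).CombinedOutput()
	if err != nil {
		return fmt.Errorf("objdump failed: %w", err)
	}
	re := regexp.MustCompile(`GLIBC_2\.(\d+)`)
	for _, m := range re.FindAllSubmatch(out, -1) {
		minor, _ := strconv.Atoi(string(m[1]))
		if minor > maxMinor {
			return fmt.Errorf("%s requires GLIBC_2.%d, newer than 2.%d", path, minor, maxMinor)
		}
	}
	return nil
}

func main() {
	if err := checkBinary(os.Args[1]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```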
We cannot set a user for raw_exec tasks, because doing so does not work with
the 0700 root-owned client data directory that we set up in the e2e cluster in
accordance with the Nomad hardening guide.