On Windows, Nomad uses `syscall.NewLazyDLL` and `syscall.LoadDLL` functions to
load a few system DLL files, which does not prevent DLL hijacking
attacks. Hypothetically, a local attacker on the client host who can place a
malicious library in a specific location could use this to escalate privileges
to those of the Nomad process. Although this attack does not fall within the
Nomad security model, it doesn't hurt to follow good practices here.
We can remove two of these DLL loads by using wrapper functions provided by
`golang.org/x/sys/windows`.
Co-authored-by: dduzgun-security <deniz.duzgun@hashicorp.com>
Support for fingerprinting the Consul admin partition was added in #19485. But
when the client fingerprints Consul CE, it gets a valid fingerprint and a
working Consul, but also emits a warn-level log. Return "ok" from the partition
extractor, but also ensure that we only add the Consul attribute if it actually
has a value.
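
A minimal sketch of that behavior, assuming a hypothetical extractor
`partitionOf`, a hypothetical helper `setAttr`, and an assumed attribute key
`consul.partition`. None of these names are taken from the Nomad fingerprinter
itself.

```go
package main

import "fmt"

// partitionOf is a hypothetical extractor. It reports ok=true even when the
// agent has no partition (as with Consul CE), so the caller has no reason to
// log a warning.
func partitionOf(agentConfig map[string]string) (string, bool) {
	return agentConfig["Partition"], true
}

// setAttr is a hypothetical helper that adds the attribute only when the
// extractor succeeded and returned a non-empty value.
func setAttr(attrs map[string]string, key, value string, ok bool) {
	if ok && value != "" {
		attrs[key] = value
	}
}

func main() {
	attrs := map[string]string{}

	// Consul CE: the agent config carries no partition.
	value, ok := partitionOf(map[string]string{})
	setAttr(attrs, "consul.partition", value, ok)

	fmt.Println(len(attrs)) // no attribute was added
}
```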
Fixes: https://github.com/hashicorp/nomad/issues/19756
When Nomad is configured with `verify_https_client=false`, endpoints that
do not require an ACL token can be accessed without any other type of
authentication. Expand the docs to mention this effect.
The Nomad client renders templates in the same privileged process used for most
other client operations. During internal testing, we discovered that a malicious
task can create a symlink that can cause template rendering to read and write to
arbitrary files outside the allocation sandbox. Because the Nomad agent can be
restarted without restarting tasks, we can't simply check that the path is safe
at the time we write without encountering a time-of-check/time-of-use race.
To protect Nomad client hosts from this attack, we'll now read and write
templates in a subprocess:
* On Linux/Unix, this subprocess is sandboxed via chroot to the allocation
directory. This requires that Nomad is running as a privileged process. A
non-root Nomad agent will warn that it cannot sandbox the template renderer.
* On Windows, this process is sandboxed via a Windows AppContainer which has
been granted access only to the allocation directory. This does not require
special privileges on Windows. (Creating symlinks in the first place can be
prevented by running workloads as non-Administrator or
non-ContainerAdministrator users.)
Both sandboxes cause encountered symlinks to be evaluated in the context of the
sandbox, which will result in a "file not found" or "access denied" error,
depending on the platform. This change will also require an update to
Consul-Template to allow callers to inject a custom `ReaderFunc` and
`RenderFunc`.
This design is intended as a workaround to allow us to fix this bug without
creating backwards compatibility issues for running tasks. A future version of
Nomad may introduce a read-only mount specifically for templates and artifacts
so that tasks cannot write into the same location that the Nomad agent writes to.
Fixes: https://github.com/hashicorp/nomad/issues/19888
Fixes: CVE-2024-1329
During allocation directory migration, the client did not check that symlinks
in the archive point only to locations inside the allocation directory. While
task driver sandboxing prevents processes inside the task from reading or
writing through the symlink, it does not prevent the client itself from
performing unintended operations outside the sandbox.
This changeset includes two changes:
* Update the archive unpacking to check the source of symlinks and require that
they fall within the sandbox.
* Fix a bug in the symlink check where it was using `filepath.Rel`, which doesn't
work for paths in sibling directories of the sandbox directory. This bug
doesn't appear to be exploitable but caused errors in testing.
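
As an illustration of the first change, here is a minimal sketch of a sandbox
check for a symlink found in an archive. It uses a purely lexical prefix
comparison on cleaned paths, which is a stand-in technique chosen for this
example and not necessarily what the client does. The function name
`linkEscapes` and all paths are hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// linkEscapes is a hypothetical helper. It reports whether a symlink located
// at linkPath and pointing at target would resolve outside sandboxDir. The
// check is lexical only: it does not follow symlinks that already exist on
// disk.
func linkEscapes(sandboxDir, linkPath, target string) bool {
	// Relative targets are resolved against the directory holding the link.
	if !filepath.IsAbs(target) {
		target = filepath.Join(filepath.Dir(linkPath), target)
	}
	root := filepath.Clean(sandboxDir)
	resolved := filepath.Clean(target)
	if resolved == root {
		return false
	}
	// Compare against root plus a separator so that a sibling directory
	// sharing the same name prefix is not treated as being inside.
	prefix := root
	if !strings.HasSuffix(prefix, string(filepath.Separator)) {
		prefix += string(filepath.Separator)
	}
	return !strings.HasPrefix(resolved, prefix)
}

func main() {
	sandbox := "/alloc/abc"
	link := "/alloc/abc/task/link"

	fmt.Println(linkEscapes(sandbox, link, "../local/file"))    // stays inside
	fmt.Println(linkEscapes(sandbox, link, "../../def/secret")) // sibling directory
	fmt.Println(linkEscapes(sandbox, link, "/etc/passwd"))      // absolute, outside
}
```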
Fixes: https://github.com/hashicorp/nomad/issues/19887
When filtering list results, the filter expression is applied to the
full object, not the stub. This is useful because it allows users to
filter the list on fields not present in the object stub. But it can
also be confusing because some fields have different names, or only
exist in the stub, so the filter expression needs to reference fields
not present in returned data.
Filtering on the stub would reduce the confusion, but it would also
restrict users to only be able to filter on the fields in the stub,
which, by definition, are just a subset of the original fields.
Documenting this behaviour can help users understand unexpected errors
and results.
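
A minimal sketch of the behavior being documented, assuming hypothetical `job`
and `jobStub` types and a hypothetical `listJobs` function. The filter runs
against the full object, so it can reference a field (`Datacenters` here) that
the returned stub does not carry.

```go
package main

import "fmt"

// job is a hypothetical full object; jobStub is the hypothetical subset
// returned by list endpoints.
type job struct {
	ID          string
	Priority    int
	Datacenters []string
}

type jobStub struct {
	ID       string
	Priority int
}

// listJobs applies the filter to the full object and only then converts each
// match to its stub.
func listJobs(jobs []job, filter func(job) bool) []jobStub {
	var out []jobStub
	for _, j := range jobs {
		if filter(j) {
			out = append(out, jobStub{ID: j.ID, Priority: j.Priority})
		}
	}
	return out
}

// inDatacenter builds a filter on a field that is absent from jobStub.
func inDatacenter(dc string) func(job) bool {
	return func(j job) bool {
		for _, d := range j.Datacenters {
			if d == dc {
				return true
			}
		}
		return false
	}
}

func main() {
	jobs := []job{
		{ID: "web", Priority: 50, Datacenters: []string{"dc1"}},
		{ID: "batch", Priority: 20, Datacenters: []string{"dc2"}},
	}
	fmt.Println(listJobs(jobs, inDatacenter("dc1")))
}
```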
The current implementation of the `nomad tls ca create` command
overrides the value of the `-domain` flag with `"nomad"` if no
additional customization is provided.
This results in a certificate for the wrong domain or an error if the
`-name-constraint` flag is also used.
The logic for `IsCustom()` also seemed reversed. If all custom fields
are empty then the certificate is _not_ customized, so `IsCustom()`
should return false.
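
A minimal sketch of the intended `IsCustom()` semantics. The `caOptions` type
and its field names are hypothetical stand-ins and are not the command's real
option struct.

```go
package main

import "fmt"

// caOptions is a hypothetical stand-in for the customization fields accepted
// by the command.
type caOptions struct {
	CommonName   string
	Country      string
	Organization string
}

// IsCustom reports whether at least one customization field was set. With
// every field empty the certificate is not customized, so it returns false.
func (o caOptions) IsCustom() bool {
	return o.CommonName != "" || o.Country != "" || o.Organization != ""
}

func main() {
	fmt.Println(caOptions{}.IsCustom())
	fmt.Println(caOptions{Organization: "Example"}.IsCustom())
}
```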
* HashiCorp Design System upgraded to 3.6.0
* Fresh yarn
* Responses out of range are brought back within
* General pass at a11y fixes with updated components and node
* Further tooltip updates
* 3 more partitions worth of toggle and tooltip updates
* scale-events-accordion and topo-viz node fixes
When a job eval is blocked due to missing capacity, the `nomad job run`
command will monitor the deployment, which may succeed once additional
capacity is made available.
But the current implementation would exit with code `2` even when the
deployment succeeded because it only took the first eval status into account.
This commit updates the eval monitoring logic to reset the scheduling
error state if the deployment eventually succeeds.
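
A minimal sketch of the reset logic, assuming a hypothetical `exitCode`
function and assuming the deployment status is reported as the string
`"successful"`. The real monitor tracks more state than this.

```go
package main

import "fmt"

// exitCode is a hypothetical reduction of the monitor's final decision.
// schedulingError records that an earlier eval was blocked; a deployment that
// later succeeds clears it.
func exitCode(schedulingError bool, deploymentStatus string) int {
	if schedulingError && deploymentStatus == "successful" {
		schedulingError = false
	}
	if schedulingError {
		return 2
	}
	return 0
}

func main() {
	// First eval was blocked, but capacity arrived and the deployment passed.
	fmt.Println(exitCode(true, "successful"))
	// First eval was blocked and the deployment never succeeded.
	fmt.Println(exitCode(true, "failed"))
}
```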
Add a new configuration option on a task's volume mounts to give fine-grained control over the SELinux "z" label.
* Update website/content/docs/job-specification/volume_mount.mdx
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
* fix: typo
* func: make volume mount verification happen even on mounts with no volume
---------
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Although Nomad itself is not vulnerable to CVE-2024-21626, we want to update
dependencies that bring in the vulnerable packages so as not to trip
vulnerability scanners. Update `containerd` and `go-dockerclient` as well as the
various transitive dependencies these bring in.
We don't run the whole suite of unit tests on all platforms to keep CI times
reasonable, so the only things we've been running on Windows are
platform-specific.
I'm working on some platform-specific `template` related work and having these
tests run on Windows will reduce the risk of regressions. Our Windows CI box
doesn't have Consul or Vault, so I've skipped those tests for the time being,
and can follow up with that later. There's also a test with assertions looking
for specific paths, and the results are different on Windows. I've skipped those
for the moment as well and will follow up under a separate PR.
Also swap `testify` for `shoenig/test`.
This PR refactors a helper function for getting the UID associated with
a given username to also return the GID and home directory. Also adds
unit tests on the known values of root and nobody user on Ubuntu Linux.
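
A minimal sketch of such a helper using the standard `os/user` package. The
name `lookupUser` is hypothetical, and the numeric parsing assumes a Unix-like
system where `Uid` and `Gid` are decimal strings.

```go
package main

import (
	"fmt"
	"os/user"
	"strconv"
)

// lookupUser is a hypothetical helper that returns the UID, GID, and home
// directory for a username. It assumes numeric IDs, which holds on Unix-like
// systems but not on Windows.
func lookupUser(username string) (uid, gid int, home string, err error) {
	u, err := user.Lookup(username)
	if err != nil {
		return 0, 0, "", err
	}
	uid, err = strconv.Atoi(u.Uid)
	if err != nil {
		return 0, 0, "", fmt.Errorf("parse uid %q: %w", u.Uid, err)
	}
	gid, err = strconv.Atoi(u.Gid)
	if err != nil {
		return 0, 0, "", fmt.Errorf("parse gid %q: %w", u.Gid, err)
	}
	return uid, gid, u.HomeDir, nil
}

func main() {
	uid, gid, home, err := lookupUser("root")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(uid, gid, home)
}
```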
Some packages licensed under MPL-2.0 were incorrectly importing code
from packages licensed under BUSL-1.1.
Not all imports are fixed here as they will require additional work to
untangle them. To help track progress this commit adds a Semgrep rule
that detects incorrect BUSL-1.1 imports in MPL-2.0 packages.
Fixes #19781
Do not mark the envoy bootstrap hook as done after successfully running once.
Since the bootstrap file is written to /secrets, which is a tmpfs on supported
platforms, it is not persisted across reboots. This causes the task and
allocation to fail on reboot (see #19781).
This fixes it by *always* rewriting the envoy bootstrap file every time the
Nomad agent starts. This does mean we may write a new bootstrap file to an
already running Envoy task, but in my testing that doesn't have any impact.
This commit doesn't necessarily fix every use of Done by hooks, but hopefully
improves the situation. The comment on Done has been expanded to hopefully
avoid misuse in the future.
Done assertions were removed from tests as they add more noise than value.
*Alternative 1: Use a regular file*
An alternative approach would be to write the bootstrap file somewhere
other than the tmpfs, but this is *unsafe* as when Consul ACLs are
enabled the file will contain a secret token:
https://developer.hashicorp.com/consul/commands/connect/envoy#bootstrap
*Alternative 2: Detect if file is already written*
An alternative approach would be to detect if the bootstrap file exists,
and only write it if it doesn't.
This is just a more complicated form of the current fix. I think in
general in the absence of other factors task hooks should be idempotent
and therefore able to rerun on any agent startup. This simplifies the
code and our ability to reason about task restarts vs agent restarts vs
node reboots by making them all take the same code path.
Script checks don't support Consul's `success_before_passing`, `failures_before_critical`, or `failures_before_warning` because they're run by Nomad and not by Consul.
Adds Namespace UI to Access Control - Also adds two step buttons to other Access Control pages
---------
Co-authored-by: Phil Renaud <phil@riotindustries.com>