
25594 Commits

Author SHA1 Message Date
Tim Gross
110d93ab25 windows: remove LazyDLL calls for system modules (#19925)
On Windows, Nomad uses `syscall.NewLazyDLL` and `syscall.LoadDLL` functions to
load a few system DLL files, which does not prevent DLL hijacking
attacks. Hypothetically a local attacker on the client host that can place an
abusive library in a specific location could use this to escalate privileges to
the Nomad process. Although this attack does not fall within the Nomad security
model, it doesn't hurt to follow good practices here.

We can remove two of these DLL loads by using wrapper functions provided by the
`golang.org/x/sys/windows` package.

Co-authored-by: dduzgun-security <deniz.duzgun@hashicorp.com>
2024-02-09 08:47:48 -05:00
Tim Gross
62c57d208b fingerprint: eliminate spurious warning logs with Consul CE (#19923)
Support for fingerprinting the Consul admin partition was added in #19485. But
when the client fingerprints Consul CE, it gets a valid fingerprint and working
Consul but with a warn-level log. Return "ok" from the partition extractor, but
also ensure that we only add the Consul attribute if it actually has a value.

Fixes: https://github.com/hashicorp/nomad/issues/19756
2024-02-09 08:19:00 -05:00
Phil Renaud
81f868631f Fix vercel deployments EBADENGINE errors (#19914) 2024-02-08 14:14:57 -05:00
Phil Renaud
41c783aec2 Noting action name restrictions, and correcting those of auth methods and roles (#19905) 2024-02-08 12:01:22 -05:00
Tim Gross
fc26e0cb22 Post 1.7.4 release (#19918) 2024-02-08 10:58:50 -05:00
Luiz Aoqui
2a348ba714 docs: expand impact of verify_https_client=false (#19916)
When Nomad is configured with `verify_https_client=false`, endpoints that
do not require an ACL token can be accessed without any other type of
authentication. Expand the docs to mention this effect.
2024-02-08 10:55:40 -05:00
Tim Gross
2970690355 Merge release 1.7.4 files 2024-02-08 10:41:11 -05:00
hc-github-team-nomad-core
33f0a5b268 Prepare for next release 2024-02-08 10:40:24 -05:00
hc-github-team-nomad-core
875e96cccc Generate files for 1.7.4 release 2024-02-08 10:40:24 -05:00
Tim Gross
df86503349 template: sandbox template rendering
The Nomad client renders templates in the same privileged process used for most
other client operations. During internal testing, we discovered that a malicious
task can create a symlink that can cause template rendering to read and write to
arbitrary files outside the allocation sandbox. Because the Nomad agent can be
restarted without restarting tasks, we can't simply check that the path is safe
at the time we write without encountering a time-of-check/time-of-use race.

To protect Nomad client hosts from this attack, we'll now read and write
templates in a subprocess:

* On Linux/Unix, this subprocess is sandboxed via chroot to the allocation
  directory. This requires that Nomad is running as a privileged process. A
  non-root Nomad agent will warn that it cannot sandbox the template renderer.

* On Windows, this process is sandboxed via a Windows AppContainer which has
  been granted access only to the allocation directory. This does not require
  special privileges on Windows. (Creating symlinks in the first place can be
  prevented by running workloads as non-Administrator or
  non-ContainerAdministrator users.)

Both sandboxes cause encountered symlinks to be evaluated in the context of the
sandbox, which will result in a "file not found" or "access denied" error,
depending on the platform. This change will also require an update to
Consul-Template to allow callers to inject a custom `ReaderFunc` and
`RenderFunc`.

This design is intended as a workaround to allow us to fix this bug without
creating backwards compatibility issues for running tasks. A future version of
Nomad may introduce a read-only mount specifically for templates and artifacts
so that tasks cannot write into the same location that the Nomad agent is.

Fixes: https://github.com/hashicorp/nomad/issues/19888
Fixes: CVE-2024-1329
2024-02-08 10:40:24 -05:00
Tim Gross
0d3cd1427f migration: check symlink sources during archive unpack
During allocation directory migration, the client did not check whether
symlinks in the archive point outside the allocation directory. While task
driver sandboxing prevents processes inside the task from reading or writing
through the symlink, it does not prevent the client itself from performing
unintended operations outside the sandbox.

This changeset includes two changes:

* Update the archive unpacking to check the source of symlinks and require that
  they fall within the sandbox.
* Fix a bug in the symlink check where it was using `filepath.Rel` which doesn't
  work for paths in the sibling directories of the sandbox directory. This bug
  doesn't appear to be exploitable but caused errors in testing.
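
A minimal sketch of the kind of check described above, not the code from this
changeset: `resolveLinkTarget` and `withinSandbox` are hypothetical helpers,
and the check is purely lexical (it does not follow symlinks that already exist
on disk). Comparing against the sandbox path plus a trailing separator is one
way to keep sibling directories from being mistaken for the sandbox.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// resolveLinkTarget returns the cleaned path a symlink would point to.
// Relative targets are resolved against the directory that contains the link.
// Hypothetical helper for illustration only.
func resolveLinkTarget(linkPath, linkTarget string) string {
	if filepath.IsAbs(linkTarget) {
		return filepath.Clean(linkTarget)
	}
	return filepath.Join(filepath.Dir(linkPath), linkTarget)
}

// withinSandbox reports whether target is the sandbox directory itself or a
// path beneath it. Appending the separator before the prefix comparison keeps
// a sibling such as "/allocs/abc-evil" from matching "/allocs/abc".
func withinSandbox(sandbox, target string) bool {
	sandbox = filepath.Clean(sandbox)
	target = filepath.Clean(target)
	if target == sandbox {
		return true
	}
	return strings.HasPrefix(target, sandbox+string(filepath.Separator))
}

func main() {
	sandbox := "/allocs/abc"
	link := "/allocs/abc/task/link"
	for _, target := range []string{"../shared/data", "../../abc-evil/x", "/etc/passwd"} {
		resolved := resolveLinkTarget(link, target)
		fmt.Printf("%s -> %s inside=%v\n", target, resolved, withinSandbox(sandbox, resolved))
	}
}
```

An unpacker built this way would reject any archive entry whose resolved target
is reported as outside the sandbox before creating the link.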

Fixes: https://github.com/hashicorp/nomad/issues/19887
2024-02-08 10:40:24 -05:00
hc-github-team-nomad-core
c03c735c99 Backport of deps: update dependencies indirectly bringing in older runc into release/1.7.x #19866
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2024-02-08 10:40:24 -05:00
hc-github-team-nomad-core
af7cf79df7 Backport of chore(deps): bump github.com/opencontainers/runc from 1.1.10 to 1.1.12 into release/1.7.x #19862
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-08 10:40:24 -05:00
Luiz Aoqui
7391a59695 docs: add note about stub list filtering (#19902)
When filtering list results, the filter expression is applied to the
full object, not the stub. This is useful because it allows users to
filter the list on fields not present in the object stub. But it can
also be confusing because some fields have different names, or only
exist in the stub, so the filter expression needs to reference fields
not present in returned data.

Filtering on the stub would reduce the confusion, but it would also
restrict users to only be able to filter on the fields in the stub,
which, by definition, are just a subset of the original fields.

Documenting this behaviour can help users understand unexpected errors
and results.
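
The behaviour can be illustrated with a small sketch. The `job` and `jobStub`
types and the `listJobs` function are hypothetical and are not the Nomad API
structs; the point is only that the filter runs against the full object before
it is reduced to a stub.

```go
package main

import "fmt"

// job and jobStub are hypothetical types for illustration; they are not the
// Nomad API structs.
type job struct {
	ID   string
	Type string
	Meta map[string]string // present on the full object only
}

type jobStub struct {
	ID   string
	Type string
}

// listJobs applies the filter to each full object and only then converts the
// matches to stubs, so the filter may reference fields the stub omits.
func listJobs(jobs []job, keep func(job) bool) []jobStub {
	stubs := []jobStub{}
	for _, j := range jobs {
		if keep(j) {
			stubs = append(stubs, jobStub{ID: j.ID, Type: j.Type})
		}
	}
	return stubs
}

func main() {
	jobs := []job{
		{ID: "web", Type: "service", Meta: map[string]string{"team": "frontend"}},
		{ID: "etl", Type: "batch", Meta: map[string]string{"team": "data"}},
	}
	// The filter uses Meta, which never appears in the returned stubs.
	stubs := listJobs(jobs, func(j job) bool { return j.Meta["team"] == "data" })
	fmt.Println(stubs)
}
```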
2024-02-07 16:41:07 -05:00
Luiz Aoqui
ce710d49fd cli: fix tls ca create command with -domain (#19892)
The current implementation of the `nomad tls ca create` command
overrides the value of the `-domain` flag with `"nomad"` if no
additional customization is provided.

This results in a certificate for the wrong domain or an error if the
`-name-constraint` flag is also used.

The logic for `IsCustom()` also seemed reversed. If all custom fields
are empty then the certificate is _not_ customized, so `IsCustom()`
should return false.
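
A rough sketch of the intended semantics, under stated assumptions: the
`caOptions` type, its field names, and `effectiveDomain` are hypothetical and
are not taken from the Nomad source. It shows `IsCustom()` returning false when
every custom field is empty, and a domain default that applies only when no
domain was supplied.

```go
package main

import "fmt"

// caOptions is a hypothetical stand-in for the CA creation options; the field
// names are illustrative only.
type caOptions struct {
	Country      string
	Organization string
	Locality     string
}

// IsCustom reports whether any customization field was set. With all fields
// empty the certificate is not customized, so the result is false.
func (o caOptions) IsCustom() bool {
	return o.Country != "" || o.Organization != "" || o.Locality != ""
}

// effectiveDomain keeps a caller-supplied domain and falls back to "nomad"
// only when the flag was left empty (assumed behaviour for this sketch).
func effectiveDomain(domainFlag string) string {
	if domainFlag == "" {
		return "nomad"
	}
	return domainFlag
}

func main() {
	fmt.Println(caOptions{}.IsCustom())
	fmt.Println(caOptions{Country: "CA"}.IsCustom())
	fmt.Println(effectiveDomain("example.com"))
}
```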
2024-02-07 16:40:51 -05:00
Phil Renaud
15b06e8505 [ui] HashiCorp Design System upgraded to 3.6.0 (#19872)
* HashiCorp Design System upgraded to 3.6.0

* Fresh yarn

* Responses out of range are brought back within

* General pass at a11y fixes with updated components and node

* Further tooltip updates

* 3 more partitions worth of toggle and tooltip updates

* scale-events-accordion and topo-viz node fixes
2024-02-07 16:08:41 -05:00
Kiara Grouwstra
1e04fc4613 Libraries & SDKs: add nix-nomad (#19808) 2024-02-06 20:47:23 -05:00
Luiz Aoqui
7daa854491 docs: remove duplicate entry for upstreams.config (#19877) 2024-02-06 20:44:02 -05:00
Luiz Aoqui
5825cefe51 docs: remove Docker cpuset_cpus config (#19882)
Nomad 1.7 refactored how CPU cores are assigned to tasks, so the
Docker-specific `cpuset_cpus` configuration is no longer used.
2024-02-06 10:51:16 -05:00
Phil Renaud
c927377700 Random exec assignment depends on taskGroup name if provided (#19878) 2024-02-05 23:23:01 -05:00
Luiz Aoqui
50c50a6328 cli: fix return code when job deployment succeeds (#19876)
When a job eval is blocked due to missing capacity, the `nomad job run`
command will monitor the deployment, which may succeed once additional
capacity is made available.

But the current implementation would return `2` even when the deployment
succeeded because it only took the first eval status into account.

This commit updates the eval monitoring logic to reset the scheduling
error state if the deployment eventually succeeds.
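
The reset can be sketched as follows. This is a simplified model rather than
the CLI's monitoring code: `schedulingExitCode` is a hypothetical function that
covers only the scheduling error state and ignores every other way the command
can fail.

```go
package main

import "fmt"

// schedulingExitCode models only the scheduling error state. evalBlocked is
// true when the first eval could not place all allocations;
// deploymentSucceeded is true when the monitored deployment later completed
// successfully. Hypothetical helper for illustration.
func schedulingExitCode(evalBlocked, deploymentSucceeded bool) int {
	schedulingError := evalBlocked
	if deploymentSucceeded {
		// Capacity became available, so the earlier error no longer applies.
		schedulingError = false
	}
	if schedulingError {
		return 2
	}
	return 0
}

func main() {
	fmt.Println("blocked, never deployed:", schedulingExitCode(true, false))
	fmt.Println("blocked, then deployed: ", schedulingExitCode(true, true))
}
```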
2024-02-05 18:32:25 -05:00
Juana De La Cuesta
120c3ca3c9 Add granular control of SELinux labels for host mounts (#19839)
Add a new configuration option on a task's volume_mounts to give fine-grained control over the SELinux "z" label

* Update website/content/docs/job-specification/volume_mount.mdx

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>

* fix: typo

* func: make volume mount verification happen even on mounts with no volume

---------

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2024-02-05 10:05:33 +01:00
Tim Gross
f1637bdd5f deps: update dependencies indirectly bringing in older runc (#19863)
Although Nomad itself is not vulnerable to CVE-2024-21626, we want to update
dependencies that bring in the vulnerable packages so as not to trip
vulnerability scanners. Update `containerd` and `go-dockerclient` as well as the
various transitive dependencies these bring in.
2024-02-02 16:08:22 -05:00
dependabot[bot]
b94a193c8a chore(deps): bump github.com/opencontainers/runc from 1.1.10 to 1.1.12 (#19851)
* chore(deps): bump github.com/opencontainers/runc from 1.1.10 to 1.1.12

Bumps [github.com/opencontainers/runc](https://github.com/opencontainers/runc) from 1.1.10 to 1.1.12.
- [Release notes](https://github.com/opencontainers/runc/releases)
- [Changelog](https://github.com/opencontainers/runc/blob/v1.1.12/CHANGELOG.md)
- [Commits](https://github.com/opencontainers/runc/compare/v1.1.10...v1.1.12)

---
updated-dependencies:
- dependency-name: github.com/opencontainers/runc
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* add changelog entry

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2024-02-02 10:18:53 -05:00
Tim Gross
334c383eb6 template: run template tests on Windows where possible (#19856)
We don't run the whole suite of unit tests on all platforms to keep CI times
reasonable, so the only things we've been running on Windows are
platform-specific.

I'm working on some platform-specific `template` related work and having these
tests run on Windows will reduce the risk of regressions. Our Windows CI box
doesn't have Consul or Vault, so I've skipped those tests for the time being,
and can follow up with that later. There's also a test with assertions looking
for specific paths, and the results are different on Windows. I've skipped those
for the moment as well and will follow up under a separate PR.

Also swap `testify` for `shoenig/test`
2024-02-02 09:22:03 -05:00
Heat Hamilton
556d44cd7a Merge pull request #19848 from hashicorp/heat/chore/update-website-dependencies
website: update dependencies
2024-01-30 15:07:11 -05:00
Heat Hamilton
0b29a7d727 Update dependencies to match Next v14 in Dev Portal; updated husky workflow to v9; updated nvmrc to v18 2024-01-30 13:43:36 -05:00
Daniel Bennett
e059adef98 e2e: PreCleanup and other jobs3 helpers (#19844) 2024-01-29 17:54:54 -06:00
Seth Hoenig
b50b81e488 users: refactor method for getting UID from username (#19840)
This PR refactors a helper function for getting the UID associated with
a given username to also return the GID and home directory. Also adds
unit tests on the known values of root and nobody user on Ubuntu Linux.
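
For readers unfamiliar with this kind of helper, here is a minimal sketch built
on the standard library's `os/user` package. `lookupIDs` is a hypothetical name
and is not the function refactored in this PR; the numeric conversion assumes a
POSIX system, where UIDs and GIDs are decimal strings.

```go
package main

import (
	"fmt"
	"os/user"
	"strconv"
)

// lookupIDs returns the numeric UID, numeric GID, and home directory for a
// username. Hypothetical helper; assumes POSIX-style numeric IDs.
func lookupIDs(username string) (uid, gid int, home string, err error) {
	u, err := user.Lookup(username)
	if err != nil {
		return 0, 0, "", err
	}
	uid, err = strconv.Atoi(u.Uid)
	if err != nil {
		return 0, 0, "", fmt.Errorf("non-numeric UID %q: %w", u.Uid, err)
	}
	gid, err = strconv.Atoi(u.Gid)
	if err != nil {
		return 0, 0, "", fmt.Errorf("non-numeric GID %q: %w", u.Gid, err)
	}
	return uid, gid, u.HomeDir, nil
}

func main() {
	uid, gid, home, err := lookupIDs("root")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(uid, gid, home)
}
```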
2024-01-29 13:56:30 -06:00
Luiz Aoqui
41277f823f license: fix some imports of BUSL-1.1 in MPL-2.0 (#19832)
Some packages licensed under MPL-2.0 were incorrectly importing code
from packages licensed under BUSL-1.1.

Not all imports are fixed here as they will require additional work to
untangle them. To help track progress this commit adds a Semgrep rule
that detects incorrect BUSL-1.1 imports in MPL-2.0 packages.
2024-01-29 12:04:12 -05:00
James Rasell
10324566ae driver/rawexec: populate OOM killed exit result. (#19829) 2024-01-29 08:54:52 +00:00
James Rasell
8d6067e987 driver/qemu: populate OOM killed exit result. (#19830) 2024-01-29 07:34:27 +00:00
James Rasell
34fe96a420 driver/java: populate OOM killed exit result. (#19818) 2024-01-26 08:09:16 +00:00
James Rasell
9e6f12ef2d stream: remove unused internal error definition from event stream. (#19819) 2024-01-26 07:52:53 +00:00
Michael Schurter
a283a41613 docs: mention wildcards in namespace api docs (#19809)
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2024-01-24 11:52:28 -08:00
Michael Schurter
8f564182ef connect: rewrite envoy bootstrap on every restart (#19787)
Fixes #19781

Do not mark the envoy bootstrap hook as done after successfully running once.
Since the bootstrap file is written to /secrets, which is a tmpfs on supported
platforms, it is not persisted across reboots. This causes the task and
allocation to fail on reboot (see #19781).

This fixes it by *always* rewriting the envoy bootstrap file every time the
Nomad agent starts. This does mean we may write a new bootstrap file to an
already running Envoy task, but in my testing that doesn't have any impact.

This commit doesn't necessarily fix every use of Done by hooks, but hopefully
improves the situation. The comment on Done has been expanded to hopefully
avoid misuse in the future.

Done assertions were removed from tests as they add more noise than value.

*Alternative 1: Use a regular file*

An alternative approach would be to write the bootstrap file somewhere
other than the tmpfs, but this is *unsafe* as when Consul ACLs are
enabled the file will contain a secret token:
https://developer.hashicorp.com/consul/commands/connect/envoy#bootstrap

*Alternative 2: Detect if file is already written*

An alternative approach would be to detect if the bootstrap file exists,
and only write it if it doesn't.

This is just a more complicated form of the current fix. I think in
general in the absence of other factors task hooks should be idempotent
and therefore able to rerun on any agent startup. This simplifies the
code and our ability to reason about task restarts vs agent restarts vs
node reboots by making them all take the same code path.
2024-01-24 11:26:31 -08:00
Piotr Kazmierczak
543ba16e61 e2e: more retries for RequireConsulDeregistered (#19801) 2024-01-22 20:11:48 +01:00
Luiz Aoqui
b7fa4447bd docs: autoscaler config for blocking query timeout (#19777) 2024-01-22 13:08:10 -05:00
Piotr Kazmierczak
8a4bd61caf e2e: WaitForJobStopped correction (#19749) 2024-01-22 11:38:22 +01:00
dependabot[bot]
af2cdc98a5 chore(deps): bump golang.org/x/sync from 0.4.0 to 0.6.0 (#19792) 2024-01-22 07:32:21 +00:00
Adrian Todorov
044eb0e048 docs: warnings about template dependencies, HCL2 clarifications (#19779) 2024-01-19 14:07:15 -05:00
Luiz Aoqui
fce30f342c docs: add lock_namespace autoscaler config (#19769)
Document the `high_availability.lock_namespace` configuration of the
Nomad Autoscaler.
2024-01-18 11:52:14 -05:00
Vijesh
3b4afea974 docs: note script checks don't support some Consul options (#19770)
Script checks don't support Consul's `success_before_passing`, `failures_before_critical`, or `failures_before_warning` because they're run by Nomad and not by Consul
2024-01-18 08:38:57 -05:00
Piotr Kazmierczak
8f99ba6b2c docs: add missing JWT auth method API documentation (#19757) 2024-01-17 16:03:08 +01:00
Tom Davies
5a11a28cac docs: updates link to Consul WLI migration docs (#19748) 2024-01-17 09:57:02 -05:00
Piotr Kazmierczak
11ca21ca3c cli: correct typos in setup consul (#19754) 2024-01-17 14:13:07 +01:00
James Rasell
41555b6370 cli: Fix minor help formatting issue in agent command. (#19743) 2024-01-17 12:18:00 +00:00
Mike Nomitch
bc039a7a8a Adds Namespace UI to Access Control (#19402)
Adds Namespace UI to Access Control - Also adds two step buttons to other Access Control pages

---------

Co-authored-by: Phil Renaud <phil@riotindustries.com>
2024-01-16 09:20:50 -08:00
Luiz Aoqui
c0cfeb3ecd Merge pull request #19746 from hashicorp/post-1.7.3-release
Post 1.7.3 release
2024-01-16 10:59:02 -05:00
Luiz Aoqui
051202087b Merge release 1.7.3 files 2024-01-15 16:00:12 -05:00