Commit Graph

24432 Commits

Author SHA1 Message Date
Tim Gross
aece7b061c E2E: fix events tests (#16595)
In #12916 we updated the events test as part of a larger set of changes around
mapstructure serialization fixes. But the changes to the jobs we're deploying in
the tests had invalid task configs so they never result in good deployments and
the test will always fail. Make the before/after jobs identical (except for the
version bump) and make them valid. Also wait for allocations for the 2nd job run
to appear before checking the deployment list, so that we don't race with the
scheduler.
2023-03-21 14:01:40 -07:00
Michael Schurter
a73a399162 Windows fixes for e2e tests (#16592)
* e2e: skip task api test when windows too old

* e2e: don't run proxy on windows
2023-03-21 13:55:32 -07:00
Suselz
5309325621 Update csi_plugin.mdx (#16584)
Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-03-21 16:16:18 +01:00
Tim Gross
a90df9dcad contrib: architecture guide to the drainer (#16569)
The drainer component is fairly complex. As part of upcoming work to fix some of
the drainer's rough edges, document the drainer's architecture from a Nomad
developer perspective.
2023-03-21 09:17:24 -04:00
Luiz Aoqui
a633b79fb5 changelog: update #16427 to improvement (#16565)
The security fix in Go 1.20.2 does not apply to Nomad.
2023-03-20 21:24:53 -04:00
Michael Schurter
fb085186b7 client/metadata: fix crasher caused by AllowStale = false (#16549)
Fixes #16517

Given a 3 Server cluster with at least 1 Client connected to Follower 1:

If a NodeMeta.{Apply,Read} for the Client request is received by
Follower 1 with `AllowStale = false` the Follower will forward the
request to the Leader.

The Leader, not being connected to the target Client, will forward the
RPC to Follower 1.

Follower 1, seeing AllowStale=false, will forward the request to the
Leader.

The Leader, not being connected to... well hoppefully you get the
picture: an infinite loop occurs.
2023-03-20 16:32:32 -07:00
Tim Gross
695df42d33 contrib: mock driver (#16573) 2023-03-20 16:35:32 -04:00
James Rasell
aacc7c6d21 dev: remove use of cfssl and use Nomad CLI for TLS certs. (#16145) 2023-03-20 17:06:15 +01:00
James Rasell
96740b5392 docs: remove Java and Scala SDKs from supported list. (#16555) 2023-03-20 15:35:02 +01:00
Phil Renaud
00718443f8 [ui] Perform common job tasks with keyboard shortcuts (#16378)
* Throw your mouse into traffic

* Add node metadata with a shortcut

* Re-labelled

* Adds a toast notification to job start/stop on keyboard shortcut

* Typo fix
2023-03-20 09:24:39 -04:00
Juana De La Cuesta
cc110f4cc7 Add -json flag to quota inspect command (#16478)
* Added  and  flag to  command

* cli[style]: small refactor to avoid confussion with tmpl variable

* Update inspect.mdx

* cli: add changelog entry

* Update .changelog/16478.txt

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Update command/quota_inspect.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

---------

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-03-20 10:40:51 +01:00
Juana De La Cuesta
26b4fcc2c9 cli: add -json and -t flags to quota status command (#16485)
* cli: add json and t flags to quota status command

* cli: add entry to changelog

* Update command/quota_status.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

---------

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-03-20 10:39:56 +01:00
Juana De La Cuesta
151147b718 cli: Add json and -t flags to server members command (#16444)
* cli: Add  and  flags to server members

* Update website/content/docs/commands/server/members.mdx

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Update website/content/docs/commands/server/members.mdx

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* cli: update the server memebers tests to use must

* cli: add flags addition to changelog

---------

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-03-20 10:39:24 +01:00
Adam Pugh
cd8615d9ec Spelling update (#16553)
updated propogating to propagating
2023-03-20 09:24:41 +01:00
Seth Hoenig
1cfa95ee54 tls enforcement flaky tests (#16543)
* tests: add WaitForLeaders helpers using must/wait timings

* tests: start servers for mtls tests together

Fixes #16253 (hopefully)
2023-03-17 14:11:13 -05:00
Piotr Kazmierczak
b95b105288 cli: nomad login command should not require a -type flag and should respect default auth method (#16504)
nomad login command does not need to know ACL Auth Method's type, since all
method names are unique. 

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-03-17 19:14:28 +01:00
Seth Hoenig
ed498f8ddb nsd: always set deregister flag after deregistration of group (#16289)
* services: always set deregister flag after deregistration of group

This PR fixes a bug where the group service hook's deregister flag was
not set in some cases, causing the hook to attempt deregistrations twice
during job updates (alloc replacement).

In the tests ... we used to assert on the wrong behvior (remove twice) which
has now been corrected to assert we remove only once.

This bug was "silent" in the Consul provider world because the error logs for
double deregistration only show up in Consul logs; with the Nomad provider the
error logs are in the Nomad agent logs.

* services: cleanup group service hook tests
2023-03-17 09:44:21 -05:00
Piotr Kazmierczak
76649dfe2f acl: fix canonicalization of OIDC auth method mock (#16534) 2023-03-17 15:37:54 +01:00
James Rasell
57a3cbe264 docs: add binding-rule selector escape example on Windows PS (#16273) 2023-03-17 15:13:35 +01:00
Michael Schurter
282e3bcfcc Enable ACLs on E2E test clients (#16530)
* e2e: uniformly enable acls across all agents

* docs: clarify that acls should be set everywhere
2023-03-16 14:22:41 -07:00
Tim Gross
8684183492 client: don't use Status RPC for Consul discovery (#16490)
In #16217 we switched clients using Consul discovery to the `Status.Members`
endpoint for getting the list of servers so that we're using the correct
address. This endpoint has an authorization gate, so this fails if the anonymous
policy doesn't have `node:read`. We also can't check the `AuthToken` for the
request for the client secret, because the client hasn't yet registered so the
server doesn't have anything to compare against.

Instead of hitting the `Status.Peers` or `Status.Members` RPC endpoint, use the
Consul response directly. Update the `registerNode` method to handle the list of
servers we get back in the response; if we get a "no servers" or "no path to
region" response we'll kick off discovery again and retry immediately rather
than waiting 15s.
2023-03-16 15:38:33 -04:00
Seth Hoenig
995ab41cbc artifact: git needs more files for private repositories (#16508)
* landlock: git needs more files for private repositories

This PR fixes artifact downloading so that git may work when cloning from
private repositories. It needs

- file read on /etc/passwd
- dir read on /root/.ssh
- file write on /root/.ssh/known_hosts

Add these rules to the landlock rules for the artifact sandbox.

* cr: use nonexistent instead of devnull

Co-authored-by: Michael Schurter <mschurter@hashicorp.com>

* cr: use go-homdir for looking up home directory

* pr: pull go-homedir into explicit require

* cr: fixup homedir tests in homeless root cases

* cl: fix root test for real

---------

Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2023-03-16 12:22:25 -05:00
Michael Schurter
46ae1025bb docs: dispatch_payload and jobs api docs had some weirdness (#16514)
* docs: dispatch_payload docs had some weirdness

Docs said "Examples" when there was only 1 example. Not sure what the
floating "to" in the description was for.

* docs: missing a heading level on jobs api docs
2023-03-16 09:42:46 -07:00
Seth Hoenig
ea727dff9e artifact: do not set process attributes on darwin (#16511)
This PR fixes the non-root macOS use case where artifact downloads
stopped working. It seems setting a Credential on a SysProcAttr
used by the exec package will always cause fork/exec to fail -
even if the credential contains our own UID/GID or nil UID/GID.

Technically we do not need to set this as the child process will
inherit the parent UID/GID anyway... and not setting it makes
things work again ... /shrug
2023-03-16 11:31:18 -05:00
Seth Hoenig
098650e36c artifact: use specific version link for zipbomb artifact (#16513)
Fix the e2e case where we download the go-getter bomb.zip test file, which
is being removed on main. We can still get it from the version tag - yay git!
2023-03-16 10:18:46 -05:00
James Rasell
323abf731e build: fix test-nomad make target when running locally. (#16506) 2023-03-16 09:32:14 +01:00
Daniel Bennett
e4963b9c53 test: set BuildDate in default TestAgent config (#16499)
so enterprise tests don't fail due to the default zero time
2023-03-15 11:47:15 -05:00
James Rasell
bdf468c6ad cli: fix login help output formatting. (#16502) 2023-03-15 13:23:26 +01:00
Seth Hoenig
1a01e87192 scheduler: annotate tasksUpdated with reason and purge DeepEquals (#16421)
* scheduler: annotate tasksUpdated with reason and purge DeepEquals

* cr: move opaque into helper

* cr: swap affinity/spread hashing for slice equal

* contributing: update checklist-jobspec with notes about struct methods

* cr: add more cases to wait config equal method

* cr: use reflect when comparing envoy config blocks

* cl: add cl
2023-03-14 09:46:00 -05:00
Anthony
d5e013095d Merge pull request #16484 from hashicorp/tunzor-patch-1
Update for enterprise trial wording and link
2023-03-14 10:19:29 -04:00
Anthony
362f752b87 Updated trial license link and wording 2023-03-14 09:31:06 -04:00
Juana De La Cuesta
eaf22f2f68 cli: Add -json and -t flags to namespace status command (#16442)
* cli: Add  and  flag to namespace status command

* Update command/namespace_status.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* cli: update tests for namespace status command to use must

---------

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-03-14 14:23:04 +01:00
Tim Gross
101e5d0225 docs: clarify migration behavior under nomad alloc stop (#16468) 2023-03-14 09:00:29 -04:00
Tim Gross
c70bbd14ba agent: trim space when parsing X-Nomad-Token header (#16469)
Our auth token parsing code trims space around the `Authorization` header but
not around `X-Nomad-Token`. When using the UI, it's easy to accidentally
introduce a leading or trailing space, which results in spurious authentication
errors. Trim the space at the HTTP server.
2023-03-14 08:57:53 -04:00
Seth Hoenig
a42a33fa6b cgv1: do not disable cpuset manager if reserved interface already exists (#16467)
* cgv1: do not disable cpuset manager if reserved interface already exists

This PR fixes a bug where restarting a Nomad Client on a machine using cgroups
v1 (e.g. Ubuntu 20.04) would cause the cpuset cgroups manager to disable itself.

This is being caused by incorrectly interpreting a "file exists" error as
problematic when ensuring the reserved cpuset exists. If we get a "file exists"
error, that just means the Client was likely restarted.

Note that a machine reboot would fix the issue - the groups interfaces are
ephemoral.

* cl: add cl
2023-03-13 17:00:17 -05:00
Luiz Aoqui
f2bfbfaf03 acl: update job eval requirement to submit-job (#16463)
The job evaluate endpoint creates a new evaluation for the job which is
a write operation. This change modifies the necessary capability from
`read-job` to `submit-job` to better reflect this.
2023-03-13 17:13:54 -04:00
Luiz Aoqui
b2c873274b plugin: add missing fields to TaskConfig (#16434) 2023-03-13 15:58:16 -04:00
Dao Thanh Tung
f3a527b26f doc: Update nomad fmt doc to run against non-deprecated HCL2 jobspec only (#16435)
Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>
2023-03-13 15:26:27 -04:00
Michael Schurter
5f37b2fba0 build: update from go1.20.1 to go1.20.2 (#16427)
* build: update from go1.20.1 to go1.20.2

Note that the CVE fixed in go1.20.2 does *not* impact Nomad.

https://github.com/golang/go/issues/58647
2023-03-13 09:47:07 -07:00
dependabot[bot]
5febe9bd9b build(deps): bump go.uber.org/goleak from 1.2.0 to 1.2.1 (#16439)
Bumps [go.uber.org/goleak](https://github.com/uber-go/goleak) from 1.2.0 to 1.2.1.
- [Release notes](https://github.com/uber-go/goleak/releases)
- [Changelog](https://github.com/uber-go/goleak/blob/master/CHANGELOG.md)
- [Commits](https://github.com/uber-go/goleak/compare/v1.2.0...v1.2.1)

---
updated-dependencies:
- dependency-name: go.uber.org/goleak
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-13 11:23:56 -05:00
Tim Gross
b6d6cc41ed scheduler: refactor system util tests (#16416)
The tests for the system allocs reconciling code path (`diffSystemAllocs`)
include many impossible test environments, such as passing allocs for the wrong
node into the function. This makes the test assertions nonsensible for use in
walking yourself through the correct behavior.

I've pulled this changeset out of PR #16097 so that we can merge these
improvements and revisit the right approach to fix the problem in #16097 with
less urgency now that the PFNR bug fix has been merged. This changeset breaks up
a couple of tests, expands test coverage, and makes test assertions more
clear. It also corrects one bit of production code that behaves fine in
production because of canonicalization, but forces us to remember to set values
in tests to compensate.
2023-03-13 11:59:31 -04:00
Seth Hoenig
12688f2ea3 scheduler: add simple benchmark for tasksUpdated (#16422)
In preperation for some refactoring to tasksUpdated, add a benchmark to the
old code so it's easy to compare with the changes, making sure nothing goes
off the rails for performance.
2023-03-13 10:44:14 -05:00
Seth Hoenig
a34925f689 deps: remove replace statement for go-discover (#16304)
Which we no longer need since we no longer have consul as a dependency
2023-03-13 10:40:35 -05:00
Tim Gross
2a0e45bd75 Merge pull request #16445 from hashicorp/post-1.5.1-release
Post 1.5.1 release
2023-03-13 11:29:49 -04:00
Tim Gross
172f49fc92 Merge release 1.5.1 files 2023-03-13 11:15:04 -04:00
hc-github-team-nomad-core
6c91cc85c7 Prepare for next release 2023-03-13 11:13:27 -04:00
hc-github-team-nomad-core
669495bac6 Generate files for 1.5.1 release 2023-03-13 11:13:27 -04:00
Tim Gross
d0ddd5e84b acl: prevent privilege escalation via workload identity
ACL policies can be associated with a job so that the job's Workload Identity
can have expanded access to other policy objects, including other
variables. Policies set on the variables the job automatically has access to
were ignored, but this includes policies with `deny` capabilities.

Additionally, when resolving claims for a workload identity without any attached
policies, the `ResolveClaims` method returned a `nil` ACL object, which is
treated similarly to a management token. While this was safe in Nomad 1.4.x,
when the workload identity token was exposed to the task via the `identity`
block, this allows a user with `submit-job` capabilities to escalate their
privileges.

We originally implemented automatic workload access to Variables as a separate
code path in the Variables RPC endpoint so that we don't have to generate
on-the-fly policies that blow up the ACL policy cache. This is fairly brittle
but also the behavior around wildcard paths in policies different from the rest
of our ACL polices, which is hard to reason about.

Add an `ACLClaim` parameter to the `AllowVariableOperation` method so that we
can push all this logic into the `acl` package and the behavior can be
consistent. This will allow a `deny` policy to override automatic access (and
probably speed up checks of non-automatic variable access).
2023-03-13 11:13:27 -04:00
Michael Schurter
9fefc18b77 e2e fixes: cli output, timing issue, and some cleanups (#16418)
* e2e: job expects alloc to run until stopped

* e2e: fix case changed by #16306

* e2e: couldn't find a bug but improved test+jobspecs
2023-03-10 13:14:51 -08:00
Luiz Aoqui
419c4bfa4e allocrunner: fix health check monitoring for Consul services (#16402)
Services must be interpolated to replace runtime variables before they
can be compared against the values returned by Consul.
2023-03-10 14:43:31 -05:00