24894 Commits

Author SHA1 Message Date
Luiz Aoqui
969ea54628 ui: fix Topology node state filter (#17940)
"Ineligible" and "Draining" are not determined by the node status, but
are rather inferred from other fields.
2023-07-19 16:38:20 -04:00
Nando
ca26673781 volume-status : show namespace the volume belongs to (#17911)
* volume-status : show namespace the volume belongs to
2023-07-19 16:36:51 -04:00
Luiz Aoqui
bd3ef90f8f changelog: add entry for #17731 into 1.4.11 (#17994) 2023-07-19 16:24:39 -04:00
Patric Stout
e190eae395 Use config "cpu_total_compute" (if set) for all CPU statistics (#17628)
Before this commit, it was only used for fingerprinting, but not
for CPU stats on nodes or tasks. This meant that if the
auto-detection failed, setting cpu_total_compute didn't resolve
the issue.

This issue was most noticeable on ARM64, where auto-detection
always failed.
2023-07-19 13:30:47 -05:00
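
The behaviour described in e190eae395 amounts to preferring the operator-supplied value over auto-detection wherever CPU capacity is needed. A minimal Go sketch of that precedence follows. The helper name `effectiveCompute` and the convention that 0 means "unset" or "detection failed" are assumptions for illustration, not code from the Nomad repository.

```go
package main

import "fmt"

// effectiveCompute returns the CPU capacity (in MHz) that stats code should use.
// configured stands in for the cpu_total_compute setting (0 = not set, an
// assumption of this sketch); detected is the auto-detected value (0 = failed).
func effectiveCompute(configured, detected uint64) uint64 {
	if configured > 0 {
		// An explicit setting wins for fingerprinting and statistics alike.
		return configured
	}
	return detected
}

func main() {
	// Detection failed, but the operator configured a value.
	fmt.Println(effectiveCompute(8000, 0))
	// Nothing configured, so the detected value is used.
	fmt.Println(effectiveCompute(0, 4800))
}
```

Routing every consumer through a single helper like this is one way to keep fingerprinting and statistics from disagreeing about total compute.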
louievandyke
0d343f269a docs: updating to specify mTLS rpc endpoints (#17963) 2023-07-19 14:16:35 -04:00
Luiz Aoqui
a04245d9a6 Merge pull request #17986 from hashicorp/post-1.6.0-release
Post 1.6.0 release
2023-07-19 10:55:04 -04:00
Luiz Aoqui
47fb70bbc2 Merge release 1.6.0 files 2023-07-19 10:45:20 -04:00
hc-github-team-nomad-core
bc8b4bd749 Prepare for next release 2023-07-19 10:38:08 -04:00
hc-github-team-nomad-core
573cab2b1d Generate files for 1.6.0 release 2023-07-19 10:38:08 -04:00
Tim Gross
a8789d3872 search: fix ACL filtering for plugins and variables
ACL permissions for the search endpoints are done in three passes. The
first (the `sufficientSearchPerms` method) is for performance and coarsely
rejects requests based on the passed-in context parameter if the user has no
permissions to any object in that context. The second (the
`filteredSearchContexts` method) filters out contexts based on whether the user
has permissions either to the requested namespace or again by context (to catch
the "all" context). Finally, when iterating over the objects available, we do
the usual filtering in the iterator.

Internal testing found several bugs in this filtering:
* CSI plugins can be searched by any authenticated user.
* Variables can be searched if the user has `job:read` permissions to the
  variable's namespace instead of `variable:list`.
* Variables cannot be searched by wildcard namespace.

This is an information leak of the plugin names and variable paths, which we
don't consider to be privileged information but intended to protect anyway.

This changeset fixes these bugs by ensuring CSI plugins are filtered in the
first- and second-pass ACL filters, and changes variables to check
`variable:list` in the second-pass filter unless the wildcard namespace is
passed (at which point we'll fall back to filtering in the iterator).

Fixes: CVE-2023-3300
Fixes: #17906
2023-07-19 10:38:08 -04:00
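
The variable handling described in a8789d3872 can be sketched in Go as a second-pass context check followed by per-object filtering. Everything here is a simplified model: the `caps` type, the function names, and the use of `"*"` as the wildcard namespace are assumptions for illustration and do not reproduce Nomad's ACL types or the `filteredSearchContexts` method.

```go
package main

import "fmt"

// wildcardNamespace is assumed to be "*" in this sketch.
const wildcardNamespace = "*"

// caps is a hypothetical capability set: namespace -> capability -> allowed.
type caps map[string]map[string]bool

// allows reports whether the capability is granted in the namespace.
// Indexing a missing namespace yields a nil map, which reads as false.
func (c caps) allows(ns, capability string) bool { return c[ns][capability] }

type variable struct{ Namespace, Path string }

// variablesContextAllowed models the second pass for the variables context:
// require variable:list on the requested namespace, unless the wildcard
// namespace was requested, in which case the decision is deferred.
func variablesContextAllowed(c caps, ns string) bool {
	if ns == wildcardNamespace {
		return true
	}
	return c.allows(ns, "variable:list")
}

// filterVariables models the final pass: drop every object the caller
// cannot list in that object's own namespace.
func filterVariables(c caps, vars []variable) []variable {
	var out []variable
	for _, v := range vars {
		if c.allows(v.Namespace, "variable:list") {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	c := caps{
		"dev":  {"variable:list": true},
		"prod": {"job:read": true}, // job:read alone must not expose variables
	}
	vars := []variable{{"dev", "app/config"}, {"prod", "db/password"}}

	fmt.Println(variablesContextAllowed(c, "prod"))
	fmt.Println(variablesContextAllowed(c, wildcardNamespace))
	fmt.Println(filterVariables(c, vars))
}
```

In this model a wildcard search passes the context check but still returns only the `dev` variable, because the per-object pass applies the same `variable:list` requirement.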
Luiz Aoqui
54c45ed106 acl: fix parsing of policies with blocks w/o label
An ACL policy with a block without label generates unexpected results.
For example, a policy such as this:

```
namespace {
  policy = "read"
}
```

is applied to a namespace called `policy` instead of the documented
behaviour of applying it to the `default` namespace.

This happens because of the way HCL1 decodes blocks. Since it doesn't
know if a block is expected to have a label it applies the `key` tag to
the content of the block and, in the example above, the first key is
`policy`, so it sets that as the `namespace` block label.

Since this happens internally in the HCL decoder it's not possible to
detect the problem externally.

Fixing the problem inside the decoder is challenging because the JSON
and HCL parsers generate different ASTs, which makes it impossible to
differentiate a JSON tree from an invalid HCL tree within the
decoder.

The fix in this commit consists of manually parsing the policy after
decoding to clear labels that were not set in the file. This allows the
validation rules to consistently catch and return any errors, whether
the policy is invalid HCL or invalid JSON.
2023-07-19 10:38:08 -04:00
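
The post-decoding step described in 54c45ed106 can be pictured with the following Go sketch. It does not use the HCL library. The `namespacePolicy` type, the `labelSet` slice (a stand-in for what a second look at the parsed file would report), and the `validate` rule that rejects an empty label are all assumptions made for illustration.

```go
package main

import "fmt"

// namespacePolicy is a hypothetical decoded form of a `namespace` block.
type namespacePolicy struct {
	Name   string // block label as filled in by the decoder
	Policy string
}

// clearUnsetLabels resets Name for every block whose label was not actually
// written in the file. labelSet[i] reports whether the i-th block carried an
// explicit label in the source.
func clearUnsetLabels(decoded []*namespacePolicy, labelSet []bool) {
	for i, ns := range decoded {
		if i < len(labelSet) && !labelSet[i] {
			ns.Name = ""
		}
	}
}

// validate is an assumed rule: a namespace block without a label is an error.
func validate(decoded []*namespacePolicy) error {
	for _, ns := range decoded {
		if ns.Name == "" {
			return fmt.Errorf("namespace block is missing a label")
		}
	}
	return nil
}

func main() {
	// What the decoder produced for the unlabelled block in the example:
	// the first key, "policy", was mistaken for the label.
	decoded := []*namespacePolicy{{Name: "policy", Policy: "read"}}

	clearUnsetLabels(decoded, []bool{false})
	fmt.Println(validate(decoded))
}
```

Once the spurious label is cleared, an ordinary validation pass can see the problem, which the decoder alone could not surface.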
Charlie Voiselle
d23aaed14b redact token before passing to sentinel 2023-07-19 10:38:08 -04:00
James Rasell
0015d25344 qemu: add test to cover task store functions. (#17967) 2023-07-19 15:35:16 +01:00
James Rasell
81aa274551 qemu: fix log lines to use correct QEMU capitalization. (#17961) 2023-07-19 15:21:15 +01:00
James Rasell
81c14dee3c test: enable exec test previously disabled due to CircleCI (#17975) 2023-07-19 15:15:11 +01:00
James Rasell
3abb1124c3 copywrite: add placeholder for OSS/ENT ignore split. (#17965) 2023-07-19 11:29:06 +01:00
Charlie Voiselle
3a687930bd Fix typos (#17962) 2023-07-18 13:31:36 -04:00
Seth Hoenig
1e7726ce93 docs: note windows requirement for workload identity (#17950)
Support for Unix domain sockets was added in Windows 10.
2023-07-14 12:51:25 -05:00
James Rasell
7f5d39fc02 docs: fix QEMU driver HCL plugin options example formatting. (#17930) 2023-07-14 10:22:45 +01:00
János Szathmáry
e53955bccc fix allowed metric values documentation for the Nomad APM plugin (#17928) 2023-07-13 18:25:40 -04:00
Phil Renaud
437941816c Tells the token to be 2 seconds faster in a 5.5 second test (#17924) 2023-07-12 16:59:22 -04:00
Seth Hoenig
159bf51120 e2e: add some e2e tests for pledge task driver (#17909)
* e2e: setup nomad for pledge driver

* e2e: add some e2e tests for pledge task driver
2023-07-12 11:56:08 -05:00
James Rasell
74335b3bfe ci: add copywrite action to check file headers. (#17889) 2023-07-12 16:02:43 +01:00
Daniel Kimsey
995b936aca Smoke test binaries for EL7 compatibility (#17706)
This adds a quick smoke test of our binaries to verify we haven't exceeded the
maximum GLIBC version (2.17) during linking, which would break our ability to
execute on EL7 machines.
2023-07-12 10:51:26 -04:00
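
One way to express the kind of check described in 995b936aca is to scan a binary's symbol version strings for the highest `GLIBC_x.y` reference and compare it with the limit. The Go sketch below works on plain text input. How the text is obtained, and the function names `maxGlibc` and `exceeds`, are assumptions, and this is not necessarily how the commit implements its smoke test.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// glibcRe matches version references such as GLIBC_2.17.
var glibcRe = regexp.MustCompile(`GLIBC_(\d+)\.(\d+)`)

// maxGlibc returns the highest major.minor GLIBC version mentioned in symbols.
// It returns 0, 0 when no version reference is present.
func maxGlibc(symbols string) (major, minor int) {
	for _, m := range glibcRe.FindAllStringSubmatch(symbols, -1) {
		// The regexp guarantees digit-only groups, so Atoi errors are ignored.
		maj, _ := strconv.Atoi(m[1])
		mnr, _ := strconv.Atoi(m[2])
		if maj > major || (maj == major && mnr > minor) {
			major, minor = maj, mnr
		}
	}
	return major, minor
}

// exceeds reports whether symbols reference a GLIBC newer than the limit.
func exceeds(symbols string, limitMajor, limitMinor int) bool {
	maj, mnr := maxGlibc(symbols)
	return maj > limitMajor || (maj == limitMajor && mnr > limitMinor)
}

func main() {
	// Hypothetical symbol listings, not output captured from a real binary.
	ok := "memcpy@GLIBC_2.14 clock_gettime@GLIBC_2.17"
	tooNew := "memcpy@GLIBC_2.14 getrandom@GLIBC_2.25"

	fmt.Println(exceeds(ok, 2, 17))
	fmt.Println(exceeds(tooNew, 2, 17))
}
```

Comparing major and minor as integers avoids the string-comparison trap where "2.9" would sort above "2.17".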
Tim Gross
3c481d3f25 Merge pull request #17914 from hashicorp/post-1.6.0-rc.1-release
Post 1.6.0 rc.1 release
2023-07-12 10:03:59 -04:00
hc-github-team-nomad-core
09c89e79d2 Prepare for next release 2023-07-12 09:54:14 -04:00
hc-github-team-nomad-core
335bb8b9e1 Generate files for 1.6.0-rc.1 release 2023-07-12 09:54:14 -04:00
Tim Gross
3656de6fe2 Prepare release 1.6.0-rc.1 2023-07-12 09:54:14 -04:00
James Rasell
3cfa267439 changelog: fix link to unsupported changelog file. (#17913) 2023-07-12 14:45:53 +01:00
Seth Hoenig
8253ec86a2 docs: add plugin docs for pledge task driver (#17823)
* docs: add plugin docs for pledge task driver

Add pledge driver to the set of Community drivers.

* docs: cr feedback
2023-07-11 16:41:14 -05:00
Seth Hoenig
fd50f2bcb8 e2e: do not set a user for raw_exec tasks (#17901)
A user cannot be set for raw_exec tasks, because doing so does not work
with the 0700 root-owned client data directory that we set up in the e2e
cluster in accordance with the Nomad hardening guide.
2023-07-11 16:00:15 -05:00
Seth Hoenig
a4d0dcdc39 docs: update podman driver docs with v0.5.0 changes (#17824) 2023-07-11 13:51:25 -05:00
Seth Hoenig
80b9ff6436 docs: clarify using user on raw_exec driver (#17897) 2023-07-11 13:06:46 -05:00
Adrian Todorov
ef89b692d8 docs: clarify update stagger description and alternatives (#17896) 2023-07-11 13:53:21 -04:00
Seth Hoenig
01cb47be1b website: ignore user .env.* files (#17898) 2023-07-11 10:06:48 -05:00
Luiz Aoqui
99fb36e119 np: update docs and add test for nil lists (#17899)
Document and test that if a namespace does not provide an `allow` or
`deny` list, then those are treated as `nil` and have a different
behaviour from an empty list (`[]string{}`).
2023-07-11 10:59:45 -04:00
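
The distinction drawn in 99fb36e119 relies on Go treating a `nil` slice and an empty slice as different values, even though both have length zero. The sketch below shows how a check can tell them apart. The `allowed` helper and the semantics chosen for it (a `nil` list means no restriction, an empty list permits nothing) are assumptions for illustration, not Nomad's documented behaviour.

```go
package main

import "fmt"

// allowed reports whether name passes a hypothetical allow list.
// In this sketch a nil list means "no list configured" and an empty,
// non-nil list means "a list was configured and it permits nothing".
func allowed(allow []string, name string) bool {
	if allow == nil {
		return true
	}
	for _, a := range allow {
		if a == name {
			return true
		}
	}
	return false
}

func main() {
	var unset []string  // nil: the field was never provided
	empty := []string{} // non-nil, length 0

	fmt.Println(len(unset) == len(empty)) // both have length 0
	fmt.Println(unset == nil, empty == nil)
	fmt.Println(allowed(unset, "docker"), allowed(empty, "docker"))
}
```

Because `len` cannot distinguish the two cases, code that needs the difference has to compare against `nil` explicitly.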
Kévin Dunglas
79773a031f docs: fix typo in regex_replace.mdx (#17891) 2023-07-11 14:03:40 +01:00
Lance Haig
1541358ef3 Add the ability to customise the details of the CA (#17309)
Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-07-11 08:53:09 +01:00
hashicorp-copywrite[bot]
e178906ed4 [COMPLIANCE] Add Copyright and License Headers (#17877)
Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com>
2023-07-11 07:48:11 +01:00
Michael Schurter
6e9514920d remove empty file (#17853) 2023-07-10 16:34:10 -07:00
Michael Schurter
5169950562 docs: v1.6.0 requires ipc_lock cap for mlock (#17881)
Fixes #17780
2023-07-10 11:53:07 -07:00
Tim Gross
0ba7d0036b CSI: persist previous mounts on client to restore during restart (#17840)
When claiming a CSI volume, we need to ensure the CSI node plugin is running
before we send any CSI RPCs. This extends even to the controller publish RPC
because it requires the storage provider's "external node ID" for the
client. This primarily impacts client restarts but also is a problem if the node
plugin exits (and fingerprints) while the allocation that needs a CSI volume
claim is being placed.

Unfortunately there's no mapping of volume to plugin ID available in the
jobspec, so we don't have enough information to wait on plugins until we either
get the volume from the server or retrieve the plugin ID from data we've
persisted on the client.

If we always require getting the volume from the server before making the claim,
a client restart for disconnected clients will cause all the allocations that
need CSI volumes to fail. Even while connected, checking in with the server to
verify the volume's plugin before trying to make a claim RPC is inherently racy,
so we'll leave that case as-is and it will fail the claim if the node plugin
needed to support a newly-placed allocation is flapping such that the node
fingerprint is changing.

This changeset persists a minimum subset of data about the volume and its plugin
in the client state DB, and retrieves that data during the CSI hook's prerun to
avoid re-claiming and remounting the volume unnecessarily.

This changeset also updates the RPC handler to use the external node ID from the
claim whenever it is available.

Fixes: #13028
2023-07-10 13:20:15 -04:00
Devashish Taneja
b31e891e5f Include parent job ID as a Docker container label (#17843)
Fixes: #17751
2023-07-10 11:27:45 -04:00
Daniel Bennett
34105f1d43 ci: more self-hosted iops for checks workflow (#17852) 2023-07-10 10:21:04 -05:00
James Rasell
079f5d4d8d docs: detail Consul ACL token env var config option. (#17859) 2023-07-10 14:26:18 +01:00
dependabot[bot]
e8683e3f49 build(deps): bump github.com/hashicorp/cronexpr in /api (#17787) 2023-07-10 11:23:00 +01:00
James Rasell
f43a3c9f37 e2e: respect timeout value when waiting for allocs in v3. (#17800) 2023-07-10 09:47:10 +01:00
Tim Gross
18327cd367 consul: handle "not found" errors from Consul when deleting tokens (#17847)
In Consul 1.15.0, the Delete Token API was changed so as to return an error when
deleting a non-existent ACL token. This means that if Nomad successfully deletes
the token but fails to persist that fact, it will get stuck trying to delete a
non-existent token forever.

Update the token deletion function to ignore "not found" errors and treat them
as successful deletions.

Fixes: #17833
2023-07-07 16:22:13 -04:00
Daniel Bennett
243429be11 ci: pull secrets from Vault in nomad-enterprise (#17841) 2023-07-07 14:27:12 -05:00
Seth Hoenig
100c460467 env/aws: updates from ec2info (#17835) 2023-07-07 10:12:05 -05:00