27315 Commits

Author SHA1 Message Date
Aimee Ukasick
a30cb2f137 Update UI, code comment, and README links to docs, tutorials (#26429)
* Update UI, code comment, and README links to docs, tutorials

* fix typo in ephemeral disks learn more link url

* feedback on typo

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2025-08-06 09:40:23 -05:00
James Rasell
1c63ad50d9 Merge pull request #26430 from hashicorp/f-NMD-763-introduction
introduction: The initial implementation code for node introduction.
2025-08-06 14:41:16 +02:00
James Rasell
622def8bcf test: Ensure client rpclogger is set on RPC only client. (#26443)
If a test encounters an RPC error using the test client, it will
panic as the rpc logger is not set when it attempts to log the
error.
2025-08-06 10:20:28 +01:00
Michael Schurter
0f630004b9 docs: Once -> once (#26435) 2025-08-05 11:10:25 -07:00
Tim Gross
0ae5b3f39b eval status: sort plan annotations by task group (#26428)
The plan annotations table isn't sorted by task group, which makes for a less
beautiful UX and a flaky test.
2025-08-05 09:36:12 -04:00
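A minimal sketch of the kind of fix the commit above describes, assuming the annotations are held in a Go map keyed by task group name (an assumption; the commit does not show the data structure). Go randomizes map iteration order, so collecting and sorting the keys first gives a stable table. The function name `sortedTaskGroups` and the `int` value type are hypothetical.

```go
package main

import "sort"

// sortedTaskGroups returns the keys of a hypothetical annotations map in
// lexical order, so anything rendered from it is deterministic.
func sortedTaskGroups(annotations map[string]int) []string {
	names := make([]string, 0, len(annotations))
	for name := range annotations {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}
```

Rendering rows by ranging over the returned slice, rather than over the map itself, keeps both the output and any test that asserts on it stable.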
James Rasell
ad508616dc Merge branch 'main' into f-NMD-763-introduction 2025-08-05 08:56:51 +01:00
James Rasell
350662c88e Merge pull request #26291 from hashicorp/f-NMD-763-identity
identity: The initial implementation code for node identity.
2025-08-05 09:52:28 +02:00
James Rasell
80a26306bf intro: Add node introduction flow for Nomad client registration. (#26405)
This change implements the client -> server workflow for Nomad
node introduction. A Nomad node can optionally be started with an
introduction token, which is a signed JWT containing claims for
the node registration. The server handles this according to the
enforcement configuration.

The introduction token can be provided by env var, cli flag, or
by placing it within a default filesystem location. The latter
option does not override the CLI or env var.

The region claim has been removed from the initial claims set of
the intro identity. This boundary is guarded by mTLS and aligns
with the node identity.
2025-08-05 08:23:44 +01:00
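The token lookup described in the commit above could be sketched roughly as follows. The commit only states that the default filesystem location does not override the CLI flag or env var; the ordering between flag and env var used here (flag first) is an assumption, and `resolveIntroToken` and its parameters are hypothetical names, not Nomad's API.

```go
package main

// resolveIntroToken picks an introduction token from three hypothetical
// sources. An empty string means the source was not set.
//
// Assumption: the CLI flag wins over the env var. The commit message only
// guarantees that the file is the lowest-priority source.
func resolveIntroToken(flagVal, envVal, fileVal string) string {
	if flagVal != "" {
		return flagVal
	}
	if envVal != "" {
		return envVal
	}
	// The file is consulted only when neither explicit source is set.
	return fileVal
}
```

With this shape, a token left in the default location acts as a fallback and never silently replaces one the operator passed explicitly.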
Tim Gross
8f74807891 tests: fix conflict from parallelism in state store variables test (#26426)
The state store test for Variables check-and-set behavior for deletes uses the
same state store for a set of parallel tests. But one of the tests overlaps
another by using the same path, and this can cause spurious test failures by
hitting the CAS conflict error. This overlap doesn't appear to be intentional,
so change the test to use a different path.

Also cleaned up some unused test helpers in the same file.
2025-08-04 17:03:21 -04:00
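To illustrate why two tests sharing one path can collide, here is a toy check-and-set store. It is a sketch only: `casStore`, `set`, and `errCASConflict` are hypothetical and far simpler than Nomad's state store.

```go
package main

import (
	"errors"
	"sync"
)

// errCASConflict is a hypothetical error for a failed check-and-set.
var errCASConflict = errors.New("cas conflict")

// casStore tracks a modify index per path.
type casStore struct {
	mu    sync.Mutex
	index map[string]uint64
}

func newCASStore() *casStore {
	return &casStore{index: make(map[string]uint64)}
}

// set succeeds only if the caller's expected index matches the current
// index for the path, and then bumps the index.
func (s *casStore) set(path string, expected uint64) (uint64, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.index[path] != expected {
		return s.index[path], errCASConflict
	}
	s.index[path]++
	return s.index[path], nil
}
```

If two parallel subtests both call `set` on the same path expecting index 0, whichever runs second gets `errCASConflict`. Giving each subtest its own path removes the interference.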
tehut
21841d3067 Add historical journald and log export flags to operator debug command (#26410)
* Add -log-file-export and -log-lookback flags to add historical logs to
debug capture
* use monitor.PrepFile() helper for other historical log tests
2025-08-04 13:55:25 -07:00
Daniel Bennett
7c633f8109 exec: don't panic on rootless raw_exec tasks (#26401)
the executor dies, leaving an orphaned process still running.

the panic fix:
 * don't `panic()`
 * and return an empty, but non-nil, func on cgroup error

feature fix:
 * allow non-root agent to proceed with exec when cgroups are off
2025-08-04 13:58:35 -04:00
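The "empty, but non-nil, func" pattern from the commit above might look like the following sketch. `setupCgroup`, its `available` parameter, and `errCgroupsUnavailable` are hypothetical stand-ins, not the executor's real code.

```go
package main

import "errors"

// errCgroupsUnavailable is a hypothetical error for this sketch.
var errCgroupsUnavailable = errors.New("cgroups unavailable")

// setupCgroup is a hypothetical stand-in for cgroup setup. It always
// returns a cleanup func that is safe to call, even on error, so callers
// that defer the cleanup never hit a nil function call.
func setupCgroup(available bool) (func(), error) {
	if !available {
		// Return a no-op instead of nil, and report the error
		// rather than panicking.
		return func() {}, errCgroupsUnavailable
	}
	return func() {
		// Real cleanup work would go here.
	}, nil
}
```

A caller can then decide whether the error is fatal (for example, continue when cgroups are off) and still run `defer cleanup()` unconditionally.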
Tim Gross
9859f4a140 document version check requirement on Raft message types (#26411)
Whenever we add a new Raft message type, we almost always need to add a new
version check to ensure that leaders aren't trying to write unknown Raft entries
to older followers. Leave a note about this where the edits happen to reduce the
risk of this unfortunately common bug.

Ref: https://github.com/hashicorp/nomad-enterprise/pull/2973
2025-08-04 12:07:27 -04:00
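A version check of the kind described above can be sketched as a guard that runs before the leader writes a new entry type. This is an illustration under assumptions: versions are plain integers here for brevity, and `allServersAtLeast` is a hypothetical helper, not a function from the Nomad codebase.

```go
package main

// allServersAtLeast reports whether every server in the cluster runs at
// least minVersion. A leader would check this before writing a Raft
// message type that older followers cannot decode.
//
// Assumption: versions are modelled as ints; real code would compare
// semantic versions.
func allServersAtLeast(serverVersions []int, minVersion int) bool {
	for _, v := range serverVersions {
		if v < minVersion {
			return false
		}
	}
	return true
}
```

When the check fails, the leader would skip or defer the new write path until the remaining servers are upgraded.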
dependabot[bot]
8eaf7b80ee chore(deps): bump github.com/golang-jwt/jwt/v5 from 5.2.3 to 5.3.0 (#26416)
Bumps [github.com/golang-jwt/jwt/v5](https://github.com/golang-jwt/jwt) from 5.2.3 to 5.3.0.
- [Release notes](https://github.com/golang-jwt/jwt/releases)
- [Changelog](https://github.com/golang-jwt/jwt/blob/main/VERSION_HISTORY.md)
- [Commits](https://github.com/golang-jwt/jwt/compare/v5.2.3...v5.3.0)

---
updated-dependencies:
- dependency-name: github.com/golang-jwt/jwt/v5
  dependency-version: 5.3.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-04 10:30:27 +02:00
dependabot[bot]
7ed9d168ae chore(deps): bump github.com/hashicorp/go-set/v3 from 3.0.0 to 3.0.1 (#26414)
Bumps [github.com/hashicorp/go-set/v3](https://github.com/hashicorp/go-set) from 3.0.0 to 3.0.1.
- [Release notes](https://github.com/hashicorp/go-set/releases)
- [Commits](https://github.com/hashicorp/go-set/compare/v3.0.0...v3.0.1)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/go-set/v3
  dependency-version: 3.0.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-04 10:21:57 +02:00
dependabot[bot]
57e7f8f28d chore(deps): bump github.com/prometheus/client_golang (#26413)
Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.22.0 to 1.23.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.22.0...v1.23.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-version: 1.23.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-04 09:51:06 +02:00
dependabot[bot]
7790dd1c65 chore(deps): bump github.com/aws/aws-sdk-go-v2/config (#26412)
Bumps [github.com/aws/aws-sdk-go-v2/config](https://github.com/aws/aws-sdk-go-v2) from 1.29.18 to 1.30.2.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Changelog](https://github.com/aws/aws-sdk-go-v2/blob/main/changelog-template.json)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/config/v1.29.18...v1.30.2)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/config
  dependency-version: 1.30.2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-04 09:50:12 +02:00
tehut
d709accaf5 Add nomad monitor export command (#26178)
* Add MonitorExport command and handlers
* Implement autocomplete
* Require nomad in serviceName
* Fix race in StreamReader.Read
* Add and use framer.Flush() to coordinate function exit
* Add LogFile to client/Server config and read NomadLogPath in rpcHandler instead of HTTPServer
* Parameterize StreamFixed stream size
2025-08-01 10:26:59 -07:00
Gautam Kumar
6f81222ec8 CLI: improve acl policy self output for management tokens (#26396)
Improved the acl policy self CLI command to handle both management and client tokens.
Management tokens now display a clear message indicating global access with no individual policies.

Fixes: https://github.com/hashicorp/nomad/issues/26389
2025-08-01 09:02:47 -04:00
Aimee Ukasick
5dc7e7fe25 Docs: Chore: Ent labels (#26323)
* replace outdated tutorial links

* update more tutorial links

* Add CE/ENT or ENT to left nav

* remove ce/ent labels

* revert enterprise features
2025-07-30 09:02:28 -05:00
dependabot[bot]
1209c34be1 chore(deps): bump github.com/docker/docker (#26390)
Bumps [github.com/docker/docker](https://github.com/docker/docker) from 28.3.2+incompatible to 28.3.3+incompatible.
- [Release notes](https://github.com/docker/docker/releases)
- [Commits](https://github.com/docker/docker/compare/v28.3.2...v28.3.3)

---
updated-dependencies:
- dependency-name: github.com/docker/docker
  dependency-version: 28.3.3+incompatible
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-29 16:21:18 -04:00
Tim Gross
4ce937884d scheduler: move result mutation into computeStop (#26351)
The `computeStop` method returns two values that only get used to mutate the
result and the untainted set. Move the mutation into the method to match the
work done in #26325.

Ref: https://github.com/hashicorp/nomad/pull/26325
Ref: https://hashicorp.atlassian.net/browse/NMD-819
2025-07-29 08:23:06 -04:00
Tim Gross
e062f87b07 docs: fix typo in redirect URL domain (#26384) 2025-07-28 16:28:27 -04:00
Tim Gross
501608ca68 docs: document handling of unset affinity/constraint values (#26354)
Affinities and constraints use similar feasibility checking logic to determine if
a given node matches (although affinities don't support all the same
operators). Most operators don't allow `value` to be unset. Update the docs to
reflect this.

Fixes: https://github.com/hashicorp/nomad/issues/24983
2025-07-28 14:12:43 -04:00
Tim Gross
b286a8ee9c docs: update Consul/Vault compatibility matrix (#26368)
Update our support matrix to show currently-supported versions of Consul, Vault,
and Nomad.
2025-07-28 13:48:38 -04:00
Tim Gross
192dec4297 docs: fix self-referencing link for raw_exec driver config (#26353)
During the big docs rearchitecture, we split up the task driver pages into
separate job declaration and driver configuration pages. The link for the
`raw_exec` driver to the configuration page is a self-reference.
2025-07-28 13:48:23 -04:00
Tim Gross
513ec02486 docs: explain access modes for CSI and DHV volumes (#26352)
The documentation for CSI and DHV has a list of the available access modes, but
doesn't explain what they mean in terms of what jobs can request, the scheduler
behavior, or the CSI plugin behavior. Expand on the information available in the
CSI specification and provide a description of DHV's behavior as well.

Ref: https://github.com/container-storage-interface/spec/blob/master/spec.md#createvolume
2025-07-28 13:48:01 -04:00
Tim Gross
6e5ecb6bb0 E2E: update Consul/Vault compat versions tested (#26369)
Update our E2E compatibility test for Consul and Vault to only include back to
the oldest-supported LTS versions of Consul and Vault. This will still leave
a few unsupported non-LTS versions in the matrix between the two oldest LTS, but
this is a small number of tests and fixing it would mean hard-coding the LTS
support matrix in our tests.
2025-07-28 12:03:30 -04:00
dependabot[bot]
d418260b6d chore(deps): bump google.golang.org/grpc from 1.73.0 to 1.74.2 (#26357)
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.73.0 to 1.74.2.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.73.0...v1.74.2)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-version: 1.74.2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-28 11:27:49 -04:00
dependabot[bot]
a90f82bd0f chore(deps): bump github.com/aws/smithy-go from 1.22.4 to 1.22.5 (#26355)
Bumps [github.com/aws/smithy-go](https://github.com/aws/smithy-go) from 1.22.4 to 1.22.5.
- [Release notes](https://github.com/aws/smithy-go/releases)
- [Changelog](https://github.com/aws/smithy-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws/smithy-go/compare/v1.22.4...v1.22.5)

---
updated-dependencies:
- dependency-name: github.com/aws/smithy-go
  dependency-version: 1.22.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-28 11:00:15 -04:00
James Rasell
fe42c5bab0 ci: Revert hclogvet running across entire codebase. (#26365)
It seems the tool requires a little attention and does not run
well across our enterprise codebase. Rolling back that makefile
change, so it does not stop enterprise work, backport, CI, etc.
2025-07-28 15:53:40 +01:00
dependabot[bot]
e561bdb476 chore(deps): bump github.com/hashicorp/consul-template (#26356)
Bumps [github.com/hashicorp/consul-template](https://github.com/hashicorp/consul-template) from 0.41.0 to 0.41.1.
- [Release notes](https://github.com/hashicorp/consul-template/releases)
- [Changelog](https://github.com/hashicorp/consul-template/blob/v0.41.1/CHANGELOG.md)
- [Commits](https://github.com/hashicorp/consul-template/compare/v0.41.0...v0.41.1)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/consul-template
  dependency-version: 0.41.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-28 10:02:59 -04:00
dependabot[bot]
5bc5f4f9f1 chore(deps): bump github.com/aws/aws-sdk-go-v2/config (#26358)
Bumps [github.com/aws/aws-sdk-go-v2/config](https://github.com/aws/aws-sdk-go-v2) from 1.29.17 to 1.29.18.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Changelog](https://github.com/aws/aws-sdk-go-v2/blob/main/changelog-template.json)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/config/v1.29.17...config/v1.29.18)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/config
  dependency-version: 1.29.18
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-28 10:02:27 -04:00
James Rasell
f2417ffb89 ci: Update hclogvet and correctly run across codebase. (#26362) 2025-07-28 14:15:33 +01:00
James Rasell
20251b675d Add CLI and API components for creating node introduction tokens via ACL endpoint. (#26332) 2025-07-25 13:28:45 +01:00
Tim Gross
26554e544e scheduler: move result mutation into computeUpdates (#26336)
The `computeUpdates` method returns 4 different values, some of which are just
different shapes of the same data and only ever get used to be applied to the
result in the caller. Move the mutation of the result into `computeUpdates` to
match the work done in #26325. Clean up the return signature so that only slices
we need downstream are returned, and fix the incorrect docstring.

Also fix a silent bug where the `inplace` set includes the original alloc and
not the updated version. This has no functional change because all existing
callers only ever look at the length of this slice, but it will prevent future
bugs if that ever changes.

Ref: https://github.com/hashicorp/nomad/pull/26325
Ref: https://hashicorp.atlassian.net/browse/NMD-819
2025-07-25 08:21:37 -04:00
James Rasell
5989d5862a ci: Update golangci-lint to v2 and fix highlighted issues. (#26334) 2025-07-25 10:44:08 +01:00
James Rasell
842f316615 Merge branch 'main' into f-NMD-763-introduction 2025-07-25 08:27:53 +01:00
James Rasell
2ef837f02f cli: Ensure all no argument console messages are the same. (#26331)
Use a constant to ensure consistency across the CLI when displaying
a console message indicating the command was passed arguments when
it takes none.
2025-07-25 07:05:10 +01:00
Aimee Ukasick
ccaa3b7325 add table to service.port entry (#26344) 2025-07-24 14:00:05 -05:00
Tim Gross
b91d1726ce docs: clarify namespace support in autoscaler (#26337)
The current autoscaler docs imply that it has minimal or non-working support
for Nomad namespaces. In fact, namespace support works fine; it just doesn't
allow configuring multiple namespaces without using a wildcard (for now). Make
this clearer and fix the reference to the configuration "below", which is no
longer on that same page.

Ref: https://github.com/hashicorp/nomad-autoscaler/issues/65
2025-07-24 12:16:24 -04:00
Aimee Ukasick
55926afe11 Docs: Clarify service.connect examples (#26330)
* Docs: CE-997 clarify connect examples

* fix DSN typos

* CE-996 clarify agent config consul.client_auto_join

* add (formerly Consul Connect)

* remove 'Nomad and Consul are
2025-07-24 10:59:03 -05:00
Tim Gross
2c4be7fc2e Reconciler mutation improvements (#26325)
Refactors of the `computeGroup` code in the reconciler to make understanding its
mutations more manageable. Some of this work makes mutation more consistent but
more importantly it's intended to make it readily _detectable_ while still being
readable. Includes:

* In the `computeCanaries` function, we mutate the dstate and the result and
  then the return values are used to further mutate the result in the
  caller. Move all this mutation into the function.

* In the `computeMigrations` function, we mutate the result and then the return
  values are used to further mutate the result in the caller. Move all this
  mutation into the function.

* In the `cancelUnneededCanaries` function, we mutate the result and then the
  return values are used to further mutate the result in the caller. Move all
  this mutation into the function, and annotate which `allocSet`s are mutated by
  taking a pointer to the set.

* The `createRescheduleLaterEvals` function currently mutates the results and
  returns updates to mutate the results in the caller. Move all this mutation
  into the function to help clean up `computeGroup`.

* Extract `computeReconnecting` method from `computeGroup`. There's some tangled
  logic in `computeGroup` for determining changes to make for reconnecting
  allocations. Pull this out into its own function. Annotate mutability in the
  function by passing pointers to `allocSet` where needed, and mutate the result
  to update counts. Rename the old `computeReconnecting` method to
  `appendReconnectingUpdates` to mirror the naming of the similar logic for
  disconnects.

* Extract `computeDisconnecting` method from `computeGroup`. There's some
  tangled logic in `computeGroup` for determining changes to make for
  disconnected allocations. Pull this out into its own function. Annotate
  mutability in the function by passing pointers to `allocSet` where needed, and
  mutate the result to update counts.

* The `appendUnknownDisconnectingUpdates` method used to create updates for
  disconnected allocations mutates one of its `allocSet` arguments to change the
  allocations that the reschedule now set points to. Pull this update out into
  the caller.

* A handful of small docstring and helper function fixes


Ref: https://hashicorp.atlassian.net/browse/NMD-819
2025-07-24 08:33:49 -04:00
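The "annotate mutability by taking a pointer" convention from the commit above can be sketched as follows. The types and functions here are simplified, hypothetical versions; only the name `allocSet` and its description as a map of allocation IDs to allocs come from the commit log.

```go
package main

// alloc is a hypothetical, stripped-down allocation.
type alloc struct {
	ID           string
	Reconnecting bool
}

// allocSet maps allocation IDs to their allocs.
type allocSet map[string]*alloc

// filterReconnecting takes the set by value and only reads it, returning
// a new set. The value receiver signals "not mutated".
func (s allocSet) filterReconnecting() allocSet {
	out := make(allocSet)
	for id, a := range s {
		if a.Reconnecting {
			out[id] = a
		}
	}
	return out
}

// removeReconnecting takes a pointer purely as an annotation that the set
// is mutated. Go maps are already reference types, so the pointer is for
// the reader, not the compiler.
func removeReconnecting(s *allocSet) int {
	removed := 0
	for id, a := range *s {
		if a.Reconnecting {
			delete(*s, id) // deleting during range is safe in Go
			removed++
		}
	}
	return removed
}
```

At call sites, `removeReconnecting(&set)` stands out from `set.filterReconnecting()`, which makes mutation easy to spot during review.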
James Rasell
62f1dbebfb server: Add RPC and HTTP functionality for node intro token gen. (#26320)
The node introduction workflow will utilise JWTs that can be used
as authentication tokens on initial client registration. This
change implements the basic builder for this JWT claim type and
the RPC and HTTP handler functionality that will expose this to
the operator.
2025-07-23 14:32:26 +01:00
Tim Gross
e675491eb6 refactor uses of allocSet in reconciler (#26324)
The reconciler contains a large set of methods and functions that operate on
`allocSet` (a map of allocation IDs to their allocs). Update these so that they
are consistently methods that are documented to not consume the `allocSet`. This
sets the stage for further improvements around mutability in the reconciler.

This changeset also includes a few related refactors:
* Use the `allocSet` alias in every location it's relevant in the reconciler,
  for consistency and clarity.
* Move the filter functions and related helpers in the `allocs.go` file into the
  `filters.go` file.
* Update the method receiver on `allocSet` to match everywhere and generally
  improve the docstrings on the filter functions.

Ref: https://hashicorp.atlassian.net/browse/NMD-819
2025-07-23 08:57:41 -04:00
Jeff Boruszak
61cb8f6f10 Merge pull request #26270 from hashicorp/docs/redirects-for-versioning
docs: Versioned redirect logic
2025-07-22 14:14:23 -07:00
Aimee Ukasick
e6d63faf58 Fix typo (#26319) 2025-07-22 09:53:31 -05:00
James Rasell
7466dd71b2 server: Add new server.client_introduction config block. (#26315)
The new configuration block exposes some key options which allow
cluster administrators to control certain client introduction
behaviours.

This change introduces the new block and plumbing, so that it is
exposed in the Nomad server for consumption via internal processes.
2025-07-22 08:50:19 +01:00
Michael Smithhisler
36b4aa79df docs: fix link to nomad schedulers (#26302) 2025-07-21 08:53:29 -05:00
dependabot[bot]
66c22971b0 chore(deps): bump github.com/klauspost/cpuid/v2 from 2.2.11 to 2.3.0 (#26305) 2025-07-21 13:20:34 +01:00
dependabot[bot]
c6584e241c chore(deps): bump github.com/docker/cli (#26307) 2025-07-21 12:31:50 +01:00