27394 Commits

Author SHA1 Message Date
James Rasell
9e893ef2ad e2e: Add Client Intro test framework and initial test. (#26639)
The new client intro test mimics the Consul and Vault compat tests
and uses local agents to perform the required setup. This method
gives us the flexibility, moving forward, to test with enforcement
mode set to strict.

The test suite will now be triggered from the test-e2e CI run
and can also be called by a make target.
2025-08-28 09:53:07 +01:00
James Rasell
9d1d5f2f03 csi: Correctly sort IDs when listing controller plugin clients. (#26640) 2025-08-28 08:05:58 +01:00
Michael Smithhisler
485356c3d3 csi: fix volume registration error (#26642) 2025-08-27 15:00:16 -04:00
Tim Gross
5f34867420 build: fix copywrite configuration file syntax (#26644)
Because the Enterprise code has a set of copywrite exclusion entries below the
one listed here in CE, we need to make sure that the last CE line in the
configuration file ends in a comma.
2025-08-27 14:15:24 -04:00
Chris Roberts
fd1e40537c [artifact] add artifact inspection after download (#26608)
This adds artifact inspection after download to detect any issues
with the content fetched. Currently this means checking for any
symlinks within the artifact that resolve outside the task or
allocation directories. On platforms where lockdown is available
(some Linux) this inspection is not performed.

The inspection can be disabled with the DisableArtifactInspection
option. A dedicated option for disabling this behavior allows
the DisableFilesystemIsolation option to be enabled but still
have artifacts inspected after download.
2025-08-27 10:37:34 -07:00
James Rasell
e5eb125264 agent: Ensure node identity renew handler decodes the request body. (#26638)
The HTTP request body contains the node ID where the request should
be routed and without decoding this, we cannot route to anything
other than local nodes.
2025-08-27 14:06:12 +01:00
James Rasell
dcfcbc8f16 ci: Enable SA5008 linting and fix discovered error. (#26633) 2025-08-27 09:24:50 +01:00
Chris Roberts
4b9597a31d [agent] Fix error checking within retry join (#26434)
The `RetryJoin` function checks for an error and logs it before
retrying. The error variables were shadowed which resulted in
the errors never being logged. This predefines the variables
to prevent them from being shadowed.

The testlog package was also updated to support providing a custom
writer which allows logging output to be easily caught and inspected.
2025-08-26 14:18:12 -07:00
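The shadowing pattern described in 4b9597a31d is easy to reproduce in isolation. The sketch below is illustrative only: `discover`, `shadowed`, and `predeclared` are hypothetical names, not the `RetryJoin` code itself.

```go
package main

import "errors"

// discover is a hypothetical join attempt that always fails.
func discover() ([]string, error) {
	return nil, errors.New("no servers found")
}

// shadowed mirrors the bug: := inside the loop declares a new, loop-scoped
// err, so the outer err is never assigned and stays nil.
func shadowed(attempts int) error {
	var err error
	for i := 0; i < attempts; i++ {
		addrs, err := discover() // shadows the outer err
		if err == nil && len(addrs) > 0 {
			return nil
		}
	}
	return err // always nil, the failure is lost
}

// predeclared mirrors the fix: the variables are declared up front and
// assigned with =, so the last error survives the loop.
func predeclared(attempts int) error {
	var (
		addrs []string
		err   error
	)
	for i := 0; i < attempts; i++ {
		addrs, err = discover()
		if err == nil && len(addrs) > 0 {
			return nil
		}
	}
	return err
}
```

With the failing `discover` above, `shadowed(3)` returns nil while `predeclared(3)` returns the "no servers found" error, which is the difference between a silent retry loop and one that can log its failures.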
James Rasell
71e66231f9 docs: Add node identity and introduction CLI, API, and config docs (#26516)
Co-authored-by: Aimee Ukasick <Aimee.Ukasick@ibm.com>
2025-08-26 15:26:00 +01:00
James Rasell
d0ffb31fea e2e: Add Client Identity get and renew tests. (#26632) 2025-08-26 13:49:06 +01:00
Allison Larson
3fff1aa3cc Support IMDSv2 on windows e2e runners (#26629) 2025-08-25 15:37:50 -07:00
Leah Bush
36d423ceda Merge pull request #26580 from hashicorp/leah/feat/upgrade-node
feat: upgrade node version to v22
2025-08-25 10:02:30 -05:00
Aimee Ukasick
bb7114e518 Docs Chore: Add release notes for 1.10.1-1.10.3 (#26593)
* add 1.10.3

* add 1.10.2

* Add 1.10.1 release notes; add partials to share

* address feedback
2025-08-25 09:38:15 -05:00
dependabot[bot]
6e44a80df0 chore(deps): bump google.golang.org/protobuf from 1.36.7 to 1.36.8 (#26614)
Bumps google.golang.org/protobuf from 1.36.7 to 1.36.8.

---
updated-dependencies:
- dependency-name: google.golang.org/protobuf
  dependency-version: 1.36.8
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-25 09:26:50 -04:00
dependabot[bot]
ac4ac733dd chore(deps): bump github.com/zclconf/go-cty from 1.16.3 to 1.16.4 (#26612)
Bumps [github.com/zclconf/go-cty](https://github.com/zclconf/go-cty) from 1.16.3 to 1.16.4.
- [Release notes](https://github.com/zclconf/go-cty/releases)
- [Changelog](https://github.com/zclconf/go-cty/blob/main/CHANGELOG.md)
- [Commits](https://github.com/zclconf/go-cty/compare/v1.16.3...v1.16.4)

---
updated-dependencies:
- dependency-name: github.com/zclconf/go-cty
  dependency-version: 1.16.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-25 09:22:25 -04:00
dependabot[bot]
9a811a4762 chore(deps): bump github.com/hashicorp/cap from 0.9.0 to 0.10.0 (#26611)
Bumps [github.com/hashicorp/cap](https://github.com/hashicorp/cap) from 0.9.0 to 0.10.0.
- [Release notes](https://github.com/hashicorp/cap/releases)
- [Changelog](https://github.com/hashicorp/cap/blob/main/CHANGELOG.md)
- [Commits](https://github.com/hashicorp/cap/compare/v0.9.0...v0.10.0)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/cap
  dependency-version: 0.10.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-25 09:20:00 -04:00
dependabot[bot]
e41c5094e0 chore(deps): bump github.com/aws/aws-sdk-go-v2/config (#26610)
Bumps [github.com/aws/aws-sdk-go-v2/config](https://github.com/aws/aws-sdk-go-v2) from 1.31.0 to 1.31.2.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Changelog](https://github.com/aws/aws-sdk-go-v2/blob/main/changelog-template.json)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/v1.31.0...config/v1.31.2)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/config
  dependency-version: 1.31.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-25 09:18:02 -04:00
Chris Roberts
33a72c2d01 [landlock] Allow read access for random content (#26510)
When attempting to clone a git repository within a sandbox that is
configured with landlock, the clone will fail with error messages
related to inability to get random bytes for a temporary file.
Including a read rule for `/dev/urandom` resolves the error
and the git clone works as expected.
2025-08-22 14:04:55 -07:00
Tim Gross
767683ce3e E2E: allow setting instance_type variable (#26607)
When we refactored the E2E provisioning to allow it to be reused by the upgrade
testing, we didn't thread the `instance_type` variable from the main module down
into the `provision-infra` module. This prevents you from setting a custom
instance size when deploying the E2E cluster manually.
2025-08-22 15:22:10 -04:00
Allison Larson
f6a078c7e5 Disable IMDSv2 on windows test instances (#26606) 2025-08-21 16:29:35 -07:00
Juana De La Cuesta
e7868639d6 func: add the correct value for customer feedback on var error (#26601) 2025-08-21 15:37:53 +02:00
Michael Smithhisler
da4cf07ff4 logs: skip logging SIGPIPE signal (#26582) 2025-08-21 09:08:49 -04:00
dependabot[bot]
d8342aed76 chore(deps): bump golang.org/x/mod from 0.26.0 to 0.27.0 (#26536) 2025-08-21 11:02:54 +00:00
dependabot[bot]
ed967892f2 chore(deps): bump github.com/aws/aws-sdk-go-v2/config (#26537) 2025-08-21 10:47:57 +00:00
Allison Larson
694e0ac2e3 Require IMDSv2 for e2e EC2 instances (#26585)
Re-enables this now that go-discover is updated in all the right places.
2025-08-20 14:47:43 -07:00
Alexey Kulakov
919e5c2aa4 feat(ui): yarn -> pnpm (#26309) 2025-08-20 13:01:22 -07:00
Michael Schurter
ee5059a6a7 docs: revert to labels={"foo.bar": "baz"} style (#26535)
* docs: revert to labels={"foo.bar": "baz"} style

Back in #24074 I thought it was necessary to wrap labels in a list to
support quoted keys in hcl2. This... doesn't appear to be true at all?
The simpler `labels={...}` syntax appears to work just fine.

I updated the docs and a test (and modernized it a bit). I also switched
some other examples to the `labels = {}` format from the old `labels{}`
format.

* copywronged

* fmtd
2025-08-20 09:26:42 -07:00
dependabot[bot]
d03670fa37 chore(deps): bump golang.org/x/crypto from 0.40.0 to 0.41.0 (#26538)
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.40.0 to 0.41.0.
- [Commits](https://github.com/golang/crypto/compare/v0.40.0...v0.41.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-version: 0.41.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-20 18:10:13 +02:00
dependabot[bot]
d731ab7728 chore(deps): bump github.com/aws/aws-sdk-go-v2/feature/ec2/imds (#26540)
Bumps [github.com/aws/aws-sdk-go-v2/feature/ec2/imds](https://github.com/aws/aws-sdk-go-v2) from 1.18.2 to 1.18.3.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Changelog](https://github.com/aws/aws-sdk-go-v2/blob/config/v1.18.3/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/config/v1.18.2...config/v1.18.3)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/feature/ec2/imds
  dependency-version: 1.18.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-20 16:48:39 +02:00
dependabot[bot]
24d9802344 chore(deps): bump github.com/docker/go-connections from 0.5.0 to 0.6.0 (#26539)
Bumps [github.com/docker/go-connections](https://github.com/docker/go-connections) from 0.5.0 to 0.6.0.
- [Commits](https://github.com/docker/go-connections/compare/v0.5.0...v0.6.0)

---
updated-dependencies:
- dependency-name: github.com/docker/go-connections
  dependency-version: 0.6.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-20 15:24:17 +02:00
Tim Gross
80ddb7392a scheduler: fix debug-level logging for node reconciler (#26583)
In #26169 we started emitting structured logs from the reconciler. But the node
reconciler results are `AllocTuple` structs and not counts, so the information
we put in the logs ends up being pointer addresses in hex. Fix this so that
we're recording the number of allocs in each bucket instead.

Fix another misleading log-line while we're here.

Ref: https://github.com/hashicorp/nomad/pull/26169
2025-08-19 15:17:17 -04:00
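The logging issue in 80ddb7392a follows from how Go formats slices of struct pointers: each element prints as an address rather than as its contents. A minimal sketch of logging counts instead, assuming hypothetical bucket names and a stand-in `allocTuple` type rather than the scheduler's real `AllocTuple`:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// allocTuple is a hypothetical stand-in for the scheduler's AllocTuple.
type allocTuple struct {
	Name string
}

// logLine renders the number of allocs in each bucket, in sorted bucket
// order, instead of formatting the pointer slices themselves.
func logLine(buckets map[string][]*allocTuple) string {
	names := make([]string, 0, len(buckets))
	for name := range buckets {
		names = append(names, name)
	}
	sort.Strings(names)

	parts := make([]string, 0, len(names))
	for _, name := range names {
		parts = append(parts, fmt.Sprintf("%s=%d", name, len(buckets[name])))
	}
	return strings.Join(parts, " ")
}
```

By contrast, passing a `[]*allocTuple` directly to a `%v` verb yields a bracketed list of hex addresses, which matches the symptom the commit describes.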
Leah Bush
07fae8440a feat: upgrade node version to v22 2025-08-19 11:29:56 -05:00
Daniel Bennett
8675fba382 e2e: install exec2 driver v0.1.0 (#26578)
for auto-unveil of NOMAD_SECRETS_DIR
following f3e08d8aa9
2025-08-19 11:28:57 -04:00
Aimee Ukasick
c17b15f8d0 change overview pages usage to use plaintext code block (#26575) 2025-08-19 09:47:37 -05:00
James Rasell
d439395b14 admin: Add Zed project settings dir to gitignore file. (#26567) 2025-08-19 15:16:46 +01:00
James Rasell
ad7d1dc094 fsm: Fix weird service registration persist loop. (#26566) 2025-08-19 14:24:03 +01:00
James Rasell
27daa745e4 namespace: Avoid potential panic when logging namespace name. (#26565)
When the namespace was not found in state, indicated by a nil
object, we were using the name field of the nil object for the
return error.

This code path does not currently get triggered as the call flow
ensures the namespace will always be found within state. Making
this change makes sure we do not hit this panic in the future.
2025-08-19 13:51:12 +01:00
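The defensive pattern behind 27daa745e4 can be sketched as follows, assuming a hypothetical `namespace` type and a map standing in for the state store. The point is that the error is built from the requested name, never from the nil lookup result.

```go
package main

import "fmt"

// namespace is a hypothetical stand-in for the stored namespace object.
type namespace struct {
	Name string
}

// checkNamespace returns an error when the requested namespace is absent.
func checkNamespace(state map[string]*namespace, requested string) error {
	ns := state[requested]
	if ns == nil {
		// Reading ns.Name here would dereference a nil pointer and panic,
		// so the requested name is used instead.
		return fmt.Errorf("namespace %q not found", requested)
	}
	return nil
}
```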
Tim Gross
b8b95eb918 docs: warn against enabling Prometheus metrics if not in use (#26560)
The go-metrics library retains Prometheus metrics in memory until expiration,
but the expiration logic requires that the metrics are being regularly
scraped. If you don't have a Prometheus server scraping, this leads to
ever-increasing memory usage. In particular, high volume dispatch workloads emit
a large set of label values and if these are not eventually aged out the bulk of
Nomad server memory can end up consumed by metrics.
2025-08-19 08:44:16 -04:00
James Rasell
d13abe376c test: Deflake client agent endpoint test. (#26563) 2025-08-19 08:57:55 +01:00
James Rasell
3b0b7db1a1 client: Add client identity API, CLI, and RPC workflow. (#26543)
The Nomad clients store their Nomad identity in memory and within
their state store. While active, it is not possible to dump the
state to view the stored identity token, so having a way to view
the current claims while running aids debugging and operations.

This change adds a client identity workflow, allowing operators
to view the current claims of the node's identity. It does not
return any of the signing key material.
2025-08-19 08:25:51 +01:00
Daniel Bennett
f3e08d8aa9 e2e: exec2: envoy binary version and tidying (#26558)
* e2e: update standalone envoy binary version

fix for:

> === FAIL: e2e/exec2 TestExec2/testCountdash (21.25s)
>     exec2_test.go:71:
> ...
> [warning][config] [./source/extensions/config_subscription/grpc/grpc_stream.h:155] DeltaAggregatedResources gRPC config stream to local_agent closed: 3, Envoy 1.29.4 is too old and is not supported by Consul

there's also this warning, but it doesn't seem so fatal:

> [warning][main] [source/server/server.cc:910] There is no configured limit to the number of allowed active downstream connections. Configure a limit in `envoy.resource_monitors.downstream_connections` resource monitor.

picked latest supported from latest consul (1.21.4):

```
$ curl -s localhost:8500/v1/agent/self | jq .xDS.SupportedProxies
{
  "envoy": [
    "1.34.1",
    "1.33.2",
    "1.32.5",
    "1.31.8"
  ]
}
```

* e2e: exec2: remove extraneous bits

 * reschedule: no reschedule for batch jobs
 * unveil: nomad paths get auto-unveiled with unveil_defaults
   https://github.com/hashicorp/nomad-driver-exec2/blob/v0.1.0/plugin/driver.go#L514-L522
2025-08-18 14:58:00 -04:00
Piotr Kazmierczak
e86d815472 scheduler: avoid importing the Planner test harness in scheduler calls (#26544)
For a while now, we've had only 2 implementations of the Planner interface in
Nomad: one was the Worker, and the other was the scheduler test harness, which
was then used as argument to the scheduler constructors in FSM and job endpoint
RPC. That's not great, and one of the recent refactors made it apparent that
we're importing testing code in places we really shouldn't. We finally got
called out for it, and this PR attempts to remedy the situation by splitting the
Harness into Plan (which contains actual plan submission logic) and separating
it from testing code.
2025-08-18 19:35:34 +02:00
Daniel Bennett
fdd46e6fd3 docs: cni: add tproxy conflist example (#26532) 2025-08-18 12:04:34 -04:00
Aimee Ukasick
52b8deeb3b Docs: Add 1.10.4 release notes (#26524)
* 1.10.4 release notes

* update node version in package.json so Vercel builds

* revert node version

* address feedback; add missing "-" to debug parms
2025-08-18 11:04:06 -05:00
Austin Workman
26f02c25c6 docs: Update virt install.mdx (#26531)
Fixing plugin name in nomad client plugin config example.
2025-08-18 10:58:15 -05:00
Deniz Onur Duzgun
1f7e8cdda3 deps: bump go-getter to v1.7.9 (#26533)
* deps: bump go-getter to v1.7.9

* add changelog

* update changelog
2025-08-18 10:48:21 -04:00
Daniel Bennett
2c699b9794 sysbatch: fix panic from reschedule block (#26534)
* fix panic from nil ReschedulePolicy

commit 279775082c (pr #26279)
intended to return an error for sysbatch jobs with a reschedule block,
but in bypassing populating the `ReschedulePolicy`'s pointer fields,
a nil pointer panic occurred before the job could get rejected
with the intended error.

in particular, in `command/agent/job_endpoint.go`, `func ApiTgToStructsTG`,

```
if taskGroup.ReschedulePolicy != nil {
	tg.ReschedulePolicy = &structs.ReschedulePolicy{
		Attempts:      *taskGroup.ReschedulePolicy.Attempts,
		Interval:      *taskGroup.ReschedulePolicy.Interval,
```

`*taskGroup.ReschedulePolicy.Interval` was a nil pointer.

* fix e2e test jobs
2025-08-18 10:19:14 -04:00
James Rasell
1ae83114c1 ci: Run hclogvet across all codebase and fix found issue. (#26545) 2025-08-18 15:06:11 +01:00
Matt McQuillan
fc0265c56d admin: adjusting pr template for PCI compliance (#26541) 2025-08-18 15:01:32 +01:00
Tim Gross
d1186ae53e scheduler: don't suppress blocked evals on delay if previous expires (#26523)
In #8099 we fixed a bug where garbage collecting a job with
`disconnect.stop_on_client_after` would spawn recursive delayed evals. But when
applied to disconnected allocs with `replace=true`, the fix prevents us from
emitting a blocked eval if there's no room for the replacement.

Update the guard on creating blocked evals so that, rather than checking for
`IsZero`, we check for being later than the `WaitUntil`. This separates this
guard from the logic guarding the creation of delayed evals so that we can
potentially create both when needed.

Ref: https://github.com/hashicorp/nomad/pull/8099/files#r435198418
2025-08-15 10:53:52 -04:00