27497 Commits

Author SHA1 Message Date
James Rasell
8e553ad95b build: Add tzdata to Docker container final image. (#26794)
Nomad's periodic block includes a "time_zone" parameter which lets
operators set the time zone at which the next launch interval is
checked against. For this to work, Nomad needs to use
"time.LoadLocation", which in turn can use multiple TZ data sources.

When using the Docker image to trigger Nomad job registrations, it
currently does not have access to any TZ data, meaning it is only
aware of UTC. Adding the tzdata package contents to the release
image provides the required data for this to work.

It would have also been possible to set a build tag via "-tags" when
releasing Nomad, which would embed a copy of the timezone database
in the code. We decided against using the build tag approach as it
is a subtle way that we could introduce bugs that are very
difficult to track down and we prefer the commit approach.
2025-09-19 08:55:57 +01:00
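The failure mode described in the tzdata commit above can be reproduced with a few lines of Go. This is a minimal sketch, not Nomad's code: `loadZone` is a hypothetical helper and "Europe/London" is a stand-in for a `time_zone` value. On a system with no timezone database, `time.LoadLocation` returns an error for any named zone other than UTC. The embedding alternative mentioned in the commit corresponds to Go's standard `timetzdata` build tag (or importing `time/tzdata`).

```go
package main

import (
	"fmt"
	"time"
)

// loadZone resolves an IANA zone name such as "Europe/London". It fails when
// the name is unknown or when no timezone database is available at runtime.
// (Hypothetical helper for illustration; not part of Nomad.)
func loadZone(name string) (*time.Location, error) {
	loc, err := time.LoadLocation(name)
	if err != nil {
		return nil, fmt.Errorf("loading time zone %q: %w", name, err)
	}
	return loc, nil
}

func main() {
	// Assumed zone name, standing in for a periodic block's time_zone value.
	loc, err := loadZone("Europe/London")
	if err != nil {
		// This branch is what an image without TZ data would hit.
		fmt.Println("falling back to UTC:", err)
		loc = time.UTC
	}
	fmt.Println(time.Now().In(loc).Format(time.RFC3339))
}
```

Running this inside an image with and without the tzdata package installed shows the difference the commit addresses.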
ethel-hashicorp
6ea57a589d SMRE-733: Updates post-install text to properly reflect the updated IPLA blurb (#26791) 2025-09-19 07:35:58 +01:00
Piotr Kazmierczak
f42239bf6c api: add DefaultUpdateStrategy to system jobs if missing (#26777)
From 1.11, Nomad system jobs will feature deployments, and thus jobspecs missing
an update block should be canonicalized to have one.
2025-09-18 15:21:23 +02:00
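The canonicalization described in the commit above might look roughly like the following. This is a sketch under stated assumptions: the `Job` and `UpdateStrategy` types are simplified stand-ins, and the default values are placeholders rather than Nomad's actual defaults.

```go
package main

import (
	"fmt"
	"time"
)

// UpdateStrategy is a simplified, hypothetical stand-in for a jobspec
// update block.
type UpdateStrategy struct {
	MaxParallel    int
	MinHealthyTime time.Duration
}

// Job is a simplified, hypothetical stand-in for a job definition.
type Job struct {
	Type   string
	Update *UpdateStrategy
}

// defaultUpdateStrategy returns placeholder defaults (assumed values).
func defaultUpdateStrategy() *UpdateStrategy {
	return &UpdateStrategy{MaxParallel: 1, MinHealthyTime: 10 * time.Second}
}

// Canonicalize fills in an update block for system jobs that lack one and
// leaves an explicitly configured block untouched.
func (j *Job) Canonicalize() {
	if j.Type == "system" && j.Update == nil {
		j.Update = defaultUpdateStrategy()
	}
}

func main() {
	j := &Job{Type: "system"}
	j.Canonicalize()
	fmt.Printf("%+v\n", *j.Update)
}
```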
Tim Gross
3ef25e5867 ACL: allow workload identities to list/get their own policies (#26772)
In most RPC endpoints we use the resolved ACL object to determine whether a
given auth token or identity has access to the object of interest to the
RPC. In #15870 we adjusted this across most of the RPCs to handle workload identity.

But in the ACL endpoints that read policies, we can't use the resolved ACL
object and have to go back to the original token and look up the policies it has
access to. So we need to resolve any workload-associated policies during that
lookup as well.

Fixes: https://github.com/hashicorp/nomad/issues/26764
Ref: https://hashicorp.atlassian.net/browse/NMD-990
Ref: https://github.com/hashicorp/nomad/pull/15870
2025-09-18 09:10:37 -04:00
James Rasell
a206ff3858 test: Fix test flake in client get registration token (#26796)
The test was incorrectly writing to state that registration had
been finished before writing the node identity token. This is the
opposite of what happens in the client code and caused a timing
issue which meant we read registration as completed before we had
the identity available and therefore returned the secret ID.
2025-09-18 13:56:17 +01:00
Piotr Kazmierczak
46dfd9d992 scheduler: do not create deployments for system job reschedules (#26789)
System jobs that get rescheduled should not get new deployments.
2025-09-18 14:54:54 +02:00
Tim Gross
3432b0a2d6 consul: only add fingerprint link if unique.consul.name is set (#26787)
In Nomad Enterprise we can fingerprint multiple Consul datacenters. If neither
is `"default"` then we end up with warning logs about adding a "link".

The `Link` field on the `Node` struct is a map of attributes that only
contributes to the node's computed hash. The `"consul"` key's value is derived
from the `unique.consul.name` attribute, which only exists if there's a default
Consul cluster.

Update the fingerprint to skip setting the link field if there's no
`unique.consul.name`, and lower the warning log for malformed fields to debug;
this is a minor scheduling optimization largely captured by existing Consul
fields in the node computed class. The only reason not to remove it entirely is
to avoid changing computed classes on existing large clusters.

Fixes: https://github.com/hashicorp/nomad/issues/26781
Ref: https://hashicorp.atlassian.net/browse/NMD-998
2025-09-17 13:23:01 -04:00
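The guard described in the Consul fingerprint commit above can be sketched as follows. The `Node` type and `setConsulLink` function are hypothetical simplifications; in particular, the sketch copies the attribute value verbatim, whereas the commit only says the link value is derived from it.

```go
package main

import "fmt"

// Node is a simplified, hypothetical stand-in for the node struct.
type Node struct {
	Attributes map[string]string
	Links      map[string]string
}

// setConsulLink sets the "consul" link only when unique.consul.name is
// present, and reports whether it did so. A missing attribute is skipped
// quietly instead of being treated as a warning-worthy error.
func setConsulLink(n *Node) bool {
	name, ok := n.Attributes["unique.consul.name"]
	if !ok || name == "" {
		return false
	}
	if n.Links == nil {
		n.Links = make(map[string]string)
	}
	n.Links["consul"] = name // assumption: value copied as-is
	return true
}

func main() {
	withDefault := &Node{Attributes: map[string]string{"unique.consul.name": "dc1.node1"}}
	withoutDefault := &Node{Attributes: map[string]string{}}
	fmt.Println(setConsulLink(withDefault), setConsulLink(withoutDefault))
}
```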
Jeff Boruszak
6dce21bc85 Merge pull request #26682 from hashicorp/docs/versioned-redirect-fix
docs: Versioned docs redirect fixes
2025-09-17 08:58:37 -07:00
Tim Gross
4e75e99f1a windows: use/accept platform-specific signal for stopping agent (#26780)
On Windows, the `os.Process.Signal` method returns an error when sending
`os.Interrupt` (SIGINT) because it isn't implemented. This causes test servers
in the `testutil` packages to break on Windows. Use the platform-specific
syscalls to generate the SIGINT instead.

The agent's signal handler also did not correctly handle Ctrl-C because we
were masking os.Interrupt instead of SIGINT.

Fixes: https://github.com/hashicorp/nomad/issues/26775

Co-authored-by: Chris Roberts <croberts@hashicorp.com>
2025-09-17 11:32:20 -04:00
Aimee Ukasick
fca783c566 Add 1.10.5 release notes (#26782) 2025-09-17 08:59:43 -05:00
James Rasell
ac5a77af56 docs: Add client identity HTTP API detail on api-docs page. (#26774)
Co-authored-by: Aimee Ukasick <Aimee.Ukasick@ibm.com>
2025-09-17 14:05:37 +01:00
Piotr Kazmierczak
4874622ebd e2e: test canary updates for system jobs (#26776) 2025-09-17 10:20:03 +02:00
boruszak
8ab61f37b3 Fix accidental "s 2025-09-16 14:23:59 -07:00
Michael Smithhisler
1a19a16ee9 docs: fix link in multiregion job spec page (#26755) 2025-09-16 13:00:42 -05:00
James Rasell
2abd72d433 http: Fix client identity renew call when node ID is in URI. (#26773)
When calling the client identity renew API, it is possible the
target node ID is provided by either the URI or within the request
body. This change fixes a bug where all calls using a node_id query
parameter would be rejected because decoding the empty request body
failed.

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2025-09-16 15:15:39 +01:00
Olli Janatuinen
6398ef9475 secrets: Support custom plugins in Windows (#26751)
Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
2025-09-16 09:14:50 -04:00
Daniel Bennett
f47cb5d10f e2e: adjust flaky timings (#26771)
hopefully fixes:

```
TestOversubscription/testExec:
    oversubscription_test.go:57: submitting job: "./input/exec.hcl"
    oversubscription_test.go:72:
        oversubscription_test.go:72: expected condition to pass within wait context
        ↪ error: wait: timeout exceeded: expect '31457280' in stdout, got: 'stat {...}/cat.stdout.0: no such file or directory'
```

and in separate runs,

```
TestTaskAPI/testTaskAPI_Auth:
     taskapi_test.go:85:
         taskapi_test.go:85: expected string to have suffix
         ↪ suffix: Unauthorized
         ↪ string:
```

```
TestTaskAPI/testTaskAPI_Auth:
     taskapi_test.go:85:
         taskapi_test.go:85: expected string to have suffix
         ↪ suffix: Forbidden
         ↪ string:
```
2025-09-15 15:54:53 -04:00
dependabot[bot]
ababacc9ab chore(deps): bump github.com/shoenig/test from 1.12.1 to 1.12.2 in /api (#26757)
* chore(deps): bump github.com/shoenig/test from 1.12.1 to 1.12.2 in /api

Bumps [github.com/shoenig/test](https://github.com/shoenig/test) from 1.12.1 to 1.12.2.
- [Release notes](https://github.com/shoenig/test/releases)
- [Commits](https://github.com/shoenig/test/compare/v1.12.1...v1.12.2)

---
updated-dependencies:
- dependency-name: github.com/shoenig/test
  dependency-version: 1.12.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* root dep needs to be updated too

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2025-09-15 09:06:41 -04:00
dependabot[bot]
2baeffec92 chore(deps-dev): bump prettier from 3.5.3 to 3.6.2 in /website (#26162)
Bumps [prettier](https://github.com/prettier/prettier) from 3.5.3 to 3.6.2.
- [Release notes](https://github.com/prettier/prettier/releases)
- [Changelog](https://github.com/prettier/prettier/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prettier/prettier/compare/3.5.3...3.6.2)

---
updated-dependencies:
- dependency-name: prettier
  dependency-version: 3.6.2
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-15 08:51:31 -04:00
dependabot[bot]
be1fdc0d53 chore(deps): bump golang.org/x/crypto from 0.41.0 to 0.42.0 (#26758)
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.41.0 to 0.42.0.
- [Commits](https://github.com/golang/crypto/compare/v0.41.0...v0.42.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-version: 0.42.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-15 08:48:31 -04:00
dependabot[bot]
16533b3d34 chore(deps): bump google.golang.org/grpc from 1.75.0 to 1.75.1 (#26760)
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.75.0 to 1.75.1.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.75.0...v1.75.1)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-version: 1.75.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-15 08:48:15 -04:00
dependabot[bot]
24ef9fa928 chore(deps): bump github.com/aws/aws-sdk-go-v2/feature/ec2/imds (#26762)
Bumps [github.com/aws/aws-sdk-go-v2/feature/ec2/imds](https://github.com/aws/aws-sdk-go-v2) from 1.18.6 to 1.18.7.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Changelog](https://github.com/aws/aws-sdk-go-v2/blob/config/v1.18.7/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/config/v1.18.6...config/v1.18.7)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/feature/ec2/imds
  dependency-version: 1.18.7
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-15 08:16:49 -04:00
dependabot[bot]
5d0d5d2b22 chore(deps): bump github.com/zclconf/go-cty from 1.16.4 to 1.17.0 (#26761)
Bumps [github.com/zclconf/go-cty](https://github.com/zclconf/go-cty) from 1.16.4 to 1.17.0.
- [Release notes](https://github.com/zclconf/go-cty/releases)
- [Changelog](https://github.com/zclconf/go-cty/blob/main/CHANGELOG.md)
- [Commits](https://github.com/zclconf/go-cty/compare/v1.16.4...v1.17.0)

---
updated-dependencies:
- dependency-name: github.com/zclconf/go-cty
  dependency-version: 1.17.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-15 08:16:37 -04:00
dependabot[bot]
da9a25d77d chore(deps): bump golang.org/x/time from 0.12.0 to 0.13.0 (#26759)
Bumps [golang.org/x/time](https://github.com/golang/time) from 0.12.0 to 0.13.0.
- [Commits](https://github.com/golang/time/compare/v0.12.0...v0.13.0)

---
updated-dependencies:
- dependency-name: golang.org/x/time
  dependency-version: 0.13.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-15 08:16:12 -04:00
James Rasell
a7db1b42b8 acl: Migrate all tests from testify to must. (#26704) 2025-09-15 08:21:49 +01:00
Chris Roberts
10be73c081 ci: fix github to jira issue sync (#26747)
Add local actions for JIRA interactions to replace github actions
that have been archived.
2025-09-12 13:40:11 -07:00
Tim Gross
ac86225e09 metrics: reduce heap usage of eval broker metrics (#26737)
The metrics on the eval broker include labels for the job ID, but under a high
volume of dispatch workloads, this results in excessive heap usage on the
leader. Dispatch workloads should use their parent ID rather than their child ID
for any metrics we collect.

Also, eliminate an extra copy of the labels, and remove the extremely
high-cardinality `"eval_id"` label from the `nomad.broker.eval_waiting` metric.

Fixes: https://github.com/hashicorp/nomad/issues/26657
2025-09-12 08:29:46 -04:00
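The label choice described in the eval broker metrics commit above can be illustrated with a small sketch. The `Job` type and `metricJobID` function are hypothetical; the point is only that labelling by parent ID collapses many dispatched children into a single label value.

```go
package main

import "fmt"

// Job is a simplified, hypothetical stand-in for a job.
type Job struct {
	ID         string
	ParentID   string
	Dispatched bool
}

// metricJobID returns the ID to use as a metric label: the parent ID for
// dispatched children, the job's own ID otherwise.
func metricJobID(j Job) string {
	if j.Dispatched && j.ParentID != "" {
		return j.ParentID
	}
	return j.ID
}

// distinctLabels counts how many different label values a set of jobs
// would produce.
func distinctLabels(jobs []Job) int {
	seen := make(map[string]struct{})
	for _, j := range jobs {
		seen[metricJobID(j)] = struct{}{}
	}
	return len(seen)
}

func main() {
	var jobs []Job
	for i := 0; i < 1000; i++ {
		jobs = append(jobs, Job{
			ID:         fmt.Sprintf("batch/dispatch-%d", i),
			ParentID:   "batch",
			Dispatched: true,
		})
	}
	fmt.Println("distinct job labels:", distinctLabels(jobs))
}
```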
Michael Smithhisler
c20f854d16 client: set network status on tasks when restoring allocations (#26699)
The allocation network hook was not properly restoring network status from state when the network had previously been set up. This led to missing environment variables and a misconfigured hosts file and resolv.conf when a task was restarted after the Nomad agent had restarted.
---------

Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
2025-09-11 13:10:21 -04:00
Chris Roberts
8b51acf259 [artifact] fix path within check on trimmed target (#26748)
When checking if the target path is within the root path, the
target path is trimmed and then file information is fetched. If
the trimmed path does not exist, then the full target path is
not within the root. In the case of receiving a not exist error,
simply return false.
2025-09-11 08:59:18 -07:00
Piotr Kazmierczak
8eb72b2868 Post 1.10.5 release (#26749)
* Generate files for 1.10.5 release

* Prepare for next release

---------

Co-authored-by: hc-github-team-nomad-core <github-team-nomad-core@hashicorp.com>
2025-09-11 14:49:12 +02:00
hc-github-team-nomad-core
4c0e5b286b Prepare for next release 2025-09-11 10:20:15 +02:00
hc-github-team-nomad-core
f9bce13f8c Generate files for 1.10.5 release 2025-09-11 10:20:15 +02:00
Michael Smithhisler
f58e915bd3 scheduler: allow device count to use different vendors/models (#26649)
A small optimization in the scheduler required users to specify specific
models of devices if the required count was higher than the number of devices
of any individual model/vendor on the node. This change removes that optimization to allow
for more intuitive device scheduling when different vendor/model device
types exist on a node.
2025-09-10 07:12:38 -04:00
tehut
68d767654a ci: remove mkdir from action for release runners (#26743) 2025-09-10 09:13:49 +02:00
tehut
bfd64b5f98 build:replicate nomad-enterprise 557e533 (#26741) 2025-09-09 17:02:08 -07:00
Tim Gross
75774711f0 eliminate dead Vault-related code from nomad/structs (#26736)
When we removed the legacy Vault token workflow, we left behind a few bits of
code that only served that workflow. Remove the dead code.
2025-09-09 12:12:57 -04:00
Michael Smithhisler
37da98be1c Merge pull request #26681 from hashicorp/NMD-760-nomad-secrets-block
Secrets Block: merge feature branch to main
2025-09-09 10:46:18 -04:00
Tim Gross
0b69999698 Revert go-getter update (#26731)
The `go-getter` update in https://github.com/hashicorp/nomad/pull/26713 is not passing tests upstream (apparently https://github.com/hashicorp/go-getter/pull/548 is the origin of the problem but that PR did not ever run tests). The issue being fixed isn't a critical vulnerability, so in the interest of preparing us for the next release, revert the `go-getter` change but keep the Go toolchain update.

We'll skip go-getter 1.8.0 and pick up the next patch version once its issues are fixed.
Reverts commit 8a96929870.
2025-09-09 09:28:08 -04:00
Daniel Bennett
cb3e49f3e4 e2e: shorten restart delay in docker registry task (#26729)
tests that use this local docker registry (docker and podman tests)
occasionally flake, I think due to the timeout being reached,
despite passing after a restart.

> jobs3.go:658: tg 'create-files' task 'create-auth-file' event: Task received by client
> jobs3.go:658: tg 'create-files' task 'create-auth-file' event: Building Task Directory
> jobs3.go:658: tg 'create-files' task 'create-auth-file' event: Task started by client
> jobs3.go:658: tg 'create-files' task 'create-auth-file' event: Exit Code: 1
> jobs3.go:658: tg 'create-files' task 'create-auth-file' event: Task restarting in 16.212149445s
> jobs3.go:658: tg 'create-files' task 'create-auth-file' event: Task started by client
> jobs3.go:658: tg 'create-files' task 'create-auth-file' event: Exit Code: 0

setting the delay lower will (hopefully) keep within the job timeout.

I'm not sure why the `pledge` task apparently flakes like this;
I could find no useful info in the logs.
2025-09-08 15:21:08 -04:00
Tim Gross
db8ecac20d docs: include Consul namespace claim mapping in auth config example (#26730)
When configuring Nomad Enterprise with Consul Enterprise and multiple
namespaces, you need to include the `consul_namespace` mapping in the auth
method configuration. Otherwise you'll see an error like "unknown variable
accessed: value.consul_namespace". There's no example of the updated auth method
configuration you need, which makes this detail unclear when we're showing the
claim being used in the following `consul acl auth-method create` command.
2025-09-08 15:15:47 -04:00
dependabot[bot]
e8d5cfb77d chore(deps): bump github.com/hashicorp/go-plugin from 1.6.3 to 1.7.0 (#26716)
Bumps [github.com/hashicorp/go-plugin](https://github.com/hashicorp/go-plugin) from 1.6.3 to 1.7.0.
- [Release notes](https://github.com/hashicorp/go-plugin/releases)
- [Changelog](https://github.com/hashicorp/go-plugin/blob/main/CHANGELOG.md)
- [Commits](https://github.com/hashicorp/go-plugin/compare/v1.6.3...v1.7.0)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/go-plugin
  dependency-version: 1.7.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-08 18:44:51 +02:00
dependabot[bot]
7ccd017bc8 chore(deps): bump github.com/prometheus/common from 0.65.0 to 0.66.1 (#26717)
Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.65.0 to 0.66.1.
- [Release notes](https://github.com/prometheus/common/releases)
- [Changelog](https://github.com/prometheus/common/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prometheus/common/compare/v0.65.0...v0.66.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/common
  dependency-version: 0.66.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-08 18:08:30 +02:00
dependabot[bot]
1498ec6c2e chore(deps): bump go.etcd.io/bbolt from 1.4.2 to 1.4.3 (#26720)
Bumps [go.etcd.io/bbolt](https://github.com/etcd-io/bbolt) from 1.4.2 to 1.4.3.
- [Release notes](https://github.com/etcd-io/bbolt/releases)
- [Commits](https://github.com/etcd-io/bbolt/compare/v1.4.2...v1.4.3)

---
updated-dependencies:
- dependency-name: go.etcd.io/bbolt
  dependency-version: 1.4.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-08 18:07:37 +02:00
Tim Gross
f86a141026 scheduler: don't sort reserved port ranges before adding to bitmap (#26712)
During a large volume dispatch load test, I discovered that a lot of the total
scheduling time is being spent calling `structs.ParsePortRanges` repeatedly, in
order to parse the reserved ports configuration of the node (ex. converting
`"80,8000-8001"` to `[]int{80, 8000, 8001}`). A close examination of the
profiles shows that the bulk of the time is being spent hashing the keys for the
map of ports we use for de-duplication, and then sorting the resulting slice.

The `(*NetworkIndex) SetNode` method that calls the offending `ParsePortRanges`
merges all the ports into the `UsedPorts` map of bitmaps at scheduling
time, which means the consumer of the slice is already de-duplicating and
doesn't care about the order. The only other caller of `ParsePortRanges` is the
configuration file validation, and that throws away the slice entirely.

By skipping de-duplication and not sorting, we can cut down the runtime of this
function by 30x and memory usage by 3x.

Ref: https://github.com/hashicorp/nomad/blob/v1.10.4/nomad/structs/network.go#L201
Fixes: https://github.com/hashicorp/nomad/issues/26654
2025-09-08 12:05:21 -04:00
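The idea in the port range commit above, parsing without de-duplicating or sorting and letting a bitmap consumer absorb duplicates, can be sketched as follows. This is not the actual `structs.ParsePortRanges` implementation: `parsePortRanges`, `parsePort`, and `portBitmap` are hypothetical names, and the bitmap is a plain fixed-size array rather than Nomad's type.

```go
package main

import (
	"fmt"
	"math/bits"
	"strconv"
	"strings"
)

// parsePortRanges expands a spec like "80,8000-8001" into individual ports.
// It keeps duplicates and input order; no map and no sort are involved.
func parsePortRanges(spec string) ([]int, error) {
	var ports []int
	for _, part := range strings.Split(spec, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		lo, hi, isRange := strings.Cut(part, "-")
		start, err := parsePort(lo)
		if err != nil {
			return nil, err
		}
		end := start
		if isRange {
			if end, err = parsePort(hi); err != nil {
				return nil, err
			}
			if end < start {
				return nil, fmt.Errorf("invalid range %q: end before start", part)
			}
		}
		for p := start; p <= end; p++ {
			ports = append(ports, p)
		}
	}
	return ports, nil
}

// parsePort parses a single port number and checks it fits in 16 bits.
func parsePort(s string) (int, error) {
	n, err := strconv.Atoi(strings.TrimSpace(s))
	if err != nil {
		return 0, fmt.Errorf("invalid port %q: %w", s, err)
	}
	if n < 0 || n > 65535 {
		return 0, fmt.Errorf("port %d out of range", n)
	}
	return n, nil
}

// portBitmap has one bit per possible port, so setting the same port twice
// is harmless and ordering is irrelevant.
type portBitmap [65536 / 64]uint64

func (b *portBitmap) set(p int) { (*b)[p/64] |= 1 << uint(p%64) }

func (b *portBitmap) count() int {
	n := 0
	for _, w := range *b {
		n += bits.OnesCount64(w)
	}
	return n
}

func main() {
	ports, err := parsePortRanges("80,8000-8001,80")
	if err != nil {
		panic(err)
	}
	var used portBitmap
	for _, p := range ports {
		used.set(p)
	}
	fmt.Println(len(ports), "parsed,", used.count(), "distinct")
}
```

Because the bitmap already collapses repeated ports, the parser can stay a single pass over the input.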
Daniel Bennett
1f7f51ceb4 e2e: update cni plugins (#26724)
> failed to configure network: plugin type="firewall" failed (add):
> incompatible CNI versions; config is "1.0.0", plugin supports ["0.4.0"]
2025-09-08 11:52:23 -04:00
dependabot[bot]
49d451a1a3 chore(deps): bump github.com/docker/docker (#26718)
Bumps [github.com/docker/docker](https://github.com/docker/docker) from 28.3.3+incompatible to 28.4.0+incompatible.
- [Release notes](https://github.com/docker/docker/releases)
- [Commits](https://github.com/docker/docker/compare/v28.3.3...v28.4.0)

---
updated-dependencies:
- dependency-name: github.com/docker/docker
  dependency-version: 28.4.0+incompatible
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-08 17:18:18 +02:00
Deniz Onur Duzgun
8a96929870 bump: go and go-getter versions (#26713)
* bump: go and go-getter versions

* add changelog
2025-09-08 11:10:25 -04:00
dependabot[bot]
00fd92a1d4 chore(deps): bump github.com/hashicorp/cronexpr in /api (#26715)
Bumps [github.com/hashicorp/cronexpr](https://github.com/hashicorp/cronexpr) from 1.1.2 to 1.1.3.
- [Release notes](https://github.com/hashicorp/cronexpr/releases)
- [Commits](https://github.com/hashicorp/cronexpr/compare/v1.1.2...v1.1.3)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/cronexpr
  dependency-version: 1.1.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-08 17:06:16 +02:00
Michael Smithhisler
56b7a8da5c secrets: add changelog for secret block 2025-09-05 16:09:33 -04:00
Michael Smithhisler
10ed46cbd4 secrets: pass key/value config data to plugins as env (#26455)
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2025-09-05 16:08:24 -04:00