Commit Graph

26890 Commits

Author SHA1 Message Date
Michael Smithhisler
077c1921ef e2e: disable IMDSv2 in tests (#25564)
Consul needs to use a newer version of go-discover that can query IMDSv2
in order for our test infrastructure to be enabled with it.
2025-03-31 12:07:45 -04:00
Sooter Saalu
e93bda31ea Update placement.mdx (#25538)
* Update placement.mdx

Added explanations on initial and blocked evaluation for placement failures.

fixes #24824

* Update website/content/docs/concepts/scheduling/placement.mdx

Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>

* Update website/content/docs/concepts/scheduling/placement.mdx

Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>

---------

Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
2025-03-31 09:08:06 -05:00
dependabot[bot]
d4b40a8e5e chore(deps): bump github.com/hashicorp/consul/sdk from 0.16.1 to 0.16.2 (#25549) 2025-03-31 08:14:42 +00:00
dependabot[bot]
658c8f3c5a chore(deps): bump github.com/hashicorp/go-kms-wrapping/wrappers/awskms/v2 (#25551) 2025-03-31 08:14:05 +00:00
dependabot[bot]
05ae690e6c chore(deps): bump golang.org/x/mod from 0.23.0 to 0.24.0 (#25552) 2025-03-31 08:12:49 +00:00
dependabot[bot]
5e002a750b chore(deps): bump github.com/prometheus/client_golang (#25553) 2025-03-31 08:11:57 +00:00
dependabot[bot]
3fc2451ac2 chore(deps): bump github.com/opencontainers/image-spec (#25550) 2025-03-31 08:10:08 +00:00
Michael Smithhisler
8e3625a716 e2e: create consul policies and roles in respective namespaces (#25546) 2025-03-28 13:52:49 -04:00
James Rasell
37af365cf3 deps: Update golang.org/x/net from 0.36.0 to 0.38.0 (#25543) 2025-03-28 15:13:58 +00:00
Daniel Bennett
99c25fc635 dhv: mkdir plugin parameters: uid,guid,mode (#25533)
also remove Error logs from client rpc and promote plugin Debug logs to Error (since they have more info in them)
2025-03-28 10:13:13 -05:00
Piotr Kazmierczak
e9ebbed32c drivers: unflake TestExecutor_OOMKilled (#25521)
Every now and then TestExecutor_OOMKilled would fail with: "unable to start
container process: container init was OOM-killed (memory limit too low?)", which
started happening after we upgraded libcontainer.

This PR removes manual (and arbitrary) resource limits on the test
task, since it should be OOMd with resources inherited from the
testExecutorCommandWithChroot, and it fixes a small possible goroutine leak in
the OOM checker in exec driver.
2025-03-28 11:35:02 +01:00
Piotr Kazmierczak
a1fd9da705 e2e: require IMDSv2 for ec2 instances (#25541)
Require Instance Metadata Service v2 to access EC2 instance metadata for all VMs
that run our e2e suite.
2025-03-28 09:58:51 +01:00
James Rasell
3ab1673552 sec: Suppress GO-2025-3543 for github.com/opencontainers/runc (#25536)
The vulnerability has been withdrawn but it may be a while until
it is removed from the DB used by scanning. Suppressing this
removes the false result in scanning processes. The change should
be reverted once the DB is updated.
2025-03-27 12:58:06 +00:00
Martijn Vegter
736103aa54 client: fix JSON formatted logs when failing to reserve cores (#25523)
Fixed a bug where JSON formatted logs would not show the requested and overlapping 
cores when failing to reserve cores
2025-03-27 08:52:32 -04:00
James Rasell
601e7ad3ab job: Add migrate block detail when performing task group diff (#25528) 2025-03-27 08:04:58 +00:00
Michael Smithhisler
f0e0215d56 e2e: fix consul e2e enterprise logic in bootstrapping (#25532) 2025-03-26 14:08:20 -04:00
Daniel Bennett
0e121b3c29 sanitize auth method in create/update reply (#25519)
create/update APIs only work for someone
who has the secret(s) in hand, but that someone
could be a CI system, which might log output.
2025-03-26 11:36:08 -05:00
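A minimal sketch of the idea in the commit above: return a redacted copy of the object in the API reply so a caller that logs its output does not leak the secret. The `authMethod` and `authMethodConfig` types, the `ClientSecret` field, and the `sanitized` helper are hypothetical names for illustration and are not taken from the Nomad code.

```go
package main

import "fmt"

// authMethodConfig and authMethod are hypothetical, simplified types.
type authMethodConfig struct {
	ClientID     string
	ClientSecret string
}

type authMethod struct {
	Name   string
	Config *authMethodConfig
}

// sanitized returns a copy of m with the secret redacted. The original
// value is left untouched so it can still be stored as submitted.
func (m *authMethod) sanitized() *authMethod {
	out := *m // shallow copy of the top-level struct
	if m.Config != nil {
		cfg := *m.Config // copy the config so the original is not mutated
		if cfg.ClientSecret != "" {
			cfg.ClientSecret = "redacted"
		}
		out.Config = &cfg
	}
	return &out
}

func main() {
	stored := &authMethod{
		Name:   "example",
		Config: &authMethodConfig{ClientID: "abc", ClientSecret: "s3cr3t"},
	}
	reply := stored.sanitized()
	fmt.Println(reply.Config.ClientSecret, stored.Config.ClientSecret != "redacted")
}
```

Copying before redacting matters here: redacting in place would also overwrite the value that is about to be persisted.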
Tim Gross
fb93c41ba7 docs: expand info on built-in mkdir dynamic host volume plugin (#25524)
Describe the built-in `mkdir` plugin in the plugin concepts docs in a little
more detail. Crosslink to there from the `plugin_id` field docs, and clarify
that the `mkdir` plugin doesn't support the capacity request fields.

Update the example plugins to avoid using volume author controlled variables in
favor of Nomad-controlled ones, to reduce the risk of path traversal, and
explain to plugin authors they'll likely want to avoid this in their own
plugins.
2025-03-26 11:21:43 -04:00
Aimee Ukasick
b8ad371cfb Docs: SEO updates to front matter description intro, install, integrations (#25416)
* install section

* nomad/intro section

* integrations section

* Feedback from review

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>

---------

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
2025-03-26 09:40:37 -05:00
Juana De La Cuesta
61517fcc57 Merge pull request #25520 from hashicorp/NOJIRA-update-typo
Fix for wrong function name on verify allocs script
2025-03-26 10:28:04 +01:00
Juanadelacuesta
332e859da0 Typo: Wrong function name 2025-03-26 10:06:40 +01:00
Crypto89
9c4e4afa79 csi: fix CSI ExpandVolume stagingPath (#25253)
Fix the checking of the staging path against the mountRoot on the host
rather than checking against the containerMountPoint, which (probably)
never exists on the host, causing it to default back to the legacy
behaviour.
2025-03-25 12:36:46 -05:00
dependabot[bot]
d67a74d0f4 chore(deps): bump github.com/gorilla/websocket in /api (#25502)
Bumps [github.com/gorilla/websocket](https://github.com/gorilla/websocket) from 1.5.0 to 1.5.3.
- [Release notes](https://github.com/gorilla/websocket/releases)
- [Commits](https://github.com/gorilla/websocket/compare/v1.5.0...v1.5.3)

---
updated-dependencies:
- dependency-name: github.com/gorilla/websocket
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-25 10:53:27 -04:00
dependabot[bot]
f16104ab84 chore(deps): bump github.com/shoenig/test from 1.7.1 to 1.12.1 in /api (#25501)
Bumps [github.com/shoenig/test](https://github.com/shoenig/test) from 1.7.1 to 1.12.1.
- [Release notes](https://github.com/shoenig/test/releases)
- [Commits](https://github.com/shoenig/test/compare/v1.7.1...v1.12.1)

---
updated-dependencies:
- dependency-name: github.com/shoenig/test
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-25 10:52:56 -04:00
dependabot[bot]
de723690f7 chore(deps): bump github.com/felixge/httpsnoop in /api (#25499)
Bumps [github.com/felixge/httpsnoop](https://github.com/felixge/httpsnoop) from 1.0.3 to 1.0.4.
- [Release notes](https://github.com/felixge/httpsnoop/releases)
- [Commits](https://github.com/felixge/httpsnoop/compare/v1.0.3...v1.0.4)

---
updated-dependencies:
- dependency-name: github.com/felixge/httpsnoop
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-25 10:02:00 -04:00
James Rasell
ea25503705 cli: Use meta response index to start monitoring volume create. (#25514) 2025-03-25 14:00:46 +00:00
dependabot[bot]
1ff8d7b3ab chore(deps): bump github.com/hashicorp/go-plugin from 1.6.2 to 1.6.3 (#25507)
Bumps [github.com/hashicorp/go-plugin](https://github.com/hashicorp/go-plugin) from 1.6.2 to 1.6.3.
- [Release notes](https://github.com/hashicorp/go-plugin/releases)
- [Changelog](https://github.com/hashicorp/go-plugin/blob/main/CHANGELOG.md)
- [Commits](https://github.com/hashicorp/go-plugin/compare/v1.6.2...v1.6.3)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/go-plugin
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-25 09:56:59 -04:00
dependabot[bot]
ba35b1d170 chore(deps): bump github.com/shoenig/test from 1.12.0 to 1.12.1 (#25506)
Bumps [github.com/shoenig/test](https://github.com/shoenig/test) from 1.12.0 to 1.12.1.
- [Release notes](https://github.com/shoenig/test/releases)
- [Commits](https://github.com/shoenig/test/compare/v1.12.0...v1.12.1)

---
updated-dependencies:
- dependency-name: github.com/shoenig/test
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-25 09:56:34 -04:00
dependabot[bot]
809985bbcd chore(deps): bump golang.org/x/time from 0.10.0 to 0.11.0 (#25505)
Bumps [golang.org/x/time](https://github.com/golang/time) from 0.10.0 to 0.11.0.
- [Commits](https://github.com/golang/time/compare/v0.10.0...v0.11.0)

---
updated-dependencies:
- dependency-name: golang.org/x/time
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-25 09:55:47 -04:00
dependabot[bot]
ebe0fe9914 chore(deps): bump github.com/opencontainers/runc from 1.2.5 to 1.2.6 (#25504)
Bumps [github.com/opencontainers/runc](https://github.com/opencontainers/runc) from 1.2.5 to 1.2.6.
- [Release notes](https://github.com/opencontainers/runc/releases)
- [Changelog](https://github.com/opencontainers/runc/blob/v1.2.6/CHANGELOG.md)
- [Commits](https://github.com/opencontainers/runc/compare/v1.2.5...v1.2.6)

---
updated-dependencies:
- dependency-name: github.com/opencontainers/runc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-25 09:55:13 -04:00
dependabot[bot]
5485457dcc chore(deps): bump github.com/docker/docker (#25503)
Bumps [github.com/docker/docker](https://github.com/docker/docker) from 28.0.1+incompatible to 28.0.2+incompatible.
- [Release notes](https://github.com/docker/docker/releases)
- [Commits](https://github.com/docker/docker/compare/v28.0.1...v28.0.2)

---
updated-dependencies:
- dependency-name: github.com/docker/docker
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-25 09:54:11 -04:00
Allison Larson
d1d8945d2e Add docker plugin config option image_pull_timeout value for default timeout (#25489)
* Add docker plugin config image_pull_timeout value for default timeout

* Add image_pull_timeout docker plugin config to docs

* Add changelog
2025-03-24 13:03:14 -07:00
Tim Gross
e168548341 provide allocrunner hooks with prebuilt taskenv and fix mutation bugs (#25373)
Some of our allocrunner hooks require a task environment for interpolating values based on the node or allocation. But several of the hooks accept an already-built environment or builder and then keep that in memory. Both of these retain a copy of all the node attributes and allocation metadata, which balloons memory usage until the allocation is GC'd.

While we'd like to look into ways to avoid keeping the allocrunner around entirely (see #25372), for now we can significantly reduce memory usage by creating the task environment on-demand when calling allocrunner methods, rather than persisting it in the allocrunner hooks.

In doing so, we uncover two other bugs:
* The WID manager, the group service hook, and the checks hook have to interpolate services for specific tasks. They mutated a taskenv builder to do so, but each time they mutate the builder, they write to the same environment map. When a group has multiple tasks, it's possible for one task to set an environment variable that would then be interpolated in the service definition for another task if that task did not have that environment variable. Only the service definition interpolation is impacted. This does not leak env vars across running tasks, as each taskrunner has its own builder.

  To fix this, we move the `UpdateTask` method off the builder and onto the taskenv as the `WithTask` method. This makes a shallow copy of the taskenv with a deep clone of the environment map used for interpolation, and then overwrites the environment from the task.

* The checks hook interpolates Nomad native service checks only on `Prerun` and not on `Update`. This could cause unexpected deregistration and registration of checks during in-place updates. To fix this, we make sure we interpolate in the `Update` method.

I also bumped into an incorrectly implemented interface in the CSI hook. I've pulled that and some better guardrails out to https://github.com/hashicorp/nomad/pull/25472.

Fixes: https://github.com/hashicorp/nomad/issues/25269
Fixes: https://hashicorp.atlassian.net/browse/NET-12310
Ref: https://github.com/hashicorp/nomad/issues/25372
2025-03-24 12:05:04 -04:00
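The `WithTask` fix described in the commit above can be illustrated with a small sketch: take a shallow copy of the environment, give the copy its own clone of the interpolation map, then overlay the task's variables. The `taskEnv` type, its fields, and the lowercase `withTask` method are simplified, hypothetical stand-ins, not the actual Nomad types.

```go
package main

import "fmt"

// taskEnv is a hypothetical, simplified task environment.
type taskEnv struct {
	nodeAttrs map[string]string // large, read-only, safe to share between copies
	envMap    map[string]string // used for interpolation, must not be shared
}

// withTask returns a shallow copy of e whose envMap is an independent
// clone with the task's variables written on top.
func (e *taskEnv) withTask(taskVars map[string]string) *taskEnv {
	out := *e // shallow copy: nodeAttrs is still shared
	out.envMap = make(map[string]string, len(e.envMap)+len(taskVars))
	for k, v := range e.envMap {
		out.envMap[k] = v
	}
	for k, v := range taskVars {
		out.envMap[k] = v
	}
	return &out
}

func main() {
	base := &taskEnv{
		nodeAttrs: map[string]string{"kernel.name": "linux"},
		envMap:    map[string]string{"GROUP": "web"},
	}
	taskA := base.withTask(map[string]string{"PORT": "8080"})
	taskB := base.withTask(nil)

	_, leaked := taskB.envMap["PORT"]
	fmt.Println(taskA.envMap["PORT"], leaked, len(base.envMap))
}
```

Because each call writes to its own map, a variable set for one task cannot appear when interpolating another task's service definition, which is the cross-task leak the commit message describes.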
Tim Gross
ecf3d88e81 dependabot: update reviewer for website directory (#25498)
When we updated the codeowner for the website directory to include the "web
presence" group, we didn't also update the dependabot reviewer. This results in
errors in dependabot PRs.

Ref: https://github.com/hashicorp/nomad/pull/25492#issuecomment-2746105976
2025-03-24 12:03:02 -04:00
Juana De La Cuesta
2bd5dc5970 Merge pull request #25479 from hashicorp/NET-11546-enos-same-allocs
Add a test for reattaching allocs after client restart
2025-03-24 16:03:57 +01:00
Juanadelacuesta
c3258ab0f6 fix: reuse client_id when checking for running allocs 2025-03-24 15:11:33 +01:00
dependabot[bot]
26c2f6bccf chore(deps): bump github.com/golang-jwt/jwt/v5 from 5.2.1 to 5.2.2 (#25490)
Bumps [github.com/golang-jwt/jwt/v5](https://github.com/golang-jwt/jwt) from 5.2.1 to 5.2.2.
- [Release notes](https://github.com/golang-jwt/jwt/releases)
- [Changelog](https://github.com/golang-jwt/jwt/blob/main/VERSION_HISTORY.md)
- [Commits](https://github.com/golang-jwt/jwt/compare/v5.2.1...v5.2.2)

---
updated-dependencies:
- dependency-name: github.com/golang-jwt/jwt/v5
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-24 09:27:48 -04:00
dependabot[bot]
997d2b355f chore(deps): bump github.com/golang-jwt/jwt/v4 from 4.5.1 to 4.5.2 (#25491)
Bumps [github.com/golang-jwt/jwt/v4](https://github.com/golang-jwt/jwt) from 4.5.1 to 4.5.2.
- [Release notes](https://github.com/golang-jwt/jwt/releases)
- [Changelog](https://github.com/golang-jwt/jwt/blob/main/VERSION_HISTORY.md)
- [Commits](https://github.com/golang-jwt/jwt/compare/v4.5.1...v4.5.2)

---
updated-dependencies:
- dependency-name: github.com/golang-jwt/jwt/v4
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-24 09:00:16 -04:00
dependabot[bot]
c60b70573c chore(deps): bump github.com/docker/cli (#25493)
Bumps [github.com/docker/cli](https://github.com/docker/cli) from 27.5.1+incompatible to 28.0.2+incompatible.
- [Commits](https://github.com/docker/cli/compare/v27.5.1...v28.0.2)

---
updated-dependencies:
- dependency-name: github.com/docker/cli
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-24 08:57:19 -04:00
dependabot[bot]
b3114333a6 chore(deps): bump github.com/klauspost/cpuid/v2 from 2.2.9 to 2.2.10 (#25494)
Bumps [github.com/klauspost/cpuid/v2](https://github.com/klauspost/cpuid) from 2.2.9 to 2.2.10.
- [Release notes](https://github.com/klauspost/cpuid/releases)
- [Changelog](https://github.com/klauspost/cpuid/blob/master/.goreleaser.yml)
- [Commits](https://github.com/klauspost/cpuid/compare/v2.2.9...v2.2.10)

---
updated-dependencies:
- dependency-name: github.com/klauspost/cpuid/v2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-24 08:55:48 -04:00
dependabot[bot]
c8804b8e8e chore(deps): bump github.com/hashicorp/go-secure-stdlib/listenerutil (#25495)
Bumps [github.com/hashicorp/go-secure-stdlib/listenerutil](https://github.com/hashicorp/go-secure-stdlib) from 0.1.9 to 0.1.10.
- [Release notes](https://github.com/hashicorp/go-secure-stdlib/releases)
- [Commits](https://github.com/hashicorp/go-secure-stdlib/compare/parseutil/v0.1.9...listenerutil/v0.1.10)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/go-secure-stdlib/listenerutil
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-24 08:53:57 -04:00
Juana De La Cuesta
2fdca9eb04 Merge pull request #25478 from hashicorp/NET-11546-enos-same-nfs
Give the nfs controller a longer start up time
2025-03-24 09:45:12 +01:00
Aimee Ukasick
34ae5d5ae6 Fix link rendering in server.default_scheduler_config (#25482)
CE-821
2025-03-21 12:50:57 -05:00
Tim Gross
c0c41f11a5 CSI: prevent loop in volumewatcher when node is GC'd (#25428)
If a CSI volume has terminal allocations, the volumewatcher will submit an
`Unpublish` RPC. But the "past claim" we create is missing the "external" node
identifier (ex. the AWS EC2 instance ID). The unpublish RPC can tolerate this if
the node still exists in the state store, but if the node has been GC'd the
controller unpublish step will return an error. But at this point we've already
checkpointed the unpublish workflow, which triggers a notification on the
volumewatcher. This results in the volumewatcher getting into a tight loop of
retries. Unfortunately even if we somehow break the loop (perhaps because we hit
a different code path), we'll kick off this loop again after a leader election
when we spin up the volumewatchers again.

This changeset includes the following:
* Fix the primary bug by including the external node ID when creating a "past
  claim" for a terminal allocation.
* If we can't look up the external ID because there's no external node ID and the
  node no longer exists, abandon it in the same way that we do the node unpublish
  step.
* Rate limit the volumewatcher loop so that any future bugs of this type don't
  cause a tight loop.
* Remove some dead code found while working on this.

Fixes: https://github.com/hashicorp/nomad/issues/25349
Ref: https://hashicorp.atlassian.net/browse/NET-12298
2025-03-21 13:07:16 -04:00
Juanadelacuesta
ce261be358 style: linter fix 2025-03-21 15:26:25 +01:00
Juanadelacuesta
b1dbc14499 func: make the csi_plugin health timeout a little longer to help the test run better locally 2025-03-21 15:16:00 +01:00
Juanadelacuesta
82fcc62c46 func: add verification for allocs correctly reattaching after client restarts 2025-03-21 15:14:00 +01:00
James Rasell
0a29c2f017 ci: Use custom runner for core tests with more CPU and memory. (#25475) 2025-03-21 13:52:36 +00:00
James Rasell
27ad88ac17 test: Calculate agent endpoint scheduler count, not static. (#25473) 2025-03-21 13:47:53 +00:00
Tim Gross
c67c4ea182 client: statically assert hook interfaces in build (#25472)
While working on #25373, I noticed that the CSI hook's `Destroy` method doesn't
match the interface, which means it never gets called. Because this method only
cancels any in-flight CSI requests, the only impact of this bug is that any CSI
RPCs that are in-flight when an alloc is GC'd on the client or a dev agent is
shut down won't be interrupted gracefully.

Fix the interface, but also make static assertions for all the allocrunner hooks
in the production code, so that you can make changes to interfaces and have
compile-time assistance in avoiding mistakes.

Ref: https://github.com/hashicorp/nomad/pull/25373
2025-03-21 09:14:13 -04:00