26968 Commits

Author SHA1 Message Date
Juanadelacuesta
adf038b495 fix: correct the logic for LeaveOnTerm or LeaveOnInt depending on the incoming signal 2025-04-23 16:03:12 +02:00
Juanadelacuesta
b375974bc3 style: add comments 2025-04-23 15:47:37 +02:00
Juanadelacuesta
c5c4272aee func: force agent return if there is an error on reload 2025-04-23 15:14:48 +02:00
Piotr Kazmierczak
df3b00bce0 acl: use WhoAmI RPC endpoint in /acl/token/self (#25547)
ResolveToken RPC endpoint was only used by the /acl/token/self API. We should migrate to the WI-aware WhoAmI instead.

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2025-04-22 17:53:39 +02:00
Daniel Bennett
c46521a80d cli: operator debug: respect NOMAD_REGION env var (#25716)
properly filter out regions other than the one specified
like the -namespace flag does
2025-04-21 17:06:50 -04:00
Michael Smithhisler
6036ab8b40 client: close namespace file handle and defensively lazy unmount (#25714) 2025-04-21 16:25:05 -04:00
tehut
b11619010e Add priority flag to Dispatch CLI and API (#25622)
* Add priority flag to Dispatch CLI and DispatchOpts() helper to HTTP API
2025-04-18 13:24:52 -07:00
Tim Gross
88dc842729 testing: use Docker Hub registry mirror for CI (#25703)
As of April 1, Docker Hub rate limits tightened. With only 10 pulls/hr/IP, we're
likely to encounter test failures. Switch all Docker images getting pulled from
this repository to use the HashiCorp managed registry mirror.

Note that most of our tests in `drivers/docker` don't pull from the remote
registry but load a local image, while others will need to pull from the remote
and fetch different images depending on OS/arch. Refactor the definition of test
task configuration to make it clear which is which, and de-factor some false
sharing of setup functions.

Updates the E2E tests to use that registry by configuring the Docker
daemon. This required changing out a few container images that we don't have in
the registry, but these new images are all smaller. There are a couple of tests
that still use explicitly-tagged `docker.io` images or other third-party
registries, which have been left in place.

Ref: https://hashicorp.atlassian.net/browse/NET-12233

update E2E images to those in the registry mirror

fix windows and docklog test build

fix stopsignal test

mop-up

more mop-up
2025-04-18 14:21:49 -04:00
Tim Gross
c205688857 scheduler: fix state corruption from rescheduler tracker updates (#25698)
In #12319 we fixed a bug where updates to the reschedule tracker would be
dropped if the follow-up allocation failed to be placed by the scheduler in the
later evaluation. We did this by mutating the previous allocation's reschedule
tracker. But we did this without copying the previous allocation first and then
making sure the updated copy was in the plan. This is unfortunately unsafe and
corrupts the state store on the server where the scheduler ran; it may cause a
race condition in RPC handlers and it causes the server to be out of sync with
the other servers. This was discovered while trying to make all our tests
race-free, but likely impacts production users.

Copy the previous allocation before updating the reschedule tracker, and swap
out the updated allocation in the plan. This also requires that we include the
reschedule tracker in the "normalized" (stripped-down) allocations we send to
the leader as part of a plan.

Ref: https://github.com/hashicorp/nomad/pull/12319
Fixes: https://hashicorp.atlassian.net/browse/NET-12357
2025-04-18 08:42:54 -04:00
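The copy-before-mutate approach described in the commit above can be sketched roughly as follows. This is a minimal illustration, not Nomad's code: `Allocation`, `RescheduleTracker`, `Copy`, and `withRescheduleEvent` are hypothetical, stripped-down stand-ins chosen for the example.

```go
package main

import "fmt"

// RescheduleTracker and Allocation are hypothetical stand-ins for
// illustration only; they are not Nomad's real structs.
type RescheduleTracker struct {
	Events []string
}

type Allocation struct {
	ID      string
	Tracker *RescheduleTracker
}

// Copy returns a deep copy so the caller can mutate it without touching
// the original, which may be shared with a state store.
func (a *Allocation) Copy() *Allocation {
	c := &Allocation{ID: a.ID}
	if a.Tracker != nil {
		c.Tracker = &RescheduleTracker{
			Events: append([]string(nil), a.Tracker.Events...),
		}
	}
	return c
}

// withRescheduleEvent copies prev, records the event on the copy, and
// returns the copy so it can be placed in the plan. prev is left unchanged.
func withRescheduleEvent(prev *Allocation, event string) *Allocation {
	updated := prev.Copy()
	if updated.Tracker == nil {
		updated.Tracker = &RescheduleTracker{}
	}
	updated.Tracker.Events = append(updated.Tracker.Events, event)
	return updated
}

func main() {
	stored := &Allocation{ID: "a1", Tracker: &RescheduleTracker{Events: []string{"first"}}}
	planned := withRescheduleEvent(stored, "second")
	// The stored allocation keeps its original events; only the copy grows.
	fmt.Println(len(stored.Tracker.Events), len(planned.Tracker.Events))
}
```

The point of the sketch is only that the object read from shared state is never written to; the updated copy is what travels in the plan.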
James Rasell
c85c723336 ci: Run core tests groups workflow on amd64 and arm64 runners. (#25695) 2025-04-17 15:16:29 +01:00
James Rasell
c44f847cbb ci: Compile on Ubuntu arm64 within core test workflow. (#25692)
Nomad is released as a Linux arm64 binary, so having a compilation
step on this OS/ARCH within our core test workflow will help
ensure basic arm64 problems do not get into our release branches.
2025-04-17 07:46:49 +01:00
Tim Gross
e3845207e0 update backport assistant to 0.5.7 (#25697)
Update BPA to avoid problems when requerying for reviewers.

Ref: https://github.com/hashicorp/backport-assistant/pull/152
Ref: https://hashicorp.atlassian.net/browse/NET-11804
2025-04-16 11:39:37 -04:00
dependabot[bot]
89172cdbda chore(deps): bump dompurify from 3.1.5 to 3.2.5 in /ui (#25601)
Bumps [dompurify](https://github.com/cure53/DOMPurify) from 3.1.5 to 3.2.5.
- [Release notes](https://github.com/cure53/DOMPurify/releases)
- [Commits](https://github.com/cure53/DOMPurify/compare/3.1.5...3.2.5)

---
updated-dependencies:
- dependency-name: dompurify
  dependency-version: 3.2.5
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-16 15:46:57 +02:00
Piotr Kazmierczak
1d3f08b63d ci: correct BPA image version (#25693) 2025-04-16 11:02:30 +02:00
Tim Gross
de8bc4fead update BPA to 0.5.6 (#25691)
Fixes logging issues with the errors we've been getting for backport failures,
which will help us further diagnose the problem.

Ref: https://hashicorp.atlassian.net/browse/NET-11804
2025-04-15 16:56:15 -04:00
Tim Gross
fa40cd89dd workflow test for builds and backports (#25688)
Remove a useless comment to run a test of the build and backport workflows.

Ref: https://hashicorp.atlassian.net/browse/NET-11804
Ref: https://hashicorp.atlassian.net/browse/NET-10556
2025-04-15 16:11:17 -04:00
Juana De La Cuesta
a121129155 Merge pull request #25689 from hashicorp/NOJIRA-upgrade
Add some more debug information in case of failure
2025-04-15 21:24:10 +02:00
Juanadelacuesta
2f02c90391 func: expand on some logs to get more info in case of a failure 2025-04-15 14:37:57 -04:00
Piotr Kazmierczak
6f7d789b1d ci: disable docker build summary (#25685) 2025-04-15 17:02:46 +02:00
Piotr Kazmierczak
b26995c3d5 ci: migrate runners to ubuntu-22.04 (#25651)
* ci: migrate runners to ubuntu-22.04
* find a supported build for custom-linux-xl
2025-04-14 16:12:10 -04:00
Piotr Kazmierczak
54414e6a7c ci: pin docker/build-push-action to a TSCCR approved version (#25678) 2025-04-14 17:50:43 +02:00
Arian van Putten
d28af58cbb agent: implement sd-notify reload correctly (#25636)
First of all, we should not send the unix time, but the monotonic time.
Second of all, the RELOADING= and MONOTONIC_USEC= fields should be sent in a
*single* message, not two separate messages.

From the man page of [systemd.service](https://www.freedesktop.org/software/systemd/man/latest/systemd.service.html#Type=)

> notification message via sd_notify(3) that contains the "RELOADING=1" field in
> combination with "MONOTONIC_USEC=" set to the current monotonic time (i.e.
> CLOCK_MONOTONIC in clock_gettime(2)) in μs, formatted as decimal string.

[sd_notify](https://www.freedesktop.org/software/systemd/man/latest/sd_notify.html)
now has code samples of the protocol to clarify.

Without these changes, if you'd set
Type=notify-reload on the agent's systemd unit, systemd
would kill the service due to the service not responding to reload
correctly.
2025-04-14 11:38:56 -04:00
Tim Gross
016b024f2d fix duplicate changelog entries for 1.10.0 (#25674)
In https://github.com/hashicorp/nomad/pull/25653 we updated the changelog but in
review we missed that the "unreleased" section for 1.10.0 had been left in
place. Remove that.
2025-04-14 09:24:52 -04:00
Piotr Kazmierczak
c917cc19ee chore: consolidated dependabot updates for 2025-04-14 (#25670) 2025-04-14 10:37:40 +02:00
Piotr Kazmierczak
36e91be7ee build: use nomad-builder docker image to build Nomad (#25626)
This introduces a docker image based off of ubuntu:bionic that can be used to
compile the Nomad binary against glibc 2.27.

The image cannot build JS assets, which must be created before we compile the
Go binary.
2025-04-14 09:27:17 +02:00
James Rasell
85c30dfd1e test: Remove use of "mitchellh/go-testing-interface" for stdlib. (#25640)
The stdlib testing package now includes this interface, so we can
remove our dependency on the external library.
2025-04-14 07:43:49 +01:00
Aimee Ukasick
d293684d3d Update rel notes, upgrade links to point to correct previous ver (#25652) 2025-04-11 10:22:23 -05:00
Tim Gross
5c89b07f11 CI: run copywrite on PRs, not just after merges (#25658)
* CI: run copywrite on PRs, not just after merges
* fix a missing copyright header
2025-04-10 17:01:34 -04:00
Carlos Galdino
048c5bcba9 Use core ID when selecting cores (#25340)
* Use core ID when selecting cores

If the available cores are not a continuous set, the core selector might
panic when trying to select cores.

For example, consider a scenario where the available cores for the selector are the following:

    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47]

This list contains 46 cores, because cores with IDs 0 and 24 are not
included in the list.

Before this patch, if we requested 46 cores, the selector would panic
trying to access the item with index 46 in `cs.topology.Cores`.

This patch changes the selector to use the core ID instead when looking
for a core inside `cs.topology.Cores`. This prevents an out of bounds
access that was causing the panic.

Note: The patch is straightforward with the change. Perhaps a better
long-term solution would be to restructure the `numalib.Topology.Cores`
field to be a `map[ID]Core`, but that is a much larger change that is
more difficult to land. Also, the number of cores in our case is
small—at most 192—so a search won't have any noticeable impact.

* Add changelog entry

* Build list of IDs inline
2025-04-10 13:04:15 -07:00
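The difference between indexing by position and looking up by core ID, as described in the commit above, can be sketched like this. `Core`, `availableCores`, and `coreByID` are hypothetical names for the example and do not mirror the real `numalib` types. The list reproduces the commit's scenario of IDs 1 through 47 with 24 missing.

```go
package main

import "fmt"

// Core is a hypothetical, minimal stand-in for a topology core entry.
type Core struct {
	ID int
}

// availableCores builds the non-contiguous list from the commit message:
// IDs 1 through 47 with 24 missing, 46 cores in total.
func availableCores() []Core {
	var cores []Core
	for id := 1; id <= 47; id++ {
		if id != 24 {
			cores = append(cores, Core{ID: id})
		}
	}
	return cores
}

// coreByID does a linear search on the ID field rather than using the ID
// as a slice index, so gaps in the ID sequence cannot cause an
// out-of-bounds access.
func coreByID(cores []Core, id int) (Core, bool) {
	for _, c := range cores {
		if c.ID == id {
			return c, true
		}
	}
	return Core{}, false
}

func main() {
	cores := availableCores()
	// cores[46] would panic here, because valid indexes are 0 through 45.
	c, ok := coreByID(cores, 46)
	fmt.Println(len(cores), c.ID, ok)
}
```

A linear search is used here for the reason the commit gives: the list is small, so the lookup cost is negligible compared with restructuring the field into a map.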
Michael Schurter
4a147db906 Merge pull request #25654 from hashicorp/cl-1.9.8-1.8.12
Add changelogs for 1.9.8+ent and 1.8.12+ent
2025-04-10 11:54:30 -07:00
Michael Schurter
6a09e7f9cd created changelog for 1.8.12+ent 2025-04-10 11:43:53 -07:00
Michael Schurter
a39743c96d created changelog for 1.9.8+ent 2025-04-10 11:43:34 -07:00
Michael Schurter
c5451cf300 Merge pull request #25635 from hashicorp/post-1.10.0-release
Post 1.10.0 release
2025-04-10 10:32:24 -07:00
Tim Gross
48f304d0ca java: only set nobody user on Unix (#25648)
In #25496 we introduced the ability to have `task.user` set on Windows, so
long as the user ID fits a particular shape. But this uncovered a 7-year-old bug
in the `java` driver introduced in #5143, where we set the `task.user` to the
non-existent Unix user `nobody`, even if we're running on Windows.

Prior to the change in #25496 we always ignored the `task.user`, so this was not
a problem. We don't set the `task.user` in the `raw_exec` driver, and the
otherwise very similar `exec` driver is Linux-only, so we never see the problem
there.

Fix the bug in the `java` driver by gating the change to the `task.user` on not
being Windows. Also add a check to the new code path that the user is non-empty
before parsing it, so that any third party drivers that might be borrowing the
executor code don't hit the same problem on Windows.

Ref: https://github.com/hashicorp/nomad/pull/5143
Ref: https://github.com/hashicorp/nomad/pull/25496
Fixes: https://github.com/hashicorp/nomad/issues/25638
2025-04-10 10:34:34 -04:00
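The OS gate described in the commit above might look roughly like the following. `defaultTaskUser` is a hypothetical helper written for this sketch, not the driver's actual function. It takes the OS name as a parameter so both branches can be exercised on any platform.

```go
package main

import (
	"fmt"
	"runtime"
)

// defaultTaskUser returns the fallback user for a task on the given OS.
// On Windows no Unix-style default is applied, so the result is empty.
func defaultTaskUser(goos string) string {
	if goos == "windows" {
		return ""
	}
	return "nobody"
}

func main() {
	user := defaultTaskUser(runtime.GOOS)
	// Mirrors the second half of the fix: only act on a non-empty user.
	if user != "" {
		fmt.Println("running task as", user)
	} else {
		fmt.Println("no default user on this platform")
	}
}
```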
Ranjandas
8b33584fbf Add note to root keyring remove command (#25637)
* Add note to root keyring remove command

This PR updates the documentation for the root keyring remove command to note that the full key ID must be provided for the command to function correctly.

* Move keyID explanation to usage section

---------

Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
2025-04-10 08:58:48 -05:00
James Rasell
4c4cb2c6ad agent: Fix misaligned contextual k/v logging arguments. (#25629)
Arguments passed to hclog log lines should always have an even
number to provide the expected k/v output.
2025-04-10 14:40:21 +01:00
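To illustrate the even-argument rule from the commit above without pulling in the logging library, the sketch below uses a hypothetical `pairsAligned` helper that only counts arguments. It is an illustration of the convention, not part of hclog or Nomad.

```go
package main

import "fmt"

// pairsAligned reports whether args can be read as complete key/value
// pairs, which requires an even number of arguments.
func pairsAligned(args ...any) bool {
	return len(args)%2 == 0
}

func main() {
	// Two complete pairs: "node_id" -> "abc", "attempt" -> 3.
	fmt.Println(pairsAligned("node_id", "abc", "attempt", 3))

	// A stray trailing argument leaves the last key without a value.
	fmt.Println(pairsAligned("node_id", "abc", "attempt"))
}
```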
Tim Gross
27caae2b2a api: make attempting to remove peer by address a no-op (#25599)
In Nomad 1.4.0 we removed support for Raft Protocol v2 entirely. But the
`Operator.RemoveRaftPeerByAddress` RPC handler was left in place, along with its
supporting HTTP API and command line flags. Using this API will always result in
the Raft library error "operation not supported with current protocol version".

Unfortunately it's still possible in unit tests to exercise this code path, and
these tests are quite flaky. This changeset turns the RPC handler and HTTP API
into a no-op, removes the associated command line flags, and removes the flaky
tests. I've also cleaned up the test for `RemoveRaftPeerByID` to consolidate
test servers and use `shoenig/test`.

Fixes: https://hashicorp.atlassian.net/browse/NET-12413
Ref: https://github.com/hashicorp/nomad/pull/13467
Ref: https://developer.hashicorp.com/nomad/docs/upgrade/upgrade-specific#raft-protocol-version-2-unsupported
Ref: https://github.com/hashicorp/nomad-enterprise/actions/runs/13201513025/job/36855234398?pr=2302
2025-04-10 09:19:25 -04:00
Michael Schurter
0d9b108498 cleanup 1.10.0 entry and move 1.7 out 2025-04-09 16:09:01 -07:00
hc-github-team-nomad-core
7db0bdf2de Prepare for next release 2025-04-09 16:03:21 -07:00
hc-github-team-nomad-core
71af41b4b1 Generate files for 1.10.0 release 2025-04-09 16:03:21 -07:00
hc-github-team-nomad-core
9f33796156 Prepare for next release 2025-04-09 16:03:21 -07:00
hc-github-team-nomad-core
239c5f11ee Generate files for 1.10.0 release 2025-04-09 16:03:21 -07:00
Aimee Ukasick
87aabc9af2 Docs: 1.10 release notes, some factoring, sentinel apply update (#25433)
* Docs: 1.10 release notes and upgrade factoring

* Update based on code review suggestions

* add CLI for disabling UI URL hints

* fix indentation

* nav: list release notes in reverse order

fix broken link to v1.6.x docs

* Update PKCE section from Daniel's latest PR

* update pkce per daniel's suggestion

* Add dynamic host volumes governance section from blog
2025-04-09 15:43:58 -07:00
Michael Schurter
95c914624e build: Update Go to v1.24.0 (#25623)
* build: Update Go to v1.24.0

Fixes Go CVE https://pkg.go.dev/vuln/GO-2025-3563
2025-04-08 16:07:47 -07:00
James Rasell
311a83d706 e2e: Ensure UI is enabled. (#25620)
The `ui.enabled` parameter is a non-pointer bool which means the
merge function is unable to differentiate between false and not
set. When e2e introduced the `ui.show_cli_hints` configuration
parameter, the way we merge meant the UI became disabled.
2025-04-08 13:57:29 +01:00
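The merge pitfall described in the commit above can be reproduced in a small sketch. `UIConfig` and `merge` here are hypothetical simplifications, and the agent's real merge logic may differ. The example only shows why a plain `bool` cannot express "not set" while a `*bool` can.

```go
package main

import "fmt"

// UIConfig is a hypothetical, reduced config block for illustration.
type UIConfig struct {
	Enabled      bool  // plain bool: false and "not set" are identical
	ShowCLIHints *bool // pointer: nil means "not set"
}

// merge overlays other on top of base. For the plain bool there is no way
// to tell whether other omitted the field, so this naive merge copies it
// unconditionally. The pointer field is only copied when it was set.
func merge(base, other UIConfig) UIConfig {
	result := base
	result.Enabled = other.Enabled
	if other.ShowCLIHints != nil {
		result.ShowCLIHints = other.ShowCLIHints
	}
	return result
}

func main() {
	hints := false
	base := UIConfig{Enabled: true}
	// The override sets only ShowCLIHints and says nothing about Enabled.
	override := UIConfig{ShowCLIHints: &hints}

	merged := merge(base, override)
	// Enabled is now false even though the override never mentioned it.
	fmt.Println(merged.Enabled, *merged.ShowCLIHints)
}
```

Setting the non-pointer field explicitly in every overlay, as the e2e fix does for `ui.enabled`, sidesteps the ambiguity without changing the field's type.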
dependabot[bot]
b8143368d8 chore(deps): bump github.com/hashicorp/raft from 1.7.2 to 1.7.3 (#25602) 2025-04-07 22:30:21 +00:00
dependabot[bot]
d6b429aa6b chore(deps): bump github.com/miekg/dns from 1.1.62 to 1.1.64 (#25603) 2025-04-07 22:29:13 +00:00
dependabot[bot]
f72fea22a7 chore(deps): bump github.com/docker/docker (#25604) 2025-04-07 22:27:56 +00:00
dependabot[bot]
60cc55615e chore(deps): bump google.golang.org/grpc from 1.71.0 to 1.71.1 (#25605) 2025-04-07 22:26:39 +00:00
dependabot[bot]
09ebb390e2 chore(deps): bump github.com/golang/snappy from 0.0.4 to 1.0.0 (#25606) 2025-04-07 22:23:13 +00:00