27545 Commits

Author SHA1 Message Date
Michael Smithhisler
9950ef515c secrets: validate name and update client config (#26447) 2025-09-05 16:08:23 -04:00
Michael Smithhisler
68167254e8 e2e: add initial tests for secrets block (#26397) 2025-09-05 16:08:23 -04:00
Michael Smithhisler
00ef9cacab secrets: add common secrets plugins impl (#26335)
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2025-09-05 16:08:23 -04:00
Michael Smithhisler
c7a6b8b253 adds implied secrets constraint to job hook (#26328) 2025-09-05 16:08:23 -04:00
Michael Smithhisler
ac32b0864d scheduler: adds implicit constraint for secrets plugin node attributes (#26303) 2025-09-05 16:08:23 -04:00
Michael Smithhisler
6dcd155bf8 add input validation and path traversal protections (#26241)
---------

Co-authored-by: Deniz Onur Duzgun <59659739+dduzgun-security@users.noreply.github.com>
2025-09-05 16:08:23 -04:00
Tim Gross
0e9eb5ae43 dispatch: write evaluation atomically with dispatch registration (#26710)
In #8435 (shipped in 0.12.1), we updated the `Job.Register` RPC to atomically
write the eval along with the job. But this didn't get copied to
`Job.Dispatch`. Under excessive load testing we demonstrated this can result in
dispatched jobs without corresponding evals.

Update the dispatch RPC to write the eval in the same Raft log as the job
registration. Note that we don't need to version-check this change for upgrades,
because the register and dispatch RPCs share the same `JobRegisterRequestType`
Raft message, and therefore all supported server versions already look for the
eval in the FSM. If an updated leader includes the eval, older followers will
write the eval. If a non-updated leader writes the eval in a separate Raft
entry, updated followers will write those evals normally.

Fixes: https://github.com/hashicorp/nomad/issues/26655
Ref: https://hashicorp.atlassian.net/browse/NMD-947
Ref: https://github.com/hashicorp/nomad/pull/8435
2025-09-05 14:53:08 -04:00
Piotr Kazmierczak
964cc8b8ca Merge pull request #26708 from hashicorp/f-system-deployments
scheduler: system deployments
2025-09-05 18:23:41 +02:00
Piotr Kazmierczak
3e4d2b731c scheduler: changelog entry for system deployments 2025-09-05 17:52:27 +02:00
Piotr Kazmierczak
8175f275c9 tooling: add 'feature' changelog msg type for make cl (#26709) 2025-09-05 16:42:42 +02:00
Tim Gross
ce614e6b7a scheduler: upgrade block testing for system deployments (#26579)
This changeset adds system scheduler tests of various permutations of the `update`
block. It also fixes a number of bugs discovered in the process.

* Don't create deployment for in-flight rollout. If a system job is in the
  middle of a rollout prior to upgrading to a version of Nomad with system
  deployments, we'll end up creating a system deployment which might never
  complete because previously placed allocs will not be tracked. Check to see if
  we have existing allocs that should belong to the new deployment and prevent a
  deployment from being created in that case.
* Ensure we call `Copy` on `Deployment` to avoid state store corruption.
* Don't limit canary counts by `max_parallel`.
* Never create deployments for `sysbatch` jobs.

Ref: https://hashicorp.atlassian.net/browse/NMD-761
2025-09-05 10:22:42 -04:00
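The `Copy` fix in ce614e6b7a follows a common Go pattern: objects read from a shared store are treated as immutable and copied before they are changed. A minimal sketch of that pattern, assuming a simplified `Deployment` type and a hypothetical `markHealthy` helper (neither is Nomad's actual code):

```go
package main

import "fmt"

// Deployment is a hypothetical, simplified stand-in for a stored object.
type Deployment struct {
	ID     string
	Status string
	Groups map[string]int // task group name -> healthy allocation count
}

// Copy returns a deep copy so callers can mutate the result without
// touching the object still held by the store.
func (d *Deployment) Copy() *Deployment {
	if d == nil {
		return nil
	}
	c := &Deployment{ID: d.ID, Status: d.Status}
	c.Groups = make(map[string]int, len(d.Groups))
	for k, v := range d.Groups {
		c.Groups[k] = v
	}
	return c
}

// markHealthy (hypothetical) mutates only a copy of the stored deployment.
func markHealthy(stored *Deployment, group string) *Deployment {
	updated := stored.Copy()
	updated.Groups[group]++
	updated.Status = "running"
	return updated
}

func main() {
	stored := &Deployment{ID: "d1", Status: "pending", Groups: map[string]int{"web": 0}}
	updated := markHealthy(stored, "web")
	fmt.Println(stored.Status, stored.Groups["web"])   // pending 0
	fmt.Println(updated.Status, updated.Groups["web"]) // running 1
}
```

Without the `Copy` call, the map inside the stored object would be shared and the increment would be visible to every other reader of the store.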
Piotr Kazmierczak
a083495240 system scheduler: correction to Test_computeCanaryNodes (#26707) 2025-09-05 16:20:34 +02:00
Piotr Kazmierczak
276ab8a4c6 system scheduler: keep track of previously used canary nodes (#26697)
In the system scheduler, we need to keep track of which nodes were previously
used as "canary nodes" and not pick them at random, in case of previously failed
canaries or changes to the number of canaries in the jobspec.

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2025-09-05 15:32:08 +02:00
James Rasell
1916a16311 exec: Set LOGNAME env var on exec based drivers. (#26703)
Typically the `LOGNAME` environment variable should be set according
to the values within `/etc/passwd` and represents the name of the
logged in user. This should be set, where possible, alongside the
USER and HOME variables for all drivers that use the shared
executor and do not use a sub-shell.
2025-09-05 14:07:27 +01:00
Michael Schurter
c046e83d17 bump cronexpr from v1.1.2 -> v1.1.3 (#26700)
No functional changes. Bumping just to clear up some license
ambiguities.
2025-09-05 07:46:02 +01:00
Michael Smithhisler
85a2875183 task: adds ability to interpret values from secrets hook (#26261) 2025-09-04 15:58:03 -04:00
Michael Smithhisler
2d0ce43c47 secrets: add vault secrets provider (#26198) 2025-09-04 15:58:03 -04:00
Michael Smithhisler
20a855ea13 secrets: add secrets hook with nomad provider (#26143) 2025-09-04 15:58:03 -04:00
Michael Smithhisler
65c7f34f2d secrets: Add secrets block to job spec (#26076) 2025-09-04 15:58:03 -04:00
Daniel Bennett
9682aa2724 consul connect: allow "cni/*" network mode (#26449)
don't require "bridge" network mode when using connect{}

we document this as "at your own risk" because CNI configuration
is so flexible that we can't guarantee a user's network will work,
but Nomad's "bridge" CNI config may be used as a reference.
2025-09-04 12:29:50 -04:00
Juana De La Cuesta
2944a34b58 Reuse token if it exists on client reconnect (#26604)
Currently, every time a client starts, it creates a new Consul token per service or task. This PR changes that behaviour: it persists the Consul ACL token to the client state and looks up an existing token before creating a new one.

Fixes: #20184
Fixes: #20185
2025-09-04 15:27:57 +02:00
Daniel Bennett
3ad22ddad5 e2e: ui: fix token form fill (#26692)
look, I know I misspelled "locater" in the code comment, but it's easier to acknowledge that here in this commit message than it is to push a new commit with all the test/approval machinery in github.
2025-09-03 12:11:35 -04:00
dependabot[bot]
d0db16386f chore(deps): bump github.com/stretchr/testify from 1.10.0 to 1.11.1 (#26669) 2025-09-03 15:22:58 +01:00
Piotr Kazmierczak
14e98a2420 scheduler: fix promotions of system job canaries (#26652)
This changeset adjusts the handling of allocation placement when we're
promoting a deployment, and it corrects the behavior of isDeploymentComplete,
which previously would never mark a promoted deployment as complete.
2025-09-03 16:09:36 +02:00
James Rasell
269e05ba33 test: Migrate volumewatcher to must and fix racy test. (#26686)
The TestVolumeWatch_LeadershipTransition test was a little racy
and the fix required adding an eventually wrapper to the end of
the test. While doing this work, it seemed fit to move the package
to the must library also.
2025-09-03 14:21:10 +01:00
James Rasell
270ab1011e lint: Enable and fix SA9004 constant type lint errors. (#26678)
When creating constants with a custom type, each definition should
include the type definition. If only the first constant defines
this, it will have a different type to the other constants.

This change fixes occurrences of this and enables SA9004 within CI
linting to catch future problems while the change is in review.
2025-09-03 07:45:29 +01:00
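The problem that SA9004 (commit 270ab1011e) guards against can be reproduced in a few lines. In the sketch below, the `Priority` type and constant names are made up for illustration. In the first group only the first constant carries the custom type, so the second is an untyped constant that defaults to `int`.

```go
package main

import "fmt"

// Priority is a hypothetical custom type used only for this example.
type Priority int

// Flagged pattern: only PriorityLow has type Priority. PriorityHigh is
// an untyped constant and becomes an int when stored in an interface.
const (
	PriorityLow  Priority = 1
	PriorityHigh          = 2
)

// Fixed pattern: every constant states its type explicitly.
const (
	FixedLow  Priority = 1
	FixedHigh Priority = 2
)

// isPriority reports whether v holds a value of type Priority.
func isPriority(v any) bool {
	_, ok := v.(Priority)
	return ok
}

func main() {
	fmt.Println(isPriority(PriorityLow), isPriority(PriorityHigh)) // true false
	fmt.Println(isPriority(FixedLow), isPriority(FixedHigh))       // true true
}
```

The mismatch matters wherever the type drives behavior, for example in type switches or methods defined on the custom type.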
Chris Roberts
b856e065f2 Merge pull request #26440 from hashicorp/f-winsvc-service
Add Windows service commands and Event Log support
2025-09-02 17:10:19 -07:00
Chris Roberts
c3dcdb5413 [cli] Add windows service commands (#26442)
Adds a new `windows` command which is available when running on
a Windows host. The command includes two new subcommands:

* `service install`
* `service uninstall`

The `service install` command will install the called binary into
the Windows program files directory, create a new Windows service,
set up configuration and data directories, and register the service
with the Windows Eventlog. If the service and/or binary already
exist, the service will be stopped, service and eventlog updated
if needed, binary replaced, and the service started again.

The `service uninstall` command will stop the service, remove the
Windows service, and deregister the service with the eventlog. It
will not remove the configuration/data directory nor will it remove
the installed binary.
2025-09-02 16:40:35 -07:00
Chris Roberts
61c36bdef7 [winsvc] Add support for Windows Eventlog (#26441)
Defines a `winsvc.Event` type which can be sent using the `winsvc.SendEvent`
function. If nomad is running on Windows and can send to the Windows
Eventlog the event will be sent. Initial event types are defined for
starting, ready, stopped, and log message.

The `winsvc.EventLogger` provides an `io.WriteCloser` that can be included
in the logger's writers collection. It will extract the log level from
log lines and write them appropriately to the eventlog. The eventlog
only supports error, warning, and info levels so messages with other
levels will be ignored.

A new configuration block is included for enabling logging to the
eventlog. Logging must be enabled with the `log_level` option and
the `eventlog.level` value can then be of the same or higher severity.
2025-09-02 16:40:31 -07:00
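The level extraction described for `winsvc.EventLogger` in 61c36bdef7 can be sketched as a small `io.WriteCloser`. Everything here is an assumption made for illustration: the `levelWriter`, `sink`, and `parseLevel` names are hypothetical, and the log line layout (a bracketed level such as `[INFO]` after the timestamp) is assumed rather than taken from the Nomad source.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// sink is a hypothetical stand-in for the Windows Eventlog.
type sink struct{ entries []string }

// levelWriter is an illustrative writer, not Nomad's implementation.
type levelWriter struct{ out *sink }

// Compile-time check that levelWriter satisfies io.WriteCloser.
var _ io.WriteCloser = (*levelWriter)(nil)

// parseLevel returns the upper-cased text between the first "[" and the
// following "]" of a log line (assumed format).
func parseLevel(line string) (string, bool) {
	start := strings.Index(line, "[")
	if start < 0 {
		return "", false
	}
	end := strings.Index(line[start:], "]")
	if end < 0 {
		return "", false
	}
	return strings.ToUpper(strings.TrimSpace(line[start+1 : start+end])), true
}

// Write forwards error, warning, and info lines and drops all others.
func (w *levelWriter) Write(p []byte) (int, error) {
	for _, raw := range bytes.Split(p, []byte("\n")) {
		line := strings.TrimSpace(string(raw))
		level, ok := parseLevel(line)
		if !ok {
			continue
		}
		switch level {
		case "ERROR", "WARN", "INFO":
			w.out.entries = append(w.out.entries, level+": "+line)
		}
	}
	return len(p), nil
}

func (w *levelWriter) Close() error { return nil }

func main() {
	s := &sink{}
	w := &levelWriter{out: s}
	fmt.Fprintln(w, "2025-09-02T16:40:31Z [INFO]  agent: started")
	fmt.Fprintln(w, "2025-09-02T16:40:32Z [DEBUG] agent: noisy detail")
	fmt.Fprintln(w, "2025-09-02T16:40:33Z [ERROR] agent: something failed")
	w.Close()
	fmt.Println(len(s.entries)) // 2
}
```

Returning `len(p)` even for dropped lines keeps the writer well behaved when it is one of several writers behind a logger.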
Chris Roberts
48d91dc1f9 [winsvc] Add interfaces for Windows services and service manager
Provides interfaces to the Windows service manager and Windows
services. These interfaces support creating new Windows services,
deleting Windows services, configuring Windows services, and
registering/deregistering services with Windows Eventlog.

A path helper is included to support expansion of paths using a
subset of known folder IDs.

A privileged helper is included to check that the process is
currently being executed with elevated privileges, which are
required for managing Windows services and modifying the registry.
2025-09-02 16:39:45 -07:00
boruszak
10658a9391 Syntax fix 2025-09-02 12:44:48 -07:00
dependabot[bot]
df3c74ff55 chore(deps): bump github.com/aws/aws-sdk-go-v2/config (#26668) 2025-09-02 14:23:53 +01:00
dependabot[bot]
242cbed90d chore(deps): bump google.golang.org/grpc from 1.74.2 to 1.75.0 (#26670) 2025-09-02 12:25:46 +01:00
dependabot[bot]
3840ae63c0 chore(deps): bump github.com/docker/cli (#26666) 2025-09-02 11:32:51 +01:00
James Rasell
cddc1b0127 config: Validate keyring config to catch invalid provider types. (#26673) 2025-09-02 11:07:49 +01:00
James Rasell
267dc72f4e e2e: Correctly handle IMDSv2 when discovering UI proxy address. (#26674)
The call to IMDSv1 has been failing since we switched to v2, which
meant the UI e2e script attempted to use the service IP address
for its tests. The service IP address is the Nomad client's
private address, which is not routable from the e2e test runner,
so the test fails.

This change updates the IP discovery to use IMDSv2, which means the
address is correctly populated and routable. The change also performs
this discovery through a job action within the proxy job, which
exercises that feature and uses it in the way for which it was
designed.
2025-09-02 11:02:48 +01:00
James Rasell
ab2a25018a deps: Update github.com/ulikunitz/xz to v0.5.15 (#26671) 2025-09-02 10:21:42 +01:00
dependabot[bot]
5f09631efe chore(deps): bump github.com/aws/aws-sdk-go-v2/feature/ec2/imds (#26667) 2025-09-01 08:45:40 +01:00
James Rasell
d5f2c0201e e2e: Wait for keyring before starting client intro client agents. (#26660)
Ensuring the keyring is ready before starting the Nomad client in
the client intro e2e test speeds up execution. This is because the
client does not have to wait to retry failed registrations due to
the keyring not being ready.
2025-09-01 07:32:40 +01:00
tehut
87be37e8cc nmd 940/pnpm related build failures (#26659)
* replace yarn with pnpm in build scripts
* pin node version to v20
* pin pnpm version to pnpm@10.15.0
2025-08-29 09:34:58 -07:00
dependabot[bot]
5f1eb5c552 chore(deps): bump github.com/ulikunitz/xz from 0.5.12 to 0.5.14 (#26658) 2025-08-29 09:30:11 +01:00
James Rasell
6bd8bc6c0c e2e: Add client intro test to exercise strict enforcement (#26648) 2025-08-29 08:40:48 +01:00
James Rasell
07bd1de72e e2e: Update UI playwright container image to v1.55.0 (#26650) 2025-08-28 16:41:57 +01:00
Piotr Kazmierczak
8b8e21dc0e scheduler: check if system job deploy is complete before other guards (#26651) 2025-08-28 17:29:13 +02:00
Piotr Kazmierczak
de342ee48b scheduler: correct dstate total/canary counts for system deployments (#26641) 2025-08-28 16:24:52 +02:00
James Rasell
9e893ef2ad e2e: Add Client Intro test framework and initial test. (#26639)
The new client intro test mimics the Consul and Vault compat tests
and uses local agents to perform the required setup. This method
gives us the flexibility to test strict enforcement mode going
forward.

The test suite will now be triggered from the test-e2e CI run
and can also be called by a make target.
2025-08-28 09:53:07 +01:00
James Rasell
9d1d5f2f03 csi: Correctly sort IDs when listing controller plugin clients. (#26640) 2025-08-28 08:05:58 +01:00
Michael Smithhisler
485356c3d3 csi: fix volume registration error (#26642) 2025-08-27 15:00:16 -04:00
Tim Gross
5f34867420 build: fix copywrite configuration file syntax (#26644)
Because the Enterprise code has a set of copywrite exclusion entries below the
one listed here in CE, we need to make sure that the last CE line in the
configuration file ends in a comma.
2025-08-27 14:15:24 -04:00
Chris Roberts
fd1e40537c [artifact] add artifact inspection after download (#26608)
This adds artifact inspection after download to detect any issues
with the content fetched. Currently this means checking for any
symlinks within the artifact that resolve outside the task or
allocation directories. On platforms where lockdown is available
(some Linux systems), this inspection is not performed.

The inspection can be disabled with the DisableArtifactInspection
option. A dedicated option for disabling this behavior allows
the DisableFilesystemIsolation option to be enabled but still
have artifacts inspected after download.
2025-08-27 10:37:34 -07:00