Commit Graph

26575 Commits

Author SHA1 Message Date
Aimee Ukasick
af5e2a742e Docs Feature: Add clone and edit feature (#24593)
* Docs: Add clone and edit feature

CE-741

* Change clone and edit heading level

* A few word tweaks
2024-12-05 09:21:27 -06:00
Michael Smithhisler
a43f46e247 ci: only cache from github hosted runners to avoid file perm issues (#24607) 2024-12-04 12:27:26 -05:00
Tim Gross
4f3de69537 test: fix panic from concurrent writes in periodic dispatch test (#24602)
Test setup for the `TestPeriodicDispatch_Add_TriggersUpdate` test can panic if
the goroutine for the runner is running concurrently with adding the job the
second time.

Update the test as follows:
* Make a copy when mutating the job before adding it.
* Add a lock around checking if the dispatcher has a waiting eval.
* Update to use `shoenig/test` in lieu of `testify`.
2024-12-04 09:57:38 -05:00
Tim Gross
da786f64c7 helper: sanitize method on ACL token object (#24600)
There are several places where we want to redact the secret ID of an ACL token,
some of which are in the Enterprise code base for Sentinel. Add a new method
`Sanitize` that mirrors the one we have on `Node`.

Ref: https://github.com/hashicorp/nomad-enterprise/pull/2087
2024-12-03 14:02:30 -05:00
CJ
4563165196 Update sentinel.mdx (#24598) 2024-12-03 11:24:06 -05:00
Phil Renaud
4b91c17dfa [ui, ci] retain artifacts from test runs including test timing (#24555)
* retain artifacts from test runs including test timing

* Pinning commit hashes for action helpers

* trigger for ui-test run

* Trying to isolate down to a simple upload

* Once more with mkdir

* What if we just wrote our own test reporter tho

* Let the partitioned runs handle placement

* Filter out common token logs, add a summary at the end, and note failures in logtime

* Custom reporter cannot also have an output file, he finds out two days late

* Aggregate summary, duration, and removing failure case

* Conditional test report generation

* Timeouts are errors

* Trying with un-partitioned input json file

* Remove the commented-out lines for main-only runs

* combine-ui-test-results as its own script
2024-12-03 09:56:06 -05:00
Anthony
97d14c91dc Merge pull request #24588 from hashicorp/security-model-doc-fix-title
Fix doc title in security.mdx
2024-12-02 13:05:39 -05:00
CJ
b603b97d26 Update security.mdx 2024-12-02 11:43:24 -06:00
Michael Smithhisler
11ae64acb0 drivers: defer executor cleanup func to fix executor leak (#24495) 2024-12-02 12:25:32 -05:00
Daniel Bennett
e963d55ea0 release: always use service user for git ops (#24546) 2024-12-02 10:58:43 -06:00
dependabot[bot]
e5da96ee09 chore(deps): bump github.com/zclconf/go-cty-yaml from 1.0.3 to 1.1.0 (#24570)
Bumps [github.com/zclconf/go-cty-yaml](https://github.com/zclconf/go-cty-yaml) from 1.0.3 to 1.1.0.
- [Changelog](https://github.com/zclconf/go-cty-yaml/blob/master/CHANGELOG.md)
- [Commits](https://github.com/zclconf/go-cty-yaml/compare/v1.0.3...v1.1.0)

---
updated-dependencies:
- dependency-name: github.com/zclconf/go-cty-yaml
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-02 10:37:36 -06:00
dependabot[bot]
b7f24793ac chore(deps): bump github.com/docker/docker (#24471)
Bumps [github.com/docker/docker](https://github.com/docker/docker) from 27.1.1+incompatible to 27.3.1+incompatible.
- [Release notes](https://github.com/docker/docker/releases)
- [Commits](https://github.com/docker/docker/compare/v27.1.1...v27.3.1)

---
updated-dependencies:
- dependency-name: github.com/docker/docker
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-02 09:02:03 -06:00
dependabot[bot]
33f8a7a35e chore(deps): bump github.com/hashicorp/go-sockaddr from 1.0.6 to 1.0.7 (#24571)
Bumps [github.com/hashicorp/go-sockaddr](https://github.com/hashicorp/go-sockaddr) from 1.0.6 to 1.0.7.
- [Release notes](https://github.com/hashicorp/go-sockaddr/releases)
- [Commits](https://github.com/hashicorp/go-sockaddr/compare/v1.0.6...v1.0.7)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/go-sockaddr
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-02 08:59:04 -06:00
Michael Smithhisler
4e2d9675e7 executor: fail early on reattach if listener is not executor (#24538) 2024-12-02 09:56:00 -05:00
dependabot[bot]
b293e6a82a chore(deps): bump github.com/hashicorp/vault/api from 1.10.0 to 1.15.0 (#24572)
Bumps [github.com/hashicorp/vault/api](https://github.com/hashicorp/vault) from 1.10.0 to 1.15.0.
- [Release notes](https://github.com/hashicorp/vault/releases)
- [Changelog](https://github.com/hashicorp/vault/blob/main/CHANGELOG-v1.10-v1.15.md)
- [Commits](https://github.com/hashicorp/vault/compare/v1.10.0...v1.15.0)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/vault/api
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-02 08:49:38 -06:00
Phil Renaud
76e39b1c1e [ui] Volumes and plugins navigation fixes, generally (#24542)
* Volumes and plugins navigation fixes, generally

* Mirage no longer has to take the csi/ string into account

* Volume adapter test fix
2024-11-29 16:14:17 -05:00
Phil Renaud
de96c3498b [ui] "Clone and Edit" functionality for versions (#24168)
* Edit from Version functionality

* Reworked as Clone and Revert

* Change name warning for cloning as new job, version 0 checking, and erroring on sourceless clone

* If you try to plan a new version of a job with a different name from the one you're editing, suggest that you might want to run a new job instead

* A few code comments and log cleanup

* Scaffolding new acceptance tests

* A whack of fun new tests

* Unit test for version number url passing on fetchRawDef

* Bit of cleanup

* fetchRawDefinition gets version support at adapter layer

* Handle spec-not-available-but-definition-is for clone-as-new-job
2024-11-29 16:12:56 -05:00
James Rasell
261359fba7 agent: Fix a bug where retry_join was not retrying. (#24561)
The retry_join logic was not allowing for retries to happen and
was exiting after the first failed discovery attempt. This change
fixes that behaviour and adds a test to ensure no further
regressions.
2024-11-29 08:29:15 +00:00
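The behaviour restored by commit 261359fba7 (keep trying after a failed discovery attempt instead of exiting) can be sketched roughly as follows. This is not Nomad's agent code: `retryJoin`, `discover`, `maxAttempts`, and `interval` are hypothetical names chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// retryJoin is a hypothetical helper, not Nomad's agent code. It keeps calling
// discover until it yields at least one address or maxAttempts is exhausted.
func retryJoin(discover func() ([]string, error), maxAttempts int, interval time.Duration) ([]string, error) {
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		addrs, err := discover()
		if err == nil && len(addrs) > 0 {
			return addrs, nil
		}
		if err == nil {
			err = errors.New("discovery returned no addresses")
		}
		// A failed attempt is recorded, not returned: this is the retry.
		lastErr = err
		if attempt < maxAttempts {
			time.Sleep(interval)
		}
	}
	return nil, fmt.Errorf("join failed after %d attempts: %w", maxAttempts, lastErr)
}

func main() {
	calls := 0
	// Simulated discovery that fails twice before finding a server.
	discover := func() ([]string, error) {
		calls++
		if calls < 3 {
			return nil, errors.New("no servers found yet")
		}
		return []string{"10.0.0.1:4648"}, nil
	}
	addrs, err := retryJoin(discover, 5, 10*time.Millisecond)
	fmt.Println(addrs, err, calls)
}
```

The regression test mentioned in the commit would, under these assumptions, assert that `discover` is called more than once when the first attempt fails.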
Piotr Kazmierczak
3a18f22c18 goflags: go:build linux for tests that won't compile on other platforms (#24559)
I'm a heavy LSP user and I frequently goto:next_error. This confuses my
editor on macOS.
2024-11-28 15:05:00 +01:00
dependabot[bot]
1f29a95c24 chore(deps): bump golang.org/x/sync from 0.8.0 to 0.9.0 (#24550) 2024-11-25 12:37:54 +00:00
dependabot[bot]
629e869a75 chore(deps): bump github.com/fatih/color from 1.17.0 to 1.18.0 (#24549) 2024-11-25 09:18:07 +00:00
dependabot[bot]
1f502c1d13 chore(deps): bump github.com/stretchr/testify from 1.9.0 to 1.10.0 (#24548) 2024-11-25 08:28:31 +00:00
dependabot[bot]
b290e753c3 chore(deps): bump actions/setup-node from 4.0.4 to 4.1.0 (#24300)
Bumps [actions/setup-node](https://github.com/actions/setup-node) from 4.0.4 to 4.1.0.
- [Release notes](https://github.com/actions/setup-node/releases)
- [Commits](0a44ba7841...39370e3970)

---
updated-dependencies:
- dependency-name: actions/setup-node
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-24 09:28:17 -05:00
Piotr Kazmierczak
f7a4ded2c0 security: add CT executeTemplate to default function_denylist (#24541)
This PR adds Consul Template's executeTemplate function to the denylist by
default, in order to prevent accidental or malicious infinitely recursive
execution.

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2024-11-22 19:33:56 +01:00
Piotr Kazmierczak
368241dbf2 security: a more comprehensive env.denylist (#24540)
A more comprehensive env.denylist that now includes more token, token file and
license variables. 

---------

Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
2024-11-22 18:54:18 +01:00
Yucong Sun
642e33ae41 CSI: fix topology matching logic (#24522)
Some plugins emit multiple topology segment entries for the same segment (e.g., newer versions of AWS EBS) to accommodate convention changes in k8s. Check that segments are a superset instead of exactly equal to the plugin's topology segments.
2024-11-22 09:22:36 -05:00
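For commit 642e33ae41, the difference between an equality check and a superset check can be illustrated with plain maps. This is a minimal sketch, not the scheduler's code: `isSuperset` is a hypothetical name, the direction of the check (offered segments must contain every requested pair) is an assumption, and the segment keys in `main` are illustrative values only.

```go
package main

import "fmt"

// isSuperset reports whether every key/value pair in required also appears
// in offered. Extra pairs in offered are allowed, which is the difference
// from an exact-equality comparison.
func isSuperset(offered, required map[string]string) bool {
	for k, v := range required {
		if got, ok := offered[k]; !ok || got != v {
			return false
		}
	}
	return true
}

func main() {
	// Illustrative: two keys describing the same zone.
	offered := map[string]string{
		"topology.ebs.csi.aws.com/zone": "us-east-1a",
		"topology.kubernetes.io/zone":   "us-east-1a",
	}
	// The request names only one of the two keys.
	required := map[string]string{
		"topology.ebs.csi.aws.com/zone": "us-east-1a",
	}
	fmt.Println(isSuperset(offered, required)) // an equality check would reject this
}
```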
Juana De La Cuesta
c21dfdb17a [gh-476] Sanitise HCL variables before storing on job submission (#24423)
* func: Use URL rules to escape non-alphanumeric values in HCL variables

* docs: add changelog

* func: unescape flags before returning

* use JSON.stringify instead of bespoke value quoting to handle in-value-multi-line cases

---------

Co-authored-by: Phil Renaud <phil@riotindustries.com>
2024-11-22 11:45:02 +01:00
Martijn Vegter
997da25cdb scheduler: take all assigned cpu cores into account instead of only those part of the largest lifecycle (#24304)
Fixes a bug in the AllocatedResources.Comparable method, where the scheduler
would only take into account the cpusets of the tasks in the largest lifecycle.
This could result in overlapping cgroup cpusets. Now we make the distinction
between reserved and fungible resources throughout the lifespan of the alloc.
In addition, added logging in case of future regressions thus not requiring
manual inspection of cgroup files.
2024-11-21 13:21:48 -05:00
Juana De La Cuesta
a9e7166b6b [gh-24339] Move from streaming stats to polling for docker (#24525)
* fix: don't stream the docker stats, read them one by one

* func: add a NewSafeTicker to the helper functions

* style: remove commented code
2024-11-21 17:36:53 +01:00
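Commit a9e7166b6b replaces a streaming read with periodic one-shot reads. The general polling pattern looks roughly like the sketch below. It uses the standard library's `time.NewTicker` and does not reproduce the `NewSafeTicker` helper named in the commit; `pollStats`, `read`, `emit`, and `stop` are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// pollStats is a hypothetical helper. It calls read once per tick and hands
// each sample to emit until stop is closed or read fails. It returns the
// number of samples emitted.
func pollStats(stop <-chan struct{}, interval time.Duration, read func() (uint64, error), emit func(uint64)) (int, error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop() // release the ticker when polling ends

	samples := 0
	for {
		select {
		case <-stop:
			return samples, nil
		case <-ticker.C:
			// One request per tick instead of holding a stream open.
			v, err := read()
			if err != nil {
				return samples, err
			}
			emit(v)
			samples++
		}
	}
}

func main() {
	stop := make(chan struct{})
	var usage uint64
	read := func() (uint64, error) { usage += 10; return usage, nil }
	seen := 0
	emit := func(v uint64) {
		fmt.Println("sample:", v)
		seen++
		if seen == 3 {
			close(stop) // stop polling after three samples
		}
	}
	n, err := pollStats(stop, 5*time.Millisecond, read, emit)
	fmt.Println(n >= 3, err)
}
```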
Martijn Vegter
bfb714144e client: fixed a bug where AMD CPUs were not correctly fingerprinting base speed (#24415)
Relates to: #19468
2024-11-21 09:08:47 -06:00
Piotr Kazmierczak
6ccfcc37a3 scheduler: fix a bug where force GC wasn't respected (#24456)
This PR fixes a bug where System.GarbageCollect endpoint didn't work on objects
that weren't older than their respective GC thresholds. System.GarbageCollect
is used to force garbage collection (also used by the system gc command) and
should ignore any GC threshold settings.
2024-11-21 09:07:23 +01:00
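The rule stated in commit 6ccfcc37a3 (a forced collection ignores GC thresholds) reduces to a small predicate. The sketch below is only an illustration of that rule: `shouldCollect` and its parameters are hypothetical and do not correspond to Nomad's core scheduler code.

```go
package main

import (
	"fmt"
	"time"
)

// shouldCollect is a hypothetical predicate. A forced collection bypasses the
// threshold entirely; otherwise an object must be at least as old as its
// threshold to be eligible.
func shouldCollect(age, threshold time.Duration, force bool) bool {
	if force {
		return true
	}
	return age >= threshold
}

func main() {
	age := 10 * time.Minute
	threshold := time.Hour
	fmt.Println(shouldCollect(age, threshold, false)) // periodic GC: too young
	fmt.Println(shouldCollect(age, threshold, true))  // forced GC: threshold ignored
}
```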
Matt McQuillan
a6fbd5a2e2 add default for codeowners file (#24517) 2024-11-20 13:23:47 -08:00
Seth Hoenig
dd396a3900 windows: revert process listing logic to that of v1.6.10 (#24494)
* windows: revert process listing logic to that of v1.6.10

In Nomad 1.7 much of the process management code was refactored, including
a rewrite of how the process tree of an executor was determined on Windows
machines. Unfortunately that rewrite has been cursed with performance issues
and bugs. Instead, revert to the logic used in v1.6.10.

* changelog
2024-11-20 11:20:20 -06:00
Tim Gross
6b9dbefb9e consul: handle nil multierror pointer correctly (#24513)
When the service client syncs to Consul, we accumulate service sync errors in a
multierror before reading all the local checks. If the API call to the local
checks fails, we either return that error or append it to the multierror and
return the set of errors. But `multierror.Error.Len()` doesn't nil-check, so we
need to do this ourselves.

I've also made a quick pass through the rest of the code base looking for
multierror `Len` method calls to see if we have this pattern elsewhere.

Fixes: https://github.com/hashicorp/nomad/issues/24512
2024-11-20 10:55:52 -05:00
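The nil-pointer hazard described in commit 6b9dbefb9e can be reproduced without the real library. The sketch below uses a local stand-in type, `multiError`, whose `Len` method does not nil-check, which is the behaviour the commit message attributes to `multierror.Error.Len()`. Both `multiError` and `safeLen` are hypothetical names for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// multiError is a local stand-in for an accumulated-errors type. Its Len
// method has a value receiver, so calling it through a nil pointer
// dereferences nil and panics.
type multiError struct {
	Errors []error
}

func (e multiError) Len() int { return len(e.Errors) }

// safeLen is a hypothetical guard: it performs the nil check that Len does not.
func safeLen(e *multiError) int {
	if e == nil {
		return 0
	}
	return e.Len()
}

func main() {
	var mErr *multiError // nothing has been accumulated yet
	fmt.Println(safeLen(mErr))

	mErr = &multiError{Errors: []error{errors.New("service sync failed")}}
	fmt.Println(safeLen(mErr))
}
```

Calling `mErr.Len()` directly on the nil pointer in `main` would panic, which is why the caller has to do the check itself.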
James Rasell
beb4097e81 client: mark the remote_task hook as deprecated. (#24505) 2024-11-20 15:32:50 +00:00
Juana De La Cuesta
25cc492a16 docs: update the job subcommands on the docs (#24506) 2024-11-20 08:37:43 -06:00
Phil Renaud
83b30128a0 Add an image of the rendered UI block for a jobspec (#24481) 2024-11-20 09:33:47 -05:00
Phil Renaud
0023edd3ec Updates Playwright in response to an E2E nightly failure (#24487) 2024-11-20 09:33:27 -05:00
Piotr Kazmierczak
9c5078f151 agent: set content type header explicitly (#24489)
This PR addresses an XSS vulnerability where Nomad agents wouldn't explicitly
set content type headers for error responses.
2024-11-20 10:18:30 +01:00
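The mitigation in commit 9c5078f151 is to set the header explicitly before writing an error body. When a handler leaves `Content-Type` unset, Go's `net/http` may infer one from the first bytes of the body, so an error message containing markup could be served as HTML. The sketch below is not the agent's handler: `writeError` and `renderError` are hypothetical names, and the chosen content type is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// writeError is a hypothetical helper that always declares the error body as
// plain text, so a client does not interpret markup echoed in the message.
func writeError(w http.ResponseWriter, code int, msg string) {
	w.Header().Set("Content-Type", "text/plain; charset=utf-8")
	w.WriteHeader(code)
	_, _ = w.Write([]byte(msg))
}

// renderError runs writeError against an in-memory recorder so the resulting
// headers and body can be inspected.
func renderError(code int, msg string) *httptest.ResponseRecorder {
	rec := httptest.NewRecorder()
	writeError(rec, code, msg)
	return rec
}

func main() {
	rec := renderError(http.StatusBadRequest, "<script>alert(1)</script>")
	fmt.Println(rec.Code, rec.Header().Get("Content-Type"))
}
```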
James Rasell
11bba3dbcd docs: fix broken link within enterprise Sentinel docs. (#24486) 2024-11-20 07:43:30 +00:00
Florian Apolloner
0a343798b6 Add NOMAD_* variables to CNI args. Fixes #23830 (#24319)
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
2024-11-19 12:48:48 -08:00
Aimee Ukasick
4dfedf1aef add top-level heading so the page renders correctly (#24491)
Add opening paragraph; update description
2024-11-19 11:10:10 -06:00
Phil Renaud
4708e06199 [ui] Fixes double-namespace-query-param when getting versions (#24466) 2024-11-19 10:53:37 -05:00
Tim Gross
a420732424 consul: allow non-root Nomad to rewrite token (#24410)
When a task restarts, the Nomad client may need to rewrite the Consul token, but
it's created with permissions that prevent a non-root agent from writing to
it. While Nomad clients should be run as root (currently), it's harmless to
allow whatever user the Nomad agent is running as to be able to write to it, and
that's one less barrier to rootless Nomad.

Ref: https://github.com/hashicorp/nomad/issues/23859#issuecomment-2465757392
2024-11-19 10:21:14 -05:00
James Rasell
dc501339da docs: Add federated region concept and operations pages. (#24477)
In order to help users understand multi-region federated
deployments, this change adds two new sections to the website.

The first expands the architecture page, so we can add further
detail over time with an initial federation page. The second adds
a federation operations page which goes into failure planning and
mitigation.

Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2024-11-19 12:39:57 +00:00
Gabi
89c3d69d79 nsutil: wrap error that comes from the syscall so caller can do errors.As (#24480)
Users of the `nsutil` library should be able to do the following and have it
work:

```
var errno syscall.Errno
if errors.As(err, &errno) {
	if errno == unix.EBUSY { ... }
}
```

This commit fixes that issue.
2024-11-19 10:24:49 +01:00
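For commit 89c3d69d79, whether `errors.As` can see the underlying errno depends on how the error was wrapped. The sketch below contrasts `%w` (keeps the chain) with `%v` (flattens to text). It uses `syscall.EBUSY` from the standard library in place of `unix.EBUSY` so that it has no external dependency; `wrapErr`, `flattenErr`, `isBusy`, and `errBusy` are hypothetical names and do not come from `nsutil`.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// errBusy stands in for an error returned by a failed syscall.
var errBusy error = syscall.EBUSY

// wrapErr adds context with %w, so the original error stays in the chain.
func wrapErr(op string, err error) error {
	return fmt.Errorf("%s: %w", op, err)
}

// flattenErr adds context with %v, which keeps only the text of the error.
func flattenErr(op string, err error) error {
	return fmt.Errorf("%s: %v", op, err)
}

// isBusy is the caller-side check from the commit message.
func isBusy(err error) bool {
	var errno syscall.Errno
	return errors.As(err, &errno) && errno == syscall.EBUSY
}

func main() {
	fmt.Println(isBusy(wrapErr("unmount namespace", errBusy)))    // chain preserved
	fmt.Println(isBusy(flattenErr("unmount namespace", errBusy))) // chain lost
}
```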
Tim Gross
6be9a50626 vault: catch expired lease as fatal error (#24409)
When a Vault lease expires, it's revoked on the server and cannot be removed, so
this error should be treated as fatal.

The errors we get aren't wrapped by the Vault SDK, so unfortunately we have to
read the error messages and can't easily enumerate non-fatal error
messages (which might be bubbling up from the stdlib). I've audited the errors
currently used and have documented their source.

Ref 52ba156d47/vault/expiration.go (L1327)
Fixes: https://github.com/hashicorp/nomad/issues/23859
2024-11-18 09:12:35 -05:00
Juana De La Cuesta
270b4f97a6 Update some details of the terraform readme file for e2e provisioning (#24451)
* docs: update instructions to provision e2e cluster

* Update e2e/terraform/README.md

Co-authored-by: Piotr Kazmierczak <470696+pkazmierczak@users.noreply.github.com>

* Update e2e/terraform/terraform.tfvars

Co-authored-by: Piotr Kazmierczak <470696+pkazmierczak@users.noreply.github.com>

* Update e2e/terraform/README.md

Co-authored-by: Piotr Kazmierczak <470696+pkazmierczak@users.noreply.github.com>

---------

Co-authored-by: Piotr Kazmierczak <470696+pkazmierczak@users.noreply.github.com>
2024-11-18 13:36:51 +01:00
Juana De La Cuesta
1f944196d9 Allow scaling system jobs to 0 (#24363)
* func: remove scaling validation for system jobs and don't canonicalize to 1

* test: update test to validate for 0 and improve error message

* func: remove the canonicalization to 1 from system jobs

* docs: add changelog

* func: add test for scaling system jobs

* temp: add logging to debug test

* fix: clean up after test is done

* fix: scaled down jobs will still have the stop allocation, update test to account for it

* Update the e2e test to account for system jobs having an alloc per node

* fix: filter to only count ready nodes on the node count

* fix: remove the datacenter constraint from the system job definition

* fix: compare alloc IDs to avoid flaky tests when verifying no alloc was stopped

* fix: remove duplicated code
2024-11-18 13:35:47 +01:00
dependabot[bot]
3dfbc890b2 chore(deps): bump github.com/creack/pty from 1.1.23 to 1.1.24 (#24470)
Bumps [github.com/creack/pty](https://github.com/creack/pty) from 1.1.23 to 1.1.24.
- [Release notes](https://github.com/creack/pty/releases)
- [Commits](https://github.com/creack/pty/compare/v1.1.23...v1.1.24)

---
updated-dependencies:
- dependency-name: github.com/creack/pty
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-18 09:44:14 +01:00