23331 Commits

Author SHA1 Message Date
Seth Hoenig
ab2e6e87b8 jobspec: ensure service uniqueness in job validation 2022-07-20 12:38:08 -05:00
Seth Hoenig
87ef5178d1 cleanup: track task names and providers using set 2022-07-20 11:48:36 -05:00
Seth Hoenig
31dcdb1843 Merge pull request #13859 from hashicorp/exp-use-set
cleanup: example refactoring out map[string]struct{} using set.Set
2022-07-20 11:02:18 -05:00
Seth Hoenig
d2c9ad8567 cleanup: tweaks from cr feedback 2022-07-20 10:42:35 -05:00
Seth Hoenig
b8a7ee9c2a cleanup: example refactoring out map[string]struct{} using set.Set
This PR is a little demo of using github.com/hashicorp/go-set to
replace the use of map[T]struct{} as a makeshift set.
2022-07-19 22:50:49 -05:00
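The refactoring this commit describes can be sketched as follows. This is only an illustration: it does not import github.com/hashicorp/go-set, and the `Set` type, its methods, and `duplicateTaskNames` are hypothetical stand-ins written so the example is self-contained. The point is that a named set type replaces the `map[T]struct{}` bookkeeping at each call site.

```go
package main

import "fmt"

// Set is a hypothetical, minimal generic set used only for illustration.
// It is NOT the go-set API; it just wraps the map[T]struct{} idiom.
type Set[T comparable] struct {
	items map[T]struct{}
}

// NewSet returns an empty set.
func NewSet[T comparable]() *Set[T] {
	return &Set[T]{items: make(map[T]struct{})}
}

// Insert adds item and reports whether it was newly added.
func (s *Set[T]) Insert(item T) bool {
	if _, ok := s.items[item]; ok {
		return false
	}
	s.items[item] = struct{}{}
	return true
}

// Contains reports whether item is in the set.
func (s *Set[T]) Contains(item T) bool {
	_, ok := s.items[item]
	return ok
}

// Size returns the number of elements in the set.
func (s *Set[T]) Size() int { return len(s.items) }

// duplicateTaskNames (hypothetical helper) returns each name that appears
// more than once, in the order its first repeat is seen.
func duplicateTaskNames(names []string) []string {
	seen := NewSet[string]()
	reported := NewSet[string]()
	var dups []string
	for _, n := range names {
		if !seen.Insert(n) && reported.Insert(n) {
			dups = append(dups, n)
		}
	}
	return dups
}

func main() {
	fmt.Println(duplicateTaskNames([]string{"web", "db", "web", "cache", "db", "web"}))
}
```

Compared with a bare map, the call sites read as set operations (`Insert`, `Contains`) rather than map lookups with a discarded empty-struct value.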
Tim Gross
b07f567831 secure vars: rename automatically accessible vars path for jobs (#13848)
Tasks are automatically granted access to variables on a path that matches their
workload identity, with a well-known prefix. Change the prefix to `nomad/jobs`
to allow for future prefixes like `nomad/volumes` or `nomad/plugins`. Reserve
the prefix by emitting errors during validation.
2022-07-19 16:17:34 -04:00
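A minimal sketch of what reserving the prefix during validation might look like. The function name, the error value, and the exact rule (reject everything under `nomad/` except `nomad/jobs`) are assumptions made for illustration; the commit message only states that the prefix is reserved by emitting errors.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

const (
	reservedRoot = "nomad"      // assumed reserved top-level path segment
	jobsPrefix   = "nomad/jobs" // prefix named in the commit message
)

// errReservedPath is a hypothetical error for paths under the reserved root.
var errReservedPath = errors.New("path is under the reserved nomad/ prefix")

// validateVariablePath (hypothetical) accepts any path outside the reserved
// root, accepts nomad/jobs and its children, and rejects everything else
// under nomad/ so prefixes like nomad/volumes stay available for later use.
func validateVariablePath(path string) error {
	if path != reservedRoot && !strings.HasPrefix(path, reservedRoot+"/") {
		return nil // not under the reserved root at all
	}
	if path == jobsPrefix || strings.HasPrefix(path, jobsPrefix+"/") {
		return nil // the automatically accessible jobs prefix
	}
	return fmt.Errorf("%q: %w", path, errReservedPath)
}

func main() {
	for _, p := range []string{"nomad/jobs/example", "nomad/volumes/data", "app/config"} {
		fmt.Printf("%s -> %v\n", p, validateVariablePath(p))
	}
}
```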
dependabot[bot]
df93355f98 build(deps): bump @percy/cli from 1.1.0 to 1.6.1 in /ui (#13724)
Bumps [@percy/cli](https://github.com/percy/cli/tree/HEAD/packages/cli) from 1.1.0 to 1.6.1.
- [Release notes](https://github.com/percy/cli/releases)
- [Commits](https://github.com/percy/cli/commits/v1.6.1/packages/cli)

---
updated-dependencies:
- dependency-name: "@percy/cli"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-19 14:09:11 -04:00
Luiz Aoqui
208b682211 docs: update Autoscaler AWS plugin with new ws_credential_provider config (#13779) 2022-07-19 10:27:55 -04:00
Phil Renaud
b1a9207de9 Prettier-applied lint rules for secure variables test (#13841) 2022-07-19 09:33:53 -04:00
Niklas Hambüchen
d18df07ccb docs: job-specification: Explain that priority has no effect on run order (#13835)
Makes the issues from #9845 and #12792 less surprising to the user.
2022-07-19 08:55:29 -04:00
Andy Assareh
7e29f64aec word typo digestible (#13772) 2022-07-19 09:00:52 +02:00
Phil Renaud
a7025e6ca4 Visual Diff tests for Secure Variables (#13689)
* A smattering of snapshot tests for Secure Variables

* Percy imports and linting
2022-07-18 17:00:45 -04:00
Tim Gross
9457a13c7c fsm: one-time token expiration should be deterministic (#13737)
When applying a raft log to expire ACL tokens, we need to use a
timestamp provided by the leader so that the result is deterministic
across servers. Use the leader's timestamp from the RPC call.
2022-07-18 14:19:29 -04:00
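The idea in this commit can be sketched as below. The `token` and `expireRequest` types and the `applyExpire` function are hypothetical, not Nomad's actual structs; the sketch only shows the principle that the apply step compares against a timestamp carried in the log entry instead of calling `time.Now()` locally.

```go
package main

import (
	"fmt"
	"time"
)

// token is a hypothetical one-time token record.
type token struct {
	ID        string
	ExpiresAt time.Time
}

// expireRequest is a hypothetical raft log payload. The leader stamps
// Timestamp once, before the entry is replicated to other servers.
type expireRequest struct {
	Timestamp time.Time
}

// applyExpire keeps only tokens that expire after the request timestamp.
// It never reads the local clock, so every server applying the same log
// entry to the same state computes the same result.
func applyExpire(tokens []token, req expireRequest) []token {
	kept := make([]token, 0, len(tokens))
	for _, t := range tokens {
		if t.ExpiresAt.After(req.Timestamp) {
			kept = append(kept, t)
		}
	}
	return kept
}

// runExample applies one expiration entry and returns the surviving IDs.
func runExample() []string {
	leaderNow := time.Date(2022, 7, 18, 14, 0, 0, 0, time.UTC)
	tokens := []token{
		{ID: "expired", ExpiresAt: leaderNow.Add(-time.Minute)},
		{ID: "valid", ExpiresAt: leaderNow.Add(time.Minute)},
	}
	req := expireRequest{Timestamp: leaderNow}
	var ids []string
	for _, t := range applyExpire(tokens, req) {
		ids = append(ids, t.ID)
	}
	return ids
}

func main() {
	fmt.Println(runExample())
}
```

Had `applyExpire` used the local clock, a follower replaying the entry later could expire a token that the leader kept, and the servers' state would diverge.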
Seth Hoenig
bd462ebc5f Merge pull request #13813 from hashicorp/docs-move-checks
docs: move checks into own page
2022-07-18 12:27:43 -05:00
Seth Hoenig
e12a0e763e docs: move checks into own page
This PR creates a top-level 'check' page for job-specification docs.

The content for checks is about half the content of the service page, and
is about to increase in size when we add docs about Nomad service checks.
Seemed like a good idea to just split the checks section out into its own
thing (e.g. check_restart is already a topic).

Doing the move first lets us backport this change without adding Nomad service
check stuff yet.

Mostly just a lift-and-shift but with some tweaked examples to de-emphasize
the use of script checks.
2022-07-18 09:34:55 -05:00
Tim Gross
5c0ef26299 docs: ACL policy spec reference (#13787)
The "Secure Nomad with Access Control" guide provides a tutorial for
bootstrapping Nomad ACLs, writing policies, and creating tokens. Add a reference
guide just for the ACL policy specification.
2022-07-18 09:35:28 -04:00
Seth Hoenig
8e25502ab5 Merge pull request #13786 from hashicorp/b-metrics-for-classless-blocked-evals
metrics: classless blocked evals get metrics
2022-07-18 07:34:29 -05:00
Luiz Aoqui
cd047cdc03 docs: update Podman docs to v0.4.0 (#13783) 2022-07-15 18:01:35 -04:00
Michael Schurter
875cf8db51 Improve metrics reference documentation (#13769)
* docs: tighten up parameterized job metrics docs

* docs: improve alloc status descriptions

Remove `nomad.client.allocations.start` as it doesn't exist.
2022-07-15 14:22:39 -07:00
Kyle Penfound
98bd846aa9 packaging: restart nomad service after package update (#13773) 2022-07-15 14:20:04 -07:00
Seth Hoenig
582a8a9362 metrics: even classless blocked evals get metrics
This PR fixes a bug where blocked evaluations with no class set would
not have metrics exported at the dc:class scope.

Fixes #13759
2022-07-15 14:12:44 -05:00
Tim Gross
7967c65dd2 keyring: fix flake in replication-after-election test (#13749)
The test for simulating a key rotation across leader elections was
flaky because we weren't waiting for a leader election and were
checking the server configs rather than raft for which server was
currently the leader. Fixing the flake revealed a bug in the test that
we weren't ensuring the new leader was running its own replication, so
it wouldn't pick up the key material from the previous follower.
2022-07-15 11:09:09 -04:00
Tim Gross
573aa4519e secure vars: updates should reduce quota tracking if smaller (#13742)
When secure variables are updated, we were adding the update to the
existing quota tracking without first checking whether it was an
update to an existing variable. In that case we need to add/subtract
only the difference between the new and existing quota usage.
2022-07-15 11:08:53 -04:00
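The accounting fix described here can be sketched as follows. The `quotaTracker` type and its fields are hypothetical and far simpler than Nomad's state store; the sketch only shows the arithmetic of applying the difference between the new and existing size rather than adding the full new size.

```go
package main

import "fmt"

// quotaTracker is a hypothetical usage counter for one namespace.
type quotaTracker struct {
	used  int64            // total bytes currently counted against the quota
	sizes map[string]int64 // variable path -> size last recorded for it
}

func newQuotaTracker() *quotaTracker {
	return &quotaTracker{sizes: make(map[string]int64)}
}

// upsert records a variable of newSize bytes at path. For a new variable
// the existing size is zero, so the full size is added. For an update only
// the difference is applied, which may be negative when the value shrinks.
func (q *quotaTracker) upsert(path string, newSize int64) {
	existing := q.sizes[path]
	q.used += newSize - existing
	q.sizes[path] = newSize
}

func main() {
	q := newQuotaTracker()
	q.upsert("app/config", 100) // new variable: usage grows by 100
	q.upsert("app/config", 40)  // smaller update: usage shrinks by 60
	q.upsert("app/secret", 25)  // another new variable: usage grows by 25
	fmt.Println(q.used)
}
```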
Seth Hoenig
99a215cd60 Merge pull request #13771 from hashicorp/e2e-nsd-simple-lb
e2e: add nsd simple load balancing test
2022-07-15 08:48:19 -05:00
Seth Hoenig
2d83f130fe e2e: add nsd simple load balancing test 2022-07-14 15:07:19 -05:00
Michael Schurter
717e92d3aa docs: clarify blocked_evals metrics (#13751)
Related to #13740

- blocked_evals.total_blocked is the number of evals blocked for *any*
  reason
- blocked_evals.total_quota_limit is the number of evals blocked by
  quota limits, but critically: their resources are *not* counted in the
  cpu/memory
2022-07-14 11:32:33 -07:00
Tim Gross
d87052a9e1 search: refactor OSS/ENT split for ACL checks (#13760)
The split between OSS/ENT in ACL checks for the Search RPC has a lot
of repeated code that results in merge conflicts. Move most of the
logic into the shared code so that we can call out to thin functions
for ENT checks.
2022-07-14 11:31:08 -04:00
Luiz Aoqui
b7cfbd8e9e Merge pull request #13752 from hashicorp/post-1.3.2-release
Post 1.3.2 release
2022-07-14 10:38:52 -04:00
Seth Hoenig
22d7075457 Merge pull request #13716 from hashicorp/docs-update-consul-warning
docs: remove consul 1.12.0 warning
2022-07-14 08:45:56 -05:00
Tim Gross
3636340f3e keyring: upserting key metadata in FSM must be deterministic (#13733) 2022-07-14 08:38:14 -04:00
Luiz Aoqui
404232661b Merge release 1.3.2 files 2022-07-13 19:35:54 -04:00
hc-github-team-nomad-core
7aeeac1b31 Prepare for next release 2022-07-13 19:34:32 -04:00
hc-github-team-nomad-core
5f8889d522 Generate files for 1.3.2 release 2022-07-13 19:33:41 -04:00
Tim Gross
2fb328249c tests: add a space between node name and timestamp (#13750) 2022-07-13 16:23:03 -04:00
dependabot[bot]
b75beae852 chore(deps): bump github.com/mitchellh/mapstructure from 1.4.3 to 1.5.0 in /api (#12725)
* chore(deps): bump github.com/mitchellh/mapstructure in /api

Bumps [github.com/mitchellh/mapstructure](https://github.com/mitchellh/mapstructure) from 1.4.3 to 1.5.0.
- [Release notes](https://github.com/mitchellh/mapstructure/releases)
- [Changelog](https://github.com/mitchellh/mapstructure/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mitchellh/mapstructure/compare/v1.4.3...v1.5.0)

---
updated-dependencies:
- dependency-name: github.com/mitchellh/mapstructure
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Also bump mapstructure in main go.mod

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2022-07-13 11:57:16 -07:00
Michael Schurter
d857be3c45 http: only log alloc/exec errors when non-nil (#13730) 2022-07-13 09:44:51 -07:00
Phil Renaud
404b47a54c 13553 secure vars linked from jobs (#13708)
* Vars from job prototype

* singular linked variable from job

* Links from task groups and tasks to their variables incl periodic and parameterized

* Lintfix

* Make sure they can list em before we list em

* Tests from job/group/task to var
2022-07-13 11:40:13 -04:00
dependabot[bot]
6e0eb786f9 build(deps): bump github.com/gorilla/websocket from 1.4.2 to 1.5.0 in /api (#12075)
* build(deps): bump github.com/gorilla/websocket in /api

Bumps [github.com/gorilla/websocket](https://github.com/gorilla/websocket) from 1.4.2 to 1.5.0.
- [Release notes](https://github.com/gorilla/websocket/releases)
- [Commits](https://github.com/gorilla/websocket/compare/v1.4.2...v1.5.0)

---
updated-dependencies:
- dependency-name: github.com/gorilla/websocket
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* deps: also bump websocket dep in main binary

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2022-07-12 16:49:31 -07:00
dependabot[bot]
1b6f9170c3 build(deps): bump github.com/docker/distribution (#12246)
Bumps [github.com/docker/distribution](https://github.com/docker/distribution) from 2.7.1+incompatible to 2.8.1+incompatible.
- [Release notes](https://github.com/docker/distribution/releases)
- [Commits](https://github.com/docker/distribution/compare/v2.7.1...v2.8.1)

---
updated-dependencies:
- dependency-name: github.com/docker/distribution
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-12 16:31:56 -07:00
Michael Schurter
be2262eb22 Add semgrep rule to catch non-determinism in FSM (#13725)
See `message:` in rule for details.

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2022-07-12 15:44:24 -07:00
Luiz Aoqui
d456cc1e7f Track plan rejection history and automatically mark clients as ineligible (#13421)
Plan rejections occur when the scheduler worker and the leader plan
applier disagree on the feasibility of a plan. This may happen for valid
reasons: since Nomad does parallel scheduling, it is expected that
different workers will have a different state when computing placements.

As the final plan reaches the leader plan applier, it may no longer be
valid due to a concurrent scheduling taking up intended resources. In
these situations the plan applier will notify the worker that the plan
was rejected and that they should refresh their state before trying
again.

In some rare and unexpected circumstances it has been observed that
workers will repeatedly submit the same plan, even if they are always
rejected.

While the root cause is still unknown this mitigation has been put in
place. The plan applier will now track the history of plan rejections
per client and include in the plan result a list of node IDs that should
be set as ineligible if the number of rejections in a given time window
crosses a certain threshold. The window size and threshold value can be
adjusted in the server configuration.

To avoid marking several nodes as ineligible at once, the operation is rate
limited to 5 nodes every 30min, with an initial burst of 10 operations.
2022-07-12 18:40:20 -04:00
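The per-client window and threshold tracking described above can be sketched as below. The `rejectionTracker` type is hypothetical, the choice of "at or above the threshold" as the trigger is an assumption, and the rate limiting mentioned in the commit message is deliberately left out to keep the sketch short.

```go
package main

import (
	"fmt"
	"time"
)

// rejectionTracker is a hypothetical per-node history of plan rejections.
type rejectionTracker struct {
	window    time.Duration          // how far back rejections are counted
	threshold int                    // assumed trigger: count >= threshold
	history   map[string][]time.Time // node ID -> rejection times in window
}

func newRejectionTracker(window time.Duration, threshold int) *rejectionTracker {
	return &rejectionTracker{
		window:    window,
		threshold: threshold,
		history:   make(map[string][]time.Time),
	}
}

// add records a rejection for nodeID at time now, drops entries that fell
// out of the window, and reports whether the node has reached the threshold
// and would therefore be a candidate for being marked ineligible.
func (r *rejectionTracker) add(nodeID string, now time.Time) bool {
	cutoff := now.Add(-r.window)
	kept := r.history[nodeID][:0] // filter in place, reusing the backing array
	for _, ts := range r.history[nodeID] {
		if ts.After(cutoff) {
			kept = append(kept, ts)
		}
	}
	kept = append(kept, now)
	r.history[nodeID] = kept
	return len(kept) >= r.threshold
}

// simulate feeds a fixed sequence of rejections for one node through a
// tracker with a 5 minute window and a threshold of 3.
func simulate() []bool {
	start := time.Date(2022, 7, 12, 18, 0, 0, 0, time.UTC)
	tracker := newRejectionTracker(5*time.Minute, 3)
	offsets := []time.Duration{0, time.Minute, 10 * time.Minute, 11 * time.Minute, 12 * time.Minute}
	results := make([]bool, 0, len(offsets))
	for _, off := range offsets {
		results = append(results, tracker.add("node-1", start.Add(off)))
	}
	return results
}

func main() {
	fmt.Println(simulate())
}
```

In the simulation the first two rejections age out before the later burst, so only the third rejection within the same five minute window reaches the threshold.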
Michael Schurter
f998a2b77b core: merge reserved_ports into host_networks (#13651)
Fixes #13505

This fixes #13505 by treating reserved_ports like we treat a lot of jobspec settings: merging settings from more global stanzas (client.reserved.reserved_ports) "down" into more specific stanzas (client.host_networks[].reserved_ports).

As discussed in #13505 there are other options, and since it's totally broken right now we have some flexibility:

- Treat overlapping reserved_ports on addresses as invalid and refuse to start agents. However, I'm not sure there's a cohesive model we want to publish right now since so much 0.9-0.12 compat code still exists! We would have to explain to folks that if their -network-interface and host_network addresses overlapped, they could only specify reserved_ports in one place or the other?! It gets ugly.
- Use the global client.reserved.reserved_ports value as the default and treat host_network[].reserved_ports as overrides. My first suggestion in the issue, but @groggemans made me realize the addresses on the agent's interface (as configured by -network-interface) may overlap with host_networks, so you'd need to remove the global reserved_ports from addresses shared with a shared network?! This seemed really confusing and subtle for users to me.

So I think "merging down" creates the most expressive yet understandable approach. I've played around with it a bit, and it doesn't seem too surprising. The only frustrating part is how difficult it is to observe the available addresses and ports on a node! However that's a job for another PR.
2022-07-12 14:40:25 -07:00
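The "merging down" behavior can be sketched as a union of the global reserved ports with each host network's own list. This is a simplification made for illustration: `mergeReservedPorts` is a hypothetical function, and ports are modeled as plain integer slices keyed by host network name rather than as the configuration values Nomad actually parses.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeReservedPorts (hypothetical) returns, for each host network, the
// sorted union of the global reserved ports and that network's own ports.
// Global settings flow "down" into the more specific blocks; nothing is
// removed or overridden.
func mergeReservedPorts(global []int, hostNetworks map[string][]int) map[string][]int {
	merged := make(map[string][]int, len(hostNetworks))
	for name, ports := range hostNetworks {
		seen := make(map[int]struct{}, len(global)+len(ports))
		for _, p := range global {
			seen[p] = struct{}{}
		}
		for _, p := range ports {
			seen[p] = struct{}{}
		}
		out := make([]int, 0, len(seen))
		for p := range seen {
			out = append(out, p)
		}
		sort.Ints(out) // deterministic order for display and comparison
		merged[name] = out
	}
	return merged
}

func main() {
	global := []int{22, 80}
	hostNetworks := map[string][]int{
		"public":  {443, 80},
		"private": nil, // no ports of its own; inherits the global ones
	}
	merged := mergeReservedPorts(global, hostNetworks)
	fmt.Println("public:", merged["public"])
	fmt.Println("private:", merged["private"])
}
```

A union keeps the rule easy to state: a port reserved globally is reserved on every host network, and each host network can only add to that list.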
dependabot[bot]
7b55f7a8d0 build(deps): bump github.com/hashicorp/consul/sdk from 0.8.0 to 0.9.0 (#12007)
Bumps [github.com/hashicorp/consul/sdk](https://github.com/hashicorp/consul) from 0.8.0 to 0.9.0.
- [Release notes](https://github.com/hashicorp/consul/releases)
- [Changelog](https://github.com/hashicorp/consul/blob/main/CHANGELOG.md)
- [Commits](https://github.com/hashicorp/consul/compare/v0.8.0...v0.9.0)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/consul/sdk
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-12 12:58:34 -07:00
dependabot[bot]
68cf2dc058 build(deps): bump github.com/docker/go-units from 0.3.3 to 0.4.0 in /api (#11519)
* build(deps): bump github.com/docker/go-units from 0.3.3 to 0.4.0 in /api

Bumps [github.com/docker/go-units](https://github.com/docker/go-units) from 0.3.3 to 0.4.0.
- [Release notes](https://github.com/docker/go-units/releases)
- [Commits](https://github.com/docker/go-units/compare/v0.3.3...v0.4.0)

---
updated-dependencies:
- dependency-name: github.com/docker/go-units
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Tidy go.sum

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2022-07-12 12:54:56 -07:00
Michael Schurter
d5ad965857 deps: run dependabot weekly (#13723) 2022-07-12 12:50:09 -07:00
Charlie Voiselle
b949ee690c SV: CLI: var list command (#13707)
* SV CLI: var list
* Fix wildcard prefix filtering

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2022-07-12 12:49:39 -04:00
Tim Gross
06258c38e9 secure vars: fix enterprise test by upserting the namespace (#13719)
In OSS we can upsert an allocation without worrying about whether that
alloc is in a namespace that actually exists, but in ENT that upsert
will add to the namespace's quotas. Ensure we're doing so in this
secure variables RPC test to fix the test breaking in the ENT repo.
2022-07-12 12:05:52 -04:00
Charlie Voiselle
a56548fad3 SV: fixes for namespace handling (#13705)
* ACL check namespace value in SecureVariable
* Error on wildcard namespace
2022-07-12 11:15:57 -04:00
Seth Hoenig
f8c4ad8cde docs: remove consul 1.12.0 warning 2022-07-12 09:53:17 -05:00
Luiz Aoqui
49a0bc7ddc ci: remove any other versions of Node installed (#13706)
Remove other versions of Node installed in nvm to avoid issues where the
CI runner uses the wrong one.
2022-07-12 10:15:38 -04:00