1168 Commits

Author SHA1 Message Date
Piotr Kazmierczak
4212bfd669 docs: update documentation of namespace delete command (#23536) 2024-07-10 18:31:35 +02:00
Piotr Kazmierczak
d5e1515e80 docker: default to hyper-v isolation on Windows (#23452) 2024-07-01 08:56:43 +02:00
Tim Gross
cd3101d624 scale: add -check-index to job scale command (#23457)
The RPC handler for scaling a job passes flags to enforce the job modify index
is unchanged when it makes the write to Raft. But it's only checking against the
existing job modify index at the time the RPC handler snapshots the state store,
so it can only enforce consistency for its own validation.

In clusters with automated scaling, it would be useful to expose the enforce
index options to the API, so that cluster admins can enforce that scaling only
happens when the job state is consistent with a state they've previously seen in
other API calls. Add this option to the CLI and API and have the RPC handler
check them if asked.
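
The enforce-index behavior described above amounts to a compare-and-set on the job modify index. The following is a minimal Python sketch of that idea, not Nomad's RPC handler: the `Job` class, `scale_job` function, and `IndexConflictError` are hypothetical names for illustration, and only the `-check-index` option itself comes from the commit.

```python
from dataclasses import dataclass
from typing import Optional


class IndexConflictError(Exception):
    """Raised when the caller's expected index no longer matches the job."""


@dataclass
class Job:
    # Hypothetical in-memory stand-in for a job in the state store.
    count: int
    modify_index: int


def scale_job(job: Job, new_count: int, check_index: Optional[int] = None) -> int:
    """Scale the job; if check_index is given, only write when it matches.

    Returns the new modify index after the write.
    """
    if check_index is not None and job.modify_index != check_index:
        # The job changed since the caller last observed it, so refuse to write.
        raise IndexConflictError(
            f"expected index {check_index}, job is at {job.modify_index}"
        )
    job.count = new_count
    job.modify_index += 1  # assumption: every write bumps the index by one
    return job.modify_index
```

In this sketch, a caller that read the job at index 10 and passes `check_index=10` succeeds once; a second attempt with the same stale index raises `IndexConflictError` and leaves the count untouched.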

Fixes: https://github.com/hashicorp/nomad/issues/23444
2024-06-27 16:54:06 -04:00
Piotr Kazmierczak
863d42bc4b docs: upgrade guide updates for backported Docker windows changes (#23453)
Upgrade guide should be uniform across all supported versions, otherwise
backporting breaking changes is tedious.
2024-06-27 19:35:56 +02:00
Piotr Kazmierczak
0ece7b5c16 docker: validate that containers do not run as ContainerAdmin on Windows (#23443)
This enables checks for ContainerAdmin user on docker images on Windows. It's
only checked if users run docker with process isolation and not hyper-v,
because hyper-v provides its own, proper sandboxing.

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2024-06-27 16:22:24 +02:00
Tim Gross
df67e74615 Consul: add preflight checks for Envoy bootstrap (#23381)
Nomad creates Consul ACL tokens and service registrations to support Consul
service mesh workloads, before bootstrapping the Envoy proxy. Nomad always talks
to the local Consul agent and never directly to the Consul servers. But the
local Consul agent talks to the Consul servers in stale consistency mode to
reduce load on the servers. This can result in the Nomad client making the Envoy
bootstrap request with tokens or services that have not yet replicated to the
follower that the local client is connected to. This request gets a 404 on the
ACL token and that negative entry gets cached, preventing any retries from
succeeding.

To work around this, we'll use a method described by our friends over on
`consul-k8s` where after creating the objects in Consul we try to read them from
the local agent in stale consistency mode (which prevents a failed read from
being cached). This cannot completely eliminate this source of error because
it's possible that Consul cluster replication is unhealthy at the time we need
it, but this should make Envoy bootstrap significantly more robust.

This changeset adds preflight checks for the objects we create in Consul:
* We add a preflight check for ACL tokens after we log in via Workload
  Identity and in the function we use to derive tokens in the legacy
  workflow. We do this check early because we also want to use this token for
  registering group services in the allocrunner hooks.
* We add a preflight check for services right before we bootstrap Envoy in the
  taskrunner hook, so that we have time for our service client to batch updates
  to the local Consul agent in addition to the local agent sync.

We've made the timeouts configurable via node metadata rather than the
usual static configuration because for most cases, users should not need to
touch or even know these values are configurable; the configuration is mostly
available for testing.
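
The preflight checks described above follow a read-until-visible pattern with a timeout. The following is a minimal Python sketch of that pattern, not the code in this changeset: `preflight_check`, `read_object`, and the `interval` parameter are hypothetical, and `read_object` stands in for a stale-consistency read against the local Consul agent.

```python
import time
from typing import Callable


def preflight_check(
    read_object: Callable[[], bool],
    timeout: float,
    interval: float = 0.05,
) -> bool:
    """Poll read_object until it reports the object is visible or time runs out.

    read_object is a hypothetical callback that returns True once the token
    or service can be read. Returns True on success, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if read_object():
            return True
        if time.monotonic() >= deadline:
            # Replication may be unhealthy; the caller decides how to proceed.
            return False
        time.sleep(interval)
```

A caller would run this once for the ACL token and once for the service registration, and only start the Envoy bootstrap after both return True. As the commit notes, a timeout remains possible when replication is unhealthy.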


Fixes: https://github.com/hashicorp/nomad/issues/9307
Fixes: https://github.com/hashicorp/nomad/issues/10451
Fixes: https://github.com/hashicorp/nomad/issues/20516

Ref: https://github.com/hashicorp/consul-k8s/pull/887
Ref: https://hashicorp.atlassian.net/browse/NET-10051
Ref: https://hashicorp.atlassian.net/browse/NET-9273
Follow-up: https://hashicorp.atlassian.net/browse/NET-10138
2024-06-27 10:15:37 -04:00
Charlie Voiselle
07516c8159 [docs] Add Sentinel info to version-specific upgrade page (#23173)
The upgrade to sentinel v0.26 is a breaking change, requiring users of
custom Sentinel plugins to rebuild them using sentinel-sdk v4.
2024-06-26 10:46:38 -04:00
Antti
bbdc8b7fa7 docs: add deprecation notice to cron on docs/job-specification/periodic (#23424) 2024-06-24 11:35:20 -04:00
liukch
cc7a5ed7e2 docs: Fix parameter type and default value in client reserved configuration. (#23359) 2024-06-21 16:29:59 -04:00
Heitor de Bittencourt
0588172a19 docs/jobspec: Fix "task" block placement (#23406)
The `task` block should be inside the `group` block. The example in the
page places the `task` block directly under `job`.
2024-06-21 15:21:32 -04:00
James Rasell
26d0a9169c docs: fix typo in alloc exec CLI docs page. (#23392) 2024-06-20 07:50:32 +01:00
scoss
7dcb9fcf76 add exec2 and podman to supported driver list for memory-max resource limit (#23364)
* add exec2 and podman to supported driver list

* tweak exec2 naming

Co-authored-by: David Yu <dyu@hashicorp.com>

---------

Co-authored-by: Seth Hoenig <shoenig@duck.com>
Co-authored-by: David Yu <dyu@hashicorp.com>
2024-06-18 08:26:50 -05:00
David Yu
0cc2ab5ae9 Merge pull request #23322 from hashicorp/david-yu-patch-1
docs: install `consul-cni` manually or via linux packaging
2024-06-14 11:37:46 -07:00
David Yu
36f75c5f3e Update index.mdx 2024-06-14 11:25:23 -07:00
David Yu
b2d29340b6 Update index.mdx
remove LICENSE.txt from unzip
2024-06-14 11:00:49 -07:00
David Yu
be30e130fe Update index.mdx 2024-06-14 10:57:05 -07:00
David Yu
ac2a5a851f Update index.mdx 2024-06-14 10:25:52 -07:00
David Yu
b79d813e7d Update index.mdx 2024-06-14 10:12:34 -07:00
David Yu
dea70a356e Update index.mdx 2024-06-14 10:03:17 -07:00
David Yu
f974381253 Update index.mdx 2024-06-14 09:47:36 -07:00
David Yu
26a30ac908 Update index.mdx 2024-06-14 09:42:56 -07:00
David Yu
947ecd1c77 Update website/content/docs/install/index.mdx
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2024-06-14 09:16:14 -07:00
Piotr Kazmierczak
85430be6dd raw_exec: oom_score_adj support (#23308) 2024-06-14 11:36:27 +02:00
David Yu
fe0e76cc3b Update index.mdx 2024-06-13 20:50:57 -07:00
David Yu
5d9d337727 Update index.mdx 2024-06-13 20:37:21 -07:00
David Yu
a08d6f5768 Update index.mdx 2024-06-13 20:10:31 -07:00
David Yu
92af6280e3 Update service-mesh.mdx 2024-06-13 20:09:53 -07:00
David Yu
92a5257d7b Update v1_8_x.mdx 2024-06-13 20:07:51 -07:00
David Yu
331f96f103 Update index.mdx 2024-06-13 20:06:39 -07:00
David Yu
51ff35bef4 docs: install consul-cni via package 2024-06-13 16:41:26 -07:00
David Yu
94bb91ab80 docs - release notes updates (#23312)
Also updated Consul compatibility matrix
2024-06-13 13:46:42 -04:00
Piotr Kazmierczak
0e8a67f0e1 docker: oom_score_adj support (#23297) 2024-06-12 10:49:20 +02:00
Tim Gross
44078d4786 docs: update configuration docs to include trace-level logging (#23285) 2024-06-11 09:19:52 -04:00
James Rasell
00570d221b docs: update ACL policy example spec to remove plugin write cap. (#23277) 2024-06-11 07:44:27 +01:00
James Rasell
1c976d126e docs: update snapshot inspect CLI detail to mirror recent changes. (#23276) 2024-06-10 14:30:13 +01:00
Tim Gross
34f34440ac build: remove 32-bit ARM builds (#23189)
We no longer intend to release 32-bit builds for any platform. We'd previously
removed the builds for i386 on both Linux and Windows, but never got around to
removing the ARM builds. Add a note about this deprecation in the release notes
for 1.8.x.
2024-06-05 15:47:20 -04:00
Tim Gross
17093d62f0 docs: describe omitted spread behavior and perf impact (#23184)
Update the documentation for the `spread` block:
* Make it clear that the default behavior within a given job when the `spread`
  block is omitted is to spread out allocs among feasible nodes.
* Describe the difference between the `spread` block and `spread` scheduler
  algorithm.
* Add warnings about the performance impact of using `spread` and how to
  mitigate it.
2024-06-05 13:28:09 -04:00
Piotr Kazmierczak
abc6fe325d docs: fix typo in nomad quota utilization metrics (#23185) 2024-06-05 16:20:44 +02:00
Tim Gross
39dee90ad4 docs: clarify node drain behavior for batch workloads (#23170)
Our documentation for the `node drain` command doesn't include a treatment of
batch jobs, which are not migrated. The user is left to piece this behavior
together from the `migrate` documentation and the tutorial. Instead, let's
explicitly list the behaviors per job type.

Fixes: https://github.com/hashicorp/nomad/issues/17563
2024-06-05 08:47:37 -04:00
James Rasell
e73d8bb114 docs: update exec2 install apt/yum commands for pre-release. (#22428) 2024-06-04 14:41:57 +01:00
Piotr Kazmierczak
2a09abc477 metrics: quota utilization configuration and documentation (#22912)
Introduces support for (optional) quota utilization metrics

CE part of the hashicorp/nomad-enterprise#1488 change
2024-06-03 21:06:19 +02:00
Piotr Kazmierczak
307fd590d7 docker: new container_exists_attempts configuration field (#22419)
This allows users to set a custom number of attempts that will be made to purge
an existing (not running) container if one is found during task creation.

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2024-05-30 19:22:14 +02:00
James Rasell
6cb9bed236 docs: add operations benchmarking page with nomad-bench link. (#22393) 2024-05-30 07:34:10 +01:00
Michael Schurter
7048d3a482 link release notes to schedule block 2024-05-29 15:53:15 -07:00
Michael Schurter
a2fe43030c rap 2024-05-29 15:50:33 -07:00
Michael Schurter
5a0c74d1f9 Apply suggestions from code review
Co-authored-by: David Yu <dyu@hashicorp.com>
2024-05-29 15:50:33 -07:00
Michael Schurter
fe0bda9c34 speling 2024-05-29 15:50:33 -07:00
Michael Schurter
690abefc4a docs: add docs for time based task execution 2024-05-29 15:50:33 -07:00
David Yu
f083a27979 Update v1_8_x.mdx 2024-05-29 09:24:35 -07:00
David Yu
6493bc6c86 docs: Nomad 1.8 release notes (#22104) 2024-05-28 08:48:08 -04:00