When we interpolate job fields for the `vault.default_identity.extra_claims`
block, we forgot to use the parent job ID when that's available (as we do for
all other claims). This changeset fixes the bug and adds a helper method that'll
hopefully remind us to do this going forward.
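As a hedged sketch (the claim name and interpolation variables here are assumptions for illustration, not the documented set), a claim such as:
```hcl
extra_claims {
  # With this fix, for a dispatched or periodic instance the interpolated
  # job ID resolves to the parent job's ID, matching the built-in claims.
  nomad_workload_id = "${job.namespace}:${job.id}"
}
```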
Also added a missing changelog entry for #23675 where we implemented the
`extra_claims` block originally, which shipped in Nomad 1.8.3.
Fixes: https://github.com/hashicorp/nomad/issues/23798
Docs: CE-674 Add job status explanation
* add new page for jobs to concepts section
* add job types
* Rename jobs; move in site nav; remove types; reformat; add scaled
* change Jobs to Job on the page
* fix typo
* Apply suggestions from code review
* create UI statuses heading
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
The "Validating content" section doesn't mention that you need the `@hashicorp/platform-content-conformance` package installed to run `npm run content-check` locally.
Nomad clients manage a cpuset cgroup for each task to reserve or share CPU
cores. But Docker owns its own cgroups, and attempting to set a parent cgroup
that Nomad manages runs into conflicts with how runc manages cgroups via
systemd. Therefore Nomad must run as root in order for cpuset management to ever
be compatible with Docker.
However, some users running in unsupported configurations felt that the changes
we made in Nomad 1.7.0 to ensure Nomad was running correctly represented a
regression. This changeset disables cpuset management for non-root Nomad
clients. When running Nomad as non-root, the driver will no longer reconcile
cpusets with Nomad and `resources.cores` will behave incorrectly (but the driver
will still run).
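For illustration, a minimal job asking for reserved cores (the names here are placeholders):
```hcl
job "example" {
  group "app" {
    task "app" {
      driver = "docker"
      resources {
        # With a root Nomad client these cores are reserved via a cpuset
        # cgroup; with a non-root client the task still runs, but the
        # reservation is not enforced.
        cores = 2
      }
    }
  }
}
```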
Although this is one small step along the way to supporting a rootless Nomad
client, running Nomad as non-root is still unsupported. This PR is insufficient
by itself to have a secure and properly-working rootless Nomad client.
Ref: https://github.com/hashicorp/nomad/issues/18211
Ref: https://github.com/hashicorp/nomad/issues/13669
Ref: https://hashicorp.atlassian.net/browse/NET-10652
Ref: https://github.com/opencontainers/runc/blob/main/docs/systemd.md
Although we have `client.allocations` metrics to track allocation states on a
client, separate `client.tasks` metrics let operators spot individual tasks in
an unexpected state inside an otherwise healthy allocation.
Fixes: https://github.com/hashicorp/nomad/issues/23770
The path for a Variable never begins with a leading `/`, because it's stripped
off in the API before it ever gets to the state store. The CLI and UI allow the
leading `/` for convenience, but this can be misleading when it comes to writing
ACL policies. An ACL policy with a path starting with a leading `/` will never
match.
Update the ACL policy parser to reject variable paths with a leading `/`, so an
incorrect path can't make it into the policy.
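For example (the namespace and path below are placeholders), only the first rule can ever match:
```hcl
namespace "default" {
  variables {
    # Matches the variable stored at "nomad/jobs/app".
    path "nomad/jobs/app" {
      capabilities = ["read"]
    }
    # Never matches: stored paths have no leading slash. The parser now
    # rejects this form.
    path "/nomad/jobs/app" {
      capabilities = ["read"]
    }
  }
}
```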
Fixes: https://github.com/hashicorp/nomad/issues/23730
- Pulled common content from multiple pages into new partials
- Refactored install/index to be OS-based so I could add Linux-distro-based instructions to the install-consul-cni-plugins.mdx partial. The tab groups on the install/index page match and change focus as expected.
- Moved CNI overview-type content to networking/index
- Refactored networking/cni to include the install CNI plugins and configuration content (from install/index).
- Moved the CNI plugins explanation in the bridge mode configuration section into bullet points. They had been #### headings, which aren't rendered in the page TOC. I tried to simplify and format the bullet point content to be easier to scan.
Ref: https://hashicorp.atlassian.net/browse/CE-661
Fixes: https://github.com/hashicorp/nomad/issues/23229
Fixes: https://github.com/hashicorp/nomad/issues/23583
On supported platforms, the secrets directory is a 1MiB tmpfs. But some tasks
need larger space for downloading large secrets. This is especially the case for
tasks using `template`, which need extra room to write a temporary file to the
secrets directory that's then atomically renamed over the old file.
This changeset allows increasing the size of the tmpfs in the `resources`
block. Because this is a memory resource, we need to include it in the memory we
allocate for scheduling purposes. The task itself is already prevented from
using more memory in the tmpfs than the `resources.memory` field allows, but
writes made on its behalf via `template` or `artifact` blocks happen outside the
task and can bypass that limit.
Therefore, we need to account for the size of the tmpfs in the allocation
resources. Simply adding it to the memory needed when we create the allocation
allows it to be accounted for in all downstream consumers, and then we'll
subtract that amount from the memory resources just before configuring the task
driver.
For backwards compatibility, the default value of 1MiB is "free" and ignored by
the scheduler. Otherwise we'd be increasing the allocated resources for every
existing alloc, which could cause problems across upgrades. If a user explicitly
sets `resources.secrets = 1` it will no longer be free.
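A sketch of the new field, assuming the value is in MiB to match the 1MiB default described above:
```hcl
task "app" {
  driver = "docker"
  resources {
    memory = 256
    # Grow the secrets tmpfs from the default 1MiB to 16MiB. Because this
    # is set explicitly, it's counted against the allocation's memory for
    # scheduling.
    secrets = 16
  }
}
```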
Fixes: https://github.com/hashicorp/nomad/issues/2481
Ref: https://hashicorp.atlassian.net/browse/NET-10070
In 1.6.0 we shipped the ability to review a job's original HCL in the web UI,
but didn't follow up with an equivalent on the command line. Add an `-hcl` flag
to the `job inspect` command.
Closes: https://github.com/hashicorp/nomad/issues/6778
Although we encourage users to use Vault roles, sometimes they'll want to assign
policies based on entity, pre-creating entities and aliases based on claims.
This lets them use a single default role (or at least a small number of them)
with a templated policy, while keeping an escape hatch from that.
When defining Vault entities the `user_claim` must be unique. When writing Vault
binding rules for use with Nomad workload identities the binding rule won't be
able to create a 1:1 mapping because the selector language allows accessing only
a single field. The `nomad_job_id` claim isn't sufficient to uniquely identify a
job because of namespaces. It's possible to create a JWT auth role with
`bound_claims` to avoid this becoming a security problem, but this doesn't allow
for correct accounting of user claims.
Add support for an `extra_claims` block on the server's `default_identity`
blocks for Vault. This allows a cluster administrator to add a custom claim on
all allocations. The values for these claims are interpolatable with a limited
subset of fields, similar to how we interpolate the task environment.
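A sketch of what this can look like in the server's agent configuration; the claim name and the exact set of interpolatable variables are assumptions here:
```hcl
vault {
  default_identity {
    aud = ["vault.io"]
    ttl = "1h"
    extra_claims {
      # Unique per job across regions and namespaces, so it can serve as
      # the user_claim for a Vault entity alias.
      nomad_workload_id = "${job.region}:${job.namespace}:${job.id}"
    }
  }
}
```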
Fixes: https://github.com/hashicorp/nomad/issues/23510
Ref: https://hashicorp.atlassian.net/browse/NET-10372
Ref: https://hashicorp.atlassian.net/browse/NET-10387
The TLS configuration object includes a deprecated `prefer_server_cipher_suites`
field. In versions of Go prior to 1.17, this property controlled whether a TLS
connection would use the cipher suites preferred by the server or by the
client. This field is ignored as of 1.17 and, according to the `crypto/tls`
docs: "Servers now select the best mutually supported cipher suite based on
logic that takes into account inferred client hardware, server hardware, and
security."
This property has long been deprecated, and leaving it in place may lead to
false assumptions about how cipher suites are negotiated when connecting to a
server, so we want to remove it in Nomad 1.9.0.
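For reference, a minimal agent `tls` block after the removal (a sketch; the file paths are placeholders):
```hcl
tls {
  http      = true
  rpc       = true
  ca_file   = "ca.pem"
  cert_file = "cert.pem"
  key_file  = "key.pem"
  # prefer_server_cipher_suites = true
  # ^ removed in Nomad 1.9.0; ignored since Go 1.17 anyway.
}
```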
Fixes: https://github.com/hashicorp/nomad-enterprise/issues/999
Ref: https://hashicorp.atlassian.net/browse/NET-10531
In Nomad 1.6.0 we started sending the node secret with RPCs that previously did
not include it. We planned to deprecate the older auth workflow but never set a
release for its removal. Removing the legacy support means that nodes running
versions older than 1.6.0 will fail to heartbeat.
Ref: https://hashicorp.atlassian.net/browse/NET-10009
Add a section to the docs describing planned upcoming deprecations and
removals. Also add some upgrade guide sections that were missed during the last
release.
When a root key is rotated, the servers immediately start signing Workload
Identities with the new active key. But workloads may be using those WI tokens
to sign into external services, which may not have had time to fetch the new
public key and may not refetch keys on demand.
Add support for prepublishing keys. Prepublished keys will be visible in the
JWKS endpoint but will not be used for signing or encryption until their
`PublishTime`. Update the periodic key rotation to prepublish keys at half the
`root_key_rotation_threshold` window, and promote prepublished keys to active
after the `PublishTime`.
This changeset also fixes three bugs around periodic root key rotation, garbage
collection, and rekeying; the first two couldn't be safely fixed without
implementing prepublishing:
* Periodic root key rotation would never happen because the default
`root_key_rotation_threshold` of 720h exceeds the 72h maximum window of the FSM
time table. We now compare the `CreateTime` against the wall clock time instead
of the time table. (We expect to remove the time table in future work, ref
https://github.com/hashicorp/nomad/issues/16359)
* Root key garbage collection could GC keys that were used to sign
identities. We now wait until `root_key_rotation_threshold` +
`root_key_gc_threshold` before GC'ing a key.
* When rekeying a root key, the core job did not mark the key as inactive after
the rekey was complete.
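For context, a sketch of the server configuration knobs involved (the values shown are illustrative, not necessarily the defaults):
```hcl
server {
  enabled = true
  # A replacement key is prepublished in the JWKS endpoint at roughly half
  # this window and promoted to active at its PublishTime.
  root_key_rotation_threshold = "720h"
  # A key is GC'd only after rotation threshold + GC threshold has elapsed,
  # so identities signed with it remain verifiable in the meantime.
  root_key_gc_threshold = "1h"
}
```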
Ref: https://hashicorp.atlassian.net/browse/NET-10398
Ref: https://hashicorp.atlassian.net/browse/NET-10280
Fixes: https://github.com/hashicorp/nomad/issues/19669
Fixes: https://github.com/hashicorp/nomad/issues/23528
Fixes: https://github.com/hashicorp/nomad/issues/19368
The documentation for the Docker driver's `SSL` option is misleading: the
option is both deprecated and non-functional in current versions of Docker.
Remove this option from the docs and add a section explaining how to use
insecure registries.
Fixes: https://github.com/hashicorp/nomad/issues/23616
The RPC handler for scaling a job passes flags to enforce that the job modify
index is unchanged when it makes the write to Raft. But it's only checking
against the existing job modify index at the time the RPC handler snapshots the
state store, so it can only enforce consistency for its own validation.
In clusters with automated scaling, it would be useful to expose the
enforce-index options in the API, so that cluster admins can enforce that
scaling only happens when the job state is consistent with a state they've
previously seen in other API calls. Add these options to the CLI and API and
have the RPC handler check them when asked.
Fixes: https://github.com/hashicorp/nomad/issues/23444
This enables checks for the ContainerAdmin user on Docker images on Windows.
The check only runs when Docker uses process isolation rather than Hyper-V,
because Hyper-V provides its own proper sandboxing.
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Nomad creates Consul ACL tokens and service registrations to support Consul
service mesh workloads, before bootstrapping the Envoy proxy. Nomad always talks
to the local Consul agent and never directly to the Consul servers. But the
local Consul agent talks to the Consul servers in stale consistency mode to
reduce load on the servers. This can result in the Nomad client making the Envoy
bootstrap request with tokens or services that have not yet replicated to the
follower that the local Consul agent is connected to. This request gets a 404 on the
ACL token and that negative entry gets cached, preventing any retries from
succeeding.
To work around this, we use a method described by our friends over on
`consul-k8s`: after creating the objects in Consul, we try to read them back
from the local agent in stale consistency mode (which prevents a failed read from
being cached). This cannot completely eliminate this source of error because
it's possible that Consul cluster replication is unhealthy at the time we need
it, but this should make Envoy bootstrap significantly more robust.
This changeset adds preflight checks for the objects we create in Consul:
* We add a preflight check for ACL tokens after we log in via Workload
Identity and in the function we use to derive tokens in the legacy
workflow. We do this check early because we also want to use this token for
registering group services in the allocrunner hooks.
* We add a preflight check for services right before we bootstrap Envoy in the
taskrunner hook, so that we have time for our service client to batch updates
to the local Consul agent in addition to the local agent sync.
The timeouts are configurable via node metadata rather than the usual static
configuration because, in most cases, users should not need to touch or even
know these values exist; the configuration is mostly available for testing.
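A hedged sketch of setting these via client configuration metadata; the key names below are hypothetical placeholders rather than the documented keys:
```hcl
client {
  meta {
    # Hypothetical keys, for illustration only.
    "consul.token_preflight_check.timeout"   = "10s"
    "consul.service_preflight_check.timeout" = "30s"
  }
}
```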
Fixes: https://github.com/hashicorp/nomad/issues/9307
Fixes: https://github.com/hashicorp/nomad/issues/10451
Fixes: https://github.com/hashicorp/nomad/issues/20516
Ref: https://github.com/hashicorp/consul-k8s/pull/887
Ref: https://hashicorp.atlassian.net/browse/NET-10051
Ref: https://hashicorp.atlassian.net/browse/NET-9273
Follow-up: https://hashicorp.atlassian.net/browse/NET-10138