Commit Graph

1113 Commits

Author SHA1 Message Date
Tim Gross
7997a760df docs: improve cross-links for scheduler preemption (#25203)
Fix a broken link from the preemption concepts docs to the relevant API. Also include a link to the relevant command.

Ref: #25038
2025-02-25 08:56:54 -05:00
Tim Gross
4cdfa19b1e volume status: default type to show both DHV and CSI volumes (#25185)
The `-type` option for `volume status` is a UX papercut because for many
clusters there will be only one sort of volume in use. Update the CLI so that
the default behavior is to query CSI and/or DHV.

This behavior is subtly different when the user provides an ID or not. If the
user doesn't provide an ID, we query both CSI and DHV and show both tables. If
the user provides an ID, we query DHV first and then CSI, and show only the
appropriate volume. Because DHV IDs are UUIDs, we're sure we won't have
collisions between the two. We only show errors if both queries return an error.
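
The ID lookup order described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Nomad CLI's code: `query_dhv`, `query_csi`, and `VolumeNotFound` are hypothetical stand-ins for the two API calls and their failure mode.

```python
from typing import Callable


class VolumeNotFound(Exception):
    """Raised by the hypothetical query functions when a lookup fails."""


def find_volume(
    volume_id: str,
    query_dhv: Callable[[str], dict],
    query_csi: Callable[[str], dict],
) -> dict:
    """Try the DHV lookup first, then CSI; raise only if both fail."""
    errors = []
    for kind, query in (("host", query_dhv), ("csi", query_csi)):
        try:
            volume = query(volume_id)
        except VolumeNotFound as err:
            # Remember the error, but keep going: the other type may match.
            errors.append(f"{kind}: {err}")
            continue
        return {"type": kind, **volume}
    # Both lookups failed, so surface both errors together.
    raise VolumeNotFound("; ".join(errors))
```

A single failed lookup is swallowed here, which mirrors the "only show errors if both queries return an error" rule.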

Fixes: https://hashicorp.atlassian.net/browse/NET-12214
2025-02-24 11:38:07 -05:00
James Rasell
32c25d3935 cli: Remove warning notes from Vault and Consul setup commands. (#25153) 2025-02-19 09:18:42 +00:00
Michael Smithhisler
ae21ae54a7 docs: add auth-methods section in acl concepts (#24917)
---------

Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
2025-02-18 12:29:44 -05:00
Tim Gross
dc58f247ed docs: clarify reschedule, migrate, and replacement terminology (#24929)
Our vocabulary around scheduler behaviors outside of the `reschedule` and
`migrate` blocks leaves room for confusion around whether the reschedule tracker
should be propagated between allocations. There are effectively five different
behaviors we need to cover:

* restart: when the tasks of an allocation fail and we try to restart the tasks
  in place.

* reschedule: when the `restart` block runs out of attempts (or the allocation
  fails before tasks even start), and we need to move
  the allocation to another node to try again.

* migrate: when the user has asked to drain a node and we need to move the
  allocations. These are not failures, so we don't want to propagate the
  reschedule tracker.

* replacement: when a node is lost, we don't count that against the `reschedule`
  tracker for the allocations on the node (it's not the allocation's "fault",
  after all). We don't want to run the `migrate` machinery here either, as we
  can't contact the down node. To the scheduler, this is effectively the same as
  if we bumped the `group.count`.

* replacement for `disconnect.replace = true`: this is a replacement, but the
  replacement is intended to be temporary, so we propagate the reschedule tracker.

Add a section to the `reschedule`, `migrate`, and `disconnect` blocks explaining
when each item applies. Update the use of the word "reschedule" in several
places where "replacement" is correct, and vice-versa.
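
The five behaviors listed above can be summarized as a small lookup table. This is a hedged Python sketch for illustration only: the `Behavior` type and field names are hypothetical, and the `reschedule` entry's tracker value is inferred from the purpose of the reschedule tracker rather than stated outright in this message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Behavior:
    trigger: str
    creates_new_allocation: bool
    # None means the question does not apply (no new allocation is created).
    propagates_reschedule_tracker: Optional[bool]


BEHAVIORS = {
    "restart": Behavior("tasks fail and are restarted in place", False, None),
    "reschedule": Behavior(
        "restart attempts exhausted, or failure before tasks start", True, True
    ),
    "migrate": Behavior("user asked to drain the node", True, False),
    "replacement": Behavior("node was lost", True, False),
    "replacement (disconnect.replace = true)": Behavior(
        "temporary replacement for a disconnected allocation", True, True
    ),
}


def carries_tracker(kind: str) -> bool:
    """Return True when the new allocation inherits the reschedule tracker."""
    return bool(BEHAVIORS[kind].propagates_reschedule_tracker)
```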

Fixes: https://github.com/hashicorp/nomad/issues/24918
Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
2025-02-18 09:31:03 -05:00
Paweł Bęza
43885f6854 Allow for in-place update when affinity or spread was changed (#25109)
Similarly to #6732, this removes the affinity and spread checks for in-place updates.
Both affinity and spread should be soft preferences for the Nomad scheduler rather than strict constraints, so modifying them should not trigger job reallocation.

Fixes #25070
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2025-02-14 14:33:18 -05:00
Aimee Ukasick
f1a1ff678c Docs: Clarify Job status mapping on Job page (#25105)
* Add dead (stopped) to status mapping to clarify Stopped

CE-816

* Pull status mapping into partial and include in job status command

* change `complete` to dead in table after discussion with Michael

* added clarifications; add CLI status definitions

* fixed line endings

* fixed typoce816dead
2025-02-14 09:47:11 -06:00
Tim Gross
c2298e0999 Dynamic host volume reference documentation (#24797) 2025-02-13 12:25:58 -05:00
Jorge Marey
25426f0777 fingerprint: add config option to disable dmidecode (#25108) 2025-02-13 11:20:48 -05:00
Aimee Ukasick
35365bc1fb resolve merge conflicts 2025-02-12 11:43:21 -06:00
Aimee Ukasick
8a597a172d Docs SEO: task drivers and plugins; refactor virt section (#24783)
* Docs SEO: task drivers and plugins; refactor virt section

* add redirects for virt driver files

* Some updates. committing rather than stashing

* fix content-check errors

* Remove docs/devices/ and redirect to plugins/devices

* Update docs/drivers descriptions

* Move USB device plugin up a level. Finish descriptions.

* Apply suggestions from Jeff's code review

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>

* Apply title case suggestions from code review

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>

* apply title case suggestions; fix indentation

---------

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
2025-02-10 15:43:02 -06:00
stswidwinski
871585ee90 18529 nomad executes any file in plugins (#18530)
Co-authored-by: James Rasell <jrasell@hashicorp.com>
2025-02-10 16:08:22 +00:00
salehjafarli
a914888c2c docs: Corrected meta keys example from sidecar_service documentation (#25042) 2025-02-07 08:43:13 +00:00
Aimee Ukasick
5bceb3956e DHV Front matter description updates for devdot search (#25022)
* front matter description updates for devdot search; CE-812

* Apply suggestions from code review

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>

---------

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
2025-02-06 09:34:54 -06:00
Robert C. Ewing
824d362226 Fix: inaccurate docs (#25023)
Internally, sizes are always in binary units; this documentation is misleading and implies that they work in decimal units.

Without going through and replacing _every_ "MB" -> "MiB" this is the best way to hint to developers that binary sizes are used.
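
To make the difference concrete, here is a small Python sketch of the arithmetic. The helper name is hypothetical; the only assumption taken from this message is that a size written as "MB" is interpreted in binary units.

```python
MIB = 1024 * 1024  # bytes in one mebibyte (binary unit)
MB = 1000 * 1000   # bytes in one megabyte (decimal unit)


def documented_mb_to_bytes(value: int) -> int:
    """Interpret a size written as "MB" in the docs as binary (MiB)."""
    return value * MIB


binary_bytes = documented_mb_to_bytes(256)  # 268435456
decimal_bytes = 256 * MB                    # 256000000
difference = binary_bytes - decimal_bytes   # 12435456
```

For a value of 256, the binary reading is about 4.9% larger than the decimal one.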
2025-02-04 13:42:13 -06:00
Phil Renaud
9367929d87 [cli] Adds Actions to job status command output (#24959)
* Adds Actions to job status command output

* Adds Actions to job status command output

* Status documentation updated to show actions and formatJobActions no longer cares about pipe delineation
2025-02-04 09:34:49 -05:00
Marcel Johannesmann
ec073d0eab Update acl.mdx (#25013) 2025-02-03 12:52:45 -06:00
Aimee Ukasick
d9bb241b43 Docs SEO: Update runtime, networking, Nomad vs K8s, Nomad Enterprise, upgrading, release notes, and sectionless pages (#24764)
* Docs SEO: Updates

CE-781,782,785,788

* CE-791 single pages

* CE-786 enterprise section

* CE-789 release notes

* fix content-check error

* Update description and add intro body paragraph when appropriate

* fix typo

* Apply suggestions from Jeff's code review

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>

---------

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
2025-02-03 10:03:36 -06:00
Aimee Ukasick
03faedbc69 Docs SEO: Update Concepts for search (#24757)
* Update for search engine optimization

* Update descriptions and add intro body summary paragraph

* Apply suggestions from code review

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>

---------

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
2025-02-03 09:26:51 -06:00
Daniel Bennett
b3ecb69b5a docs: add quota storage metrics (#24998)
and reformat the whole darn table
2025-01-31 16:03:43 -06:00
Michael Smithhisler
47c14ddf28 remove remote task execution code (#24909) 2025-01-29 08:08:34 -05:00
Tim Gross
614e9067ab docs: considerations for stateful workloads updates for DHV (#24930)
We have a document describing the various approaches to storage that surveys the
landscape and makes recommendations based on the user's environment. Add dynamic
host volumes to this document.

Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
2025-01-28 16:33:22 -05:00
Daniel Bennett
5ab51e138f docs: DHV plugin specification (#24924)
and move the CSI page next to new DHV page
under /concepts/plugins/storage/
2025-01-28 16:33:22 -05:00
Tim Gross
ab3ac37bca dynamic host volumes: sentinel documentation (#24825)
Document the new `submit-host-volume` scope for Sentinel policies, which will
support Sentinel enforcement for dynamic host volumes.

Ref: https://github.com/hashicorp/nomad/pull/24797
Ref: https://hashicorp.atlassian.net/browse/NET-11482

Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
2025-01-28 16:33:22 -05:00
Tim Gross
f5292a66c5 docs: update quota specification for new storage block (#24894)
In Nomad 1.10, quotas will use the new `storage` block to specify limits on host
volume and variables storage. Previous PRs have updated the upgrade guide noting
the deprecation of the existing `variables_limit` field.

Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
2025-01-28 16:33:22 -05:00
Tim Gross
afa71d0540 dynamic host volumes: document option for node pool governance (#24826)
For Nomad Enterprise, the namespace specification's node pool configuration can
control access to node pools for dynamic host volumes as well.

Ref: https://github.com/hashicorp/nomad/pull/24797
Ref: https://hashicorp.atlassian.net/browse/NET-11482

Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
2025-01-28 16:33:22 -05:00
Tim Gross
140ecf3cfe fix broken link 2025-01-28 16:33:22 -05:00
Tim Gross
8330d406aa docs: dynamic host volume command line (#24814)
Dynamic host volumes use some of the same commands as CSI volumes but with
different parameters, semantics, and inputs.

Ref: https://github.com/hashicorp/nomad/pull/24797
Ref: https://hashicorp.atlassian.net/browse/NET-11482

Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
2025-01-28 16:33:22 -05:00
Tim Gross
0489c35110 docs: dynamic host volume specification (#24810)
Dynamic host volumes use the same specification file as CSI volumes but require
a different set of parameters and have different semantics. This changeset
splits the volume specification page into separate CSI and dynamic host
volumes spec pages.

While migrating the CSI page, I've also edited it to bring it more in line with
the style guide: removed passive voice and future tense, used inclusive language,
alphabetized the (chaotic!) parameters list, etc.

Ref: https://github.com/hashicorp/nomad/pull/24797
Ref: https://hashicorp.atlassian.net/browse/NET-11482

Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
2025-01-28 16:33:22 -05:00
Tim Gross
a12c0f724e dynamic host volumes: client configuration docs (#24827)
Document the client configuration changes needed to support dynamic host
volumes. This changeset excludes the plugin specification/concepts, which will
be under a separate PR.

Ref: https://github.com/hashicorp/nomad/pull/24797
Ref: https://hashicorp.atlassian.net/browse/NET-11482

Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
2025-01-28 16:33:20 -05:00
Tim Gross
590f50c4fe docs: jobspec updates for dynamic host volumes (#24828)
The job specification's `volume` block can claim either static or dynamic host
volumes. Update the documentation to explain this and cover the additional
fields this exposes.

Ref: https://github.com/hashicorp/nomad/pull/24797
Ref: https://hashicorp.atlassian.net/browse/NET-11482

Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
2025-01-28 16:32:57 -05:00
Tim Gross
b3151a7b02 docs: dynamic host volumes ACL policies (#24790)
Add ACL policy documentation for the dynamic host volumes feature.

Ref: https://github.com/hashicorp/nomad/issues/15489
Ref: https://hashicorp.atlassian.net/browse/NET-11482
2025-01-28 16:32:57 -05:00
Tim Gross
fe2a95f7f6 docs: dynamic host volumes API (#24789)
Add API documentation for the dynamic host volumes feature.

Ref: https://github.com/hashicorp/nomad/issues/15489
Ref: https://hashicorp.atlassian.net/browse/NET-11482
2025-01-28 16:32:57 -05:00
Daniel Bennett
49c147bcd7 dynamic host volumes: change env vars, fixup auto-delete (#24943)
* plugin env: DHV_HOST_PATH->DHV_VOLUMES_DIR
* client config: host_volumes_dir
* plugin env: add namespace+nodepool
* only auto-delete after error saving client state
  on *initial* create
2025-01-27 10:36:53 -06:00
James Rasell
ef32825ede docs: Remove Portworx state workloads link. (#24921)
Portworx website no longer has Nomad related documentation.
2025-01-24 08:45:41 +00:00
Tim Gross
33c68dcc58 docs: clarify workload-associated policy parameters (#24882)
Workload-associated ACL policies can only be set on a specific job within a
namespace, not the namespace as a whole. Clarify the documentation for the CLI
and API.

Fixes: https://github.com/hashicorp/terraform-provider-nomad/issues/500
Ref: https://github.com/hashicorp/terraform-provider-nomad/pull/504
2025-01-17 10:51:33 -05:00
Brian McClain
b4cc5d88e7 docs: update install command for Fedora to match install page (#24870) 2025-01-16 13:39:56 -05:00
James Rasell
689f935e0a services: Support TLS Skip Verify within Nomad service checks. (#24781)
Checks within a service using the Nomad provider can now utilise
the `tls_skip_verify` parameter.
2025-01-15 07:39:39 +00:00
Tim Gross
3a11a0b1e1 quotas: refactor storage limit specification (#24785)
In anticipation of having quotas for dynamic host volumes, we want the user
experience of the storage limits to feel integrated with the other resource
limits. This is currently prevented by reusing the `Resources` type instead of
having a specific type for `QuotaResources`.

Update the quota limit/usage types to use a `QuotaResources` that includes a new
storage resources quota block. The wire formats for the two types are compatible
such that we can migrate the existing variables limit in the FSM.

Also fixes improper parallelism in the quota init test, which changes the working
directory to avoid file write conflicts; this breaks when multiple tests are
executed in the same process.

Ref: https://github.com/hashicorp/nomad-enterprise/pull/2096
2025-01-13 09:25:00 -05:00
Aimee Ukasick
ffb34319d5 Docs SEO: Update Configuration section to improve search (#24759)
* Docs SEO: Update Configuration section to improve search engine opt

CE-775

* Add enterprise only back to audit

* Update descriptions and add intro paragraph

* Fix typo

* replace "below" and "see"

* Apply suggestions from code review

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>

---------

Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
2025-01-10 11:05:23 -06:00
Mitch Pronschinske
b050c73a6d Update who-uses-nomad.mdx (#24815)
* Update who-uses-nomad.mdx

Our new contract with Roblox states that we can't mention anywhere on our sites that they use us.

* Update who-uses-nomad.mdx

Edited the sentence above the companies list to more accurately reflect them.

Also added Target to the list with a link to their case study.
2025-01-09 09:05:26 -06:00
Tim Gross
08a6f870ad cni: use check command when restoring from restart (#24658)
When the Nomad client restarts and restores allocations, the network namespace
for an allocation may exist but no longer be correctly configured. For example,
if the host is rebooted and the task was a Docker task using a pause container,
the network namespace may be recreated by the docker daemon.

When we restore an allocation, use the CNI "check" command to verify that any
existing network namespace matches the expected configuration. This requires CNI
plugins of at least version 1.2.0 to avoid a bug in older plugin versions that
would cause the check to fail.

If the check fails, destroy the network namespace and try to recreate it from
scratch once. If that fails in the second pass, fail the restore so that the
allocation can be recreated (rather than silently having networking fail).

This should fix the gap left by #24650 for Docker task drivers and any other
drivers with the `MustInitiateNetwork` capability.
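
The restore flow described above (check, then destroy and recreate once, then fail) can be sketched as follows. This is an illustrative Python outline, not the client's implementation: `check`, `destroy`, and `create` are hypothetical callables standing in for the CNI check and the namespace teardown and setup steps.

```python
from typing import Callable


class RestoreError(Exception):
    """Signals that the allocation's network could not be restored."""


def restore_network(
    check: Callable[[], bool],
    destroy: Callable[[], None],
    create: Callable[[], bool],
) -> str:
    """Keep a valid namespace; otherwise rebuild it exactly once."""
    if check():
        # Existing namespace matches the expected configuration.
        return "kept"
    # The check failed: tear down and try to recreate from scratch, once.
    destroy()
    if not create():
        # Second pass failed: fail the restore so the allocation is
        # recreated instead of silently running with broken networking.
        raise RestoreError("network namespace could not be recreated")
    return "recreated"
```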

Fixes: https://github.com/hashicorp/nomad/issues/24292
Ref: https://github.com/hashicorp/nomad/pull/24650
2025-01-07 09:38:39 -05:00
Piotr Kazmierczak
0906f788f0 keyring: warn if removing a key that was used for encrypting variables (#24766)
Adds an additional check in the Keyring.Delete RPC to make sure we're not
trying to delete a key that's been used to encrypt a variable. It also adds a
-force flag for the CLI/API to sidestep that check.
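
A rough Python sketch of that guard, under stated assumptions: `delete_key`, `KeyInUseError`, and the `keys_in_use` set are hypothetical names, and the sketch raises an error where the real RPC may instead report a warning.

```python
from typing import Set


class KeyInUseError(Exception):
    """The key has encrypted at least one variable."""


def delete_key(
    key_id: str,
    keyring: Set[str],
    keys_in_use: Set[str],
    force: bool = False,
) -> None:
    """Remove key_id from keyring unless it is in use and force is unset."""
    if key_id in keys_in_use and not force:
        raise KeyInUseError(
            f"key {key_id} was used to encrypt variables; pass force to delete"
        )
    keyring.discard(key_id)
```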
2025-01-07 10:15:02 +01:00
Charles Z.
f7b12dc54e add noswap to secretdir tmpfs (#24645) 2025-01-06 09:44:43 -05:00
Aimee Ukasick
1c12fc59a6 Docs: change stop_after to stop_on_client_after (#24727)
* change stop_after to stop_on_client_after

CE-800  GH https://github.com/hashicorp/nomad/issues/24702

* Move disconnect entry to correct alphabetical place in nav
2024-12-19 13:13:57 -06:00
Aimee Ukasick
8dc4a94b35 Add link to published tutorial (#24712)
CE-801
2024-12-19 12:52:05 -06:00
James Rasell
7d48aa2667 client: emit optional telemetry from prerun and prestart hooks. (#24556)
The Nomad client can now optionally emit telemetry data from the
prerun and prestart hooks. This allows operators to monitor and
alert on failures and time taken to complete.

The new datapoints are:
  - nomad.client.alloc_hook.prerun.success (counter)
  - nomad.client.alloc_hook.prerun.failed (counter)
  - nomad.client.alloc_hook.prerun.elapsed (sample)

  - nomad.client.task_hook.prestart.success (counter)
  - nomad.client.task_hook.prestart.failed (counter)
  - nomad.client.task_hook.prestart.elapsed (sample)

The hook execution time is useful to Nomad engineering and will
help optimize code where possible and understand job specification
impacts on hook performance.

Currently only the PreRun and PreStart hooks have telemetry
enabled, so we limit the number of new metrics being produced.
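
As an illustration of how the three datapoints per hook relate, here is a minimal Python sketch. The `HookMetrics` class is hypothetical and only the metric names come from the list above; in particular, recording the elapsed sample on failure as well as success is an assumption of this sketch.

```python
import time
from collections import Counter
from typing import Callable, Dict, List


class HookMetrics:
    """In-memory stand-in for a telemetry sink (hypothetical)."""

    def __init__(self) -> None:
        self.counters: Counter = Counter()
        self.samples: Dict[str, List[float]] = {}

    def run(self, prefix: str, hook: Callable[[], None]) -> None:
        """Run hook and record <prefix>.success/.failed/.elapsed."""
        start = time.monotonic()
        try:
            hook()
        except Exception:
            self.counters[f"{prefix}.failed"] += 1
            raise
        else:
            self.counters[f"{prefix}.success"] += 1
        finally:
            # Assumption: elapsed time is sampled for both outcomes.
            elapsed = time.monotonic() - start
            self.samples.setdefault(f"{prefix}.elapsed", []).append(elapsed)
```

Calling `HookMetrics().run("nomad.client.alloc_hook.prerun", some_hook)` would increment one of the two counters and append one elapsed sample.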
2024-12-12 14:43:14 +00:00
Aimee Ukasick
af5e2a742e Docs Feature: Add clone and edit feature (#24593)
* Docs: Add clone and edit feature

CE-741

* Change clone and edit heading level

* A few word tweaks
2024-12-05 09:21:27 -06:00
CJ
4563165196 Update sentinel.mdx (#24598) 2024-12-03 11:24:06 -05:00
CJ
b603b97d26 Update security.mdx 2024-12-02 11:43:24 -06:00