When debugging an evaluation, you almost always want to know about all the
related evaluations and what allocations were placed by that evaluation (and
where), not just failed placements. We can enrich the command by adding the
`related` query parameter to the API, and having the command query for the
evaluations allocations automatically. Emit this data as a pair of new tables
and expose fields like quota limits, and previous/next/blocked eval without the
`-verbose` flag.
Update the docs to include the full output and remove references to long-removed
behavior of the `-json` flag.
Ref: https://hashicorp.atlassian.net/browse/NMD-818
Ref: https://go.hashi.co/rfc/nmd-212
No matter the passed region identifier, the CLI was always adding
"<role>.global.nomad" to the certificate DNS names. This is not
what we expect and has been removed.
While here, the long deprecated cluster-region flag has been
removed. This removal only impacts CLI functionality, so is safe
to do.
* Set MaxAllocations in client config
Add NodeAllocationTracker struct to Node struct
Evaluate MaxAllocations in AllocsFit function
Set up cli config parsing
Integrate maxAllocs into AllocatedResources view
Co-authored-by: Tim Gross <tgross@hashicorp.com>
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
This introduces a new HTTP endpoint (and an associated CLI command) for querying
ACL policies associated with a workload identity. It allows users that want
to learn about the ACL capabilities from within WI-tasks to know what sort of
policies are enabled.
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
Nomad Enterprise users operating in air-gapped or otherwise secured environments
don't want to send license reporting metrics directly from their
servers. Implement manual/offline reporting by periodically recording usage
metrics snapshots in the state store, and providing an API and CLI by which
cluster administrators can download the snapshot for review and out-of-band
transmission to HashiCorp.
This is the CE portion of the work required for implemention in the Enterprise
product. Nomad CE does not perform utilization reporting.
Ref: https://github.com/hashicorp/nomad-enterprise/pull/2673
Ref: https://hashicorp.atlassian.net/browse/NMD-68
Ref: https://go.hashi.co/rfc/nmd-210
* Add note to root keyring remove command
This PR updates the documentation for the root keyring remove command to note that the full key ID must be provided for the command to function correctly.
* Move keyID explanation to usage section
---------
Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
* Docs: 1.10 release notes and upgrade factoring
* Update based on code review suggestions
* add CLI for disabling UI URL hints
* fix indentation
* nav: list release notes in reverse order
fix broken link to v1.6.x docs
* Update PKCE section from Daniel's latest PR
* update pkce per daniel's suggestion
* Add dynamic host volumes governance section from blog
* Docs: Fix broken links in main for 1.10 release
* Implement Tim's suggestions
* Remove link to Portworx from ecosystem page
* remove "Portworx" since Portworx 3.2 no longer supports Nomad
Describe the built-in `mkdir` plugin in the plugin concepts docs in a little
more detail. Crosslink to there from the `plugin_id` field docs, and clarify
that the `mkdir` plugin doesn't support the capacity request fields.
Update the example plugins to avoid using volume author controlled variables in
favor of Nomad-controlled ones, to reduce the risk of path traversal, and
explain to plugin authors they'll likely want to avoid this in their own
plugins.
* Basic implementation for server members and node status
* Commands for alloc status and job status
* -ui flag for most commands
* url hints for variables
* url hints for job dispatch, evals, and deployments
* agent config ui.cli_url_links to disable
* Fix an issue where path prefix was presumed for variables
* driver uncomment and general cleanup
* -ui flag on the generic status endpoint
* Job run command gets namespaces, and no longer gets ui hints for --output flag
* Dispatch command hints get a namespace, and bunch o tests
* Lots of tests depend on specific output, so let's not mess with them
* figured out what flagAddress is all about for testServer, oof
* Parallel outside of test instances
* Browser-opening test, sorta
* Env var for disabling/enabling CLI hints
* Addressing a few PR comments
* CLI docs available flags now all have -ui
* PR comments addressed; switched the env var to be consistent and scrunched monitor-adjacent hints a bit more
* ui.Output -> ui.Warn; moves hints from stdout to stderr
* isTerminal check and parseBool on command option
* terminal.IsTerminal check removed for test-runner-not-being-terminal reasons
* docs: Add note to stop allocs to make sure system allocs are not rescheduled
* Update stop.mdx
* Update website/content/docs/commands/alloc/stop.mdx
Co-authored-by: Tim Gross <tgross@hashicorp.com>
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
The legacy workflow for Vault whereby servers were configured
using a token to provide authentication to the Vault API has now
been removed. This change also removes the workflow where servers
were responsible for deriving Vault tokens for Nomad clients.
The deprecated Vault config options used byi the Nomad agent have
all been removed except for "token" which is still in use by the
Vault Transit keyring implementation.
Job specification authors can no longer use the "vault.policies"
parameter and should instead use "vault.role" when not using the
default workload identity.
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
The `-type` option for `volume status` is a UX papercut because for many
clusters there will be only one sort of volume in use. Update the CLI so that
the default behavior is to query CSI and/or DHV.
This behavior is subtly different when the user provides an ID or not. If the
user doesn't provide an ID, we query both CSI and DHV and show both tables. If
the user provides an ID, we query DHV first and then CSI, and show only the
appropriate volume. Because DHV IDs are UUIDs, we're sure we won't have
collisions between the two. We only show errors if both queries return an error.
Fixes: https://hashicorp.atlassian.net/browse/NET-12214
Our vocabulary around scheduler behaviors outside of the `reschedule` and
`migrate` blocks leaves room for confusion around whether the reschedule tracker
should be propagated between allocations. There are effectively five different
behaviors we need to cover:
* restart: when the tasks of an allocation fail and we try to restart the tasks
in place.
* reschedule: when the `restart` block runs out of attempts (or the allocation
fails before tasks even start), and we need to move
the allocation to another node to try again.
* migrate: when the user has asked to drain a node and we need to move the
allocations. These are not failures, so we don't want to propagate the
reschedule tracker.
* replacement: when a node is lost, we don't count that against the `reschedule`
tracker for the allocations on the node (it's not the allocation's "fault",
after all). We don't want to run the `migrate` machinery here here either, as we
can't contact the down node. To the scheduler, this is effectively the same as
if we bumped the `group.count`
* replacement for `disconnect.replace = true`: this is a replacement, but the
replacement is intended to be temporary, so we propagate the reschedule tracker.
Add a section to the `reschedule`, `migrate`, and `disconnect` blocks explaining
when each item applies. Update the use of the word "reschedule" in several
places where "replacement" is correct, and vice-versa.
Fixes: https://github.com/hashicorp/nomad/issues/24918
Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
* Add dead (stopped) to status mapping to clarify Stopped
CE-816
* Pull status mapping into partial and include in job status command
* change `complete` to dead in table after discuss with Michael
* added clarifications; add CLI status definitions
* fixed line endings
* fixed typoce816dead
* Adds Actions to job status command output
* Adds Actions to job status command output
* Status documentation updated to show actions and formatJobActions no longer cares about pipe delineation
Adds an additional check in the Keyring.Delete RPC to make sure we're not
trying to delete a key that's been used to encrypt a variable. It also adds a
-force flag for the CLI/API to sidestep that check.
* Docs: Update CLI job tag unset
CLI help order was wrong, so updating the docs.
* change usage to [options]. Move general options into expanable.
* change "to see" to "for"
* Add language from CLI help to job revert for version|tag
* Add CLI job tag subcommand page
* Add API create delete tag
Examples use same names between CLI and API
* Update CLI revert, tag; API jobs
* Add job version content
* add tag name unique per job to CLI/API; address Phil's feedback
Add partial explaining why tag, add to CLI/API
* Add diff_version to API jobs list job versions
* Apply suggestions from code review
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
* remove tutorial links since not published yet.
---------
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
* TaggedVersion information in structs, rather than job_endpoint (#23841)
* TaggedVersion information in structs, rather than job_endpoint
* Test for taggedVersion description length
* Some API plumbing
* Tag and Untag job versions (#23863)
* Tag and Untag at API level on down, but am I unblocking the wrong thing?
* Code and comment cleanup
* Unset methods generally now I stare long into the namespace abyss
* Namespace passes through with QueryOptions removed from a write requesting struct
* Comment and PR review cleanup
* Version back to VersionStr
* Generally consolidate unset logic into apply for version tagging
* Addressed some PR comments
* Auth check and RPC forwarding
* uint64 instead of pointer for job version after api layer and renamed copy
* job tag command split into apply and unset
* latest-version convenience handling moved to CLI command level
* CLI tests for tagging/untagging
* UI parts removed
* Add to job table when unsetting job tag on latest version
* Vestigial no more
* Compare versions by name and version number with the nomad history command (#23889)
* First pass at passing a tagname and/or diff version to plan/versions requests
* versions API now takes compare_to flags
* Job history command output can have tag names and descriptions
* compare_to to diff-tag and diff-version, plus adding flags to history command
* 0th version now shows a diff if a specific diff target is requested
* Addressing some PR comments
* Simplify the diff-appending part of jobVersions and hide None-type diffs from CLI
* Remove the diff-tag and diff-version parts of nomad job plan, with an eye toward making them a new top-level CLI command soon
* Version diff tests
* re-implement JobVersionByTagName
* Test mods and simplification
* Documentation for nomad job history additions
* Prevent pruning and reaping of TaggedVersion jobs (#23983)
tagged versions should not count against JobTrackedVersions
i.e. new job versions being inserted should not evict tagged versions
and GC should not delete a job if any of its versions are tagged
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
---------
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
* [ui] Version Tags on the job versions page (#24013)
* Timeline styles and their buttons modernized, and tags added
* styled but not yet functional version blocks
* Rough pass at edit/unedit UX
* Styles consolidated
* better UX around version tag crud, plus adapter and serializers
* Mirage and acceptance tests
* Modify percy to not show time-based things
---------
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
* Job revert command and API endpoint can take a string version tag name (#24059)
* Job revert command and API endpoint can take a string version tag name
* RevertOpts as a signature-modified alternative to Revert()
* job revert CLI test
* Version pointers in endpoint tests
* Dont copy over the tag when a job is reverted to a version with a tag
* Convert tag name to version number at CLI level
* Client method for version lookup by tag
* No longer double-declaring client
* [ui] Add tag filter to the job versions page (#24064)
* Rough pass at the UI for version diff dropdown
* Cleanup and diff fetching via adapter method
* TaggedVersion now VersionTag (#24066)
---------
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
In #23977 we moved the keyring into Raft, which can expose key material in Raft
snapshots when using the less-secure AEAD keyring instead of KMS. This changeset
adds tools for redacting this material from snapshots:
* The `operator snapshot state` command gains the ability to display key
metadata (only), which respects the `-filter` option.
* The `operator snapshot save` command gains a `-redact` option that removes key
material from the snapshot after it's downloaded.
* A new `operator snapshot redact` command allows removing key material from an
existing snapshot.
In 1.6.0 we shipped the ability to review the original HCL in the web UI, but
didn't follow-up with an equivalent in the command line. Add a `-hcl` flag to
the `job inspect` command.
Closes: https://github.com/hashicorp/nomad/issues/6778