Adds an additional check in the Keyring.Delete RPC to make sure we're not
trying to delete a key that's been used to encrypt a variable. It also adds a
-force flag for the CLI/API to sidestep that check.
* Docs: Update CLI job tag unset
CLI help order was wrong, so updating the docs.
* change usage to [options]. Move general options into expanable.
* change "to see" to "for"
* Add language from CLI help to job revert for version|tag
* Add CLI job tag subcommand page
* Add API create delete tag
Examples use same names between CLI and API
* Update CLI revert, tag; API jobs
* Add job version content
* add tag name unique per job to CLI/API; address Phil's feedback
Add partial explaining why tag, add to CLI/API
* Add diff_version to API jobs list job versions
* Apply suggestions from code review
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
* remove tutorial links since not published yet.
---------
Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com>
* TaggedVersion information in structs, rather than job_endpoint (#23841)
* TaggedVersion information in structs, rather than job_endpoint
* Test for taggedVersion description length
* Some API plumbing
* Tag and Untag job versions (#23863)
* Tag and Untag at API level on down, but am I unblocking the wrong thing?
* Code and comment cleanup
* Unset methods generally now I stare long into the namespace abyss
* Namespace passes through with QueryOptions removed from a write requesting struct
* Comment and PR review cleanup
* Version back to VersionStr
* Generally consolidate unset logic into apply for version tagging
* Addressed some PR comments
* Auth check and RPC forwarding
* uint64 instead of pointer for job version after api layer and renamed copy
* job tag command split into apply and unset
* latest-version convenience handling moved to CLI command level
* CLI tests for tagging/untagging
* UI parts removed
* Add to job table when unsetting job tag on latest version
* Vestigial no more
* Compare versions by name and version number with the nomad history command (#23889)
* First pass at passing a tagname and/or diff version to plan/versions requests
* versions API now takes compare_to flags
* Job history command output can have tag names and descriptions
* compare_to to diff-tag and diff-version, plus adding flags to history command
* 0th version now shows a diff if a specific diff target is requested
* Addressing some PR comments
* Simplify the diff-appending part of jobVersions and hide None-type diffs from CLI
* Remove the diff-tag and diff-version parts of nomad job plan, with an eye toward making them a new top-level CLI command soon
* Version diff tests
* re-implement JobVersionByTagName
* Test mods and simplification
* Documentation for nomad job history additions
* Prevent pruning and reaping of TaggedVersion jobs (#23983)
tagged versions should not count against JobTrackedVersions
i.e. new job versions being inserted should not evict tagged versions
and GC should not delete a job if any of its versions are tagged
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
---------
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
* [ui] Version Tags on the job versions page (#24013)
* Timeline styles and their buttons modernized, and tags added
* styled but not yet functional version blocks
* Rough pass at edit/unedit UX
* Styles consolidated
* better UX around version tag crud, plus adapter and serializers
* Mirage and acceptance tests
* Modify percy to not show time-based things
---------
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
* Job revert command and API endpoint can take a string version tag name (#24059)
* Job revert command and API endpoint can take a string version tag name
* RevertOpts as a signature-modified alternative to Revert()
* job revert CLI test
* Version pointers in endpoint tests
* Dont copy over the tag when a job is reverted to a version with a tag
* Convert tag name to version number at CLI level
* Client method for version lookup by tag
* No longer double-declaring client
* [ui] Add tag filter to the job versions page (#24064)
* Rough pass at the UI for version diff dropdown
* Cleanup and diff fetching via adapter method
* TaggedVersion now VersionTag (#24066)
---------
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
In #23977 we moved the keyring into Raft, which can expose key material in Raft
snapshots when using the less-secure AEAD keyring instead of KMS. This changeset
adds tools for redacting this material from snapshots:
* The `operator snapshot state` command gains the ability to display key
metadata (only), which respects the `-filter` option.
* The `operator snapshot save` command gains a `-redact` option that removes key
material from the snapshot after it's downloaded.
* A new `operator snapshot redact` command allows removing key material from an
existing snapshot.
In 1.6.0 we shipped the ability to review the original HCL in the web UI, but
didn't follow-up with an equivalent in the command line. Add a `-hcl` flag to
the `job inspect` command.
Closes: https://github.com/hashicorp/nomad/issues/6778
When a root key is rotated, the servers immediately start signing Workload
Identities with the new active key. But workloads may be using those WI tokens
to sign into external services, which may not have had time to fetch the new
public key and which might try to fetch new keys as needed.
Add support for prepublishing keys. Prepublished keys will be visible in the
JWKS endpoint but will not be used for signing or encryption until their
`PublishTime`. Update the periodic key rotation to prepublish keys at half the
`root_key_rotation_threshold` window, and promote prepublished keys to active
after the `PublishTime`.
This changeset also fixes two bugs in periodic root key rotation and garbage
collection, both of which can't be safely fixed without implementing
prepublishing:
* Periodic root key rotation would never happen because the default
`root_key_rotation_threshold` of 720h exceeds the 72h maximum window of the FSM
time table. We now compare the `CreateTime` against the wall clock time instead
of the time table. (We expect to remove the time table in future work, ref
https://github.com/hashicorp/nomad/issues/16359)
* Root key garbage collection could GC keys that were used to sign
identities. We now wait until `root_key_rotation_threshold` +
`root_key_gc_threshold` before GC'ing a key.
* When rekeying a root key, the core job did not mark the key as inactive after
the rekey was complete.
Ref: https://hashicorp.atlassian.net/browse/NET-10398
Ref: https://hashicorp.atlassian.net/browse/NET-10280
Fixes: https://github.com/hashicorp/nomad/issues/19669
Fixes: https://github.com/hashicorp/nomad/issues/23528
Fixes: https://github.com/hashicorp/nomad/issues/19368
The RPC handler for scaling a job passes flags to enforce the job modify index
is unchanged when it makes the write to Raft. But its only checking against the
existing job modify index at the time the RPC handler snapshots the state store,
so it can only enforce consistency for its own validation.
In clusters with automated scaling, it would be useful to expose the enforce
index options to the API, so that cluster admins can enforce that scaling only
happens when the job state is consistent with a state they've previously seen in
other API calls. Add this option to the CLI and API and have the RPC handler
check them if asked.
Fixes: https://github.com/hashicorp/nomad/issues/23444
Our documentation for the `node drain` command doesn't include a treatment of
batch jobs, which are not migrated. The user is left to piece this behavior
together from the `migrate` documentation and the tutorial. Instead, let's
explicitly list the behaviors per job type.
Fixes: https://github.com/hashicorp/nomad/issues/17563
When setting up auth methods for Consul and Vault in production environments, we
can typically assume that the CA certificate for the JWKS endpoint will be in
the host certificate store (as part of the usual configuration management
cluster admins needs to do). But for quick demos with `-dev` agents, this won't
be the case.
Add a `-jwks-ca-file` parameter to the setup commands so that we can use this
tool to quickly setup WI with `-dev` agents running TLS.
The `nomad operator debug` command saves a CPU profile for each interval, and
names these files based on the interval.
The same functions takes a goroutine profile, heap profile, etc. but is missing
the logic to interpolate the file name with the interval. This results in the
operator debug command making potentially many expensive profile requests, and
then overwriting the data. Update the command to save every profile it scrapes,
and number them similarly to the existing CPU profile.
Additionally, the command flags for `-pprof-interval` and `-pprof-duration` were
validated backwards, which meant that we always coerced the `-pprof-interval` to
be the same as the `-pprof-duration`, which always resulted in a single profile
being taken at the start of the bundle. Correct the check as well as change the
defaults to be more sensible.
Fixes: https://github.com/hashicorp/nomad/issues/20151
Our documentation has a hidden assumption that users know that federation
replication requires ACLs to be enabled and bootstrapped. Add notes at some of
the places users are likely to look for it.
A separate follow-up PR to the federation tutorial should point to the ACL
multi-region tutorial as well.
Fixes: https://github.com/hashicorp/nomad/issues/20128
The new `nomad setup vault -check` commmand can be used to retrieve
information about the changes required before a cluster is migrated from
the deprecated legacy authentication flow with Vault to use only
workload identities.
The `operator snapshot` commands and agent don't back up Nomad's key
material. Add some warnings about this to places where users might be looking
for information on cluster recovery.
Fixes: https://github.com/hashicorp/nomad/issues/19389
The `-dev-consul` and `-dev-vault` flags add default identities and
configuration to the Nomad agent to connect and use the workload
identity integration with Consul and Vault.
* API command and jobspec docs
* PR comments addressed
* API docs for job/jobid/action socket
* Removing a perhaps incorrect origin of job_id across the jobs api doc
* PR comments addressed
The `-reschedule` flag stops allocations and assumes the Nomad scheduler
will create new allocations to replace them. But this is only true for
service and batch jobs.
Restarting non-service jobs with the `-reschedule` flag causes the
command to loop forever waiting for the allocations to be replaced,
which never happens.
Allocations for system jobs may be replaced by triggering an evaluation
after each stop to cause the reconciler to run again.
Sysbatch jobs should not be allowed to be rescheduled as they are never
replaced by the scheduler.