* TaggedVersion information in structs, rather than job_endpoint (#23841)
* TaggedVersion information in structs, rather than job_endpoint
* Test for taggedVersion description length
* Some API plumbing
* Tag and Untag job versions (#23863)
* Tag and Untag at API level on down, but am I unblocking the wrong thing?
* Code and comment cleanup
* Unset methods generally now I stare long into the namespace abyss
* Namespace passes through with QueryOptions removed from a write requesting struct
* Comment and PR review cleanup
* Version back to VersionStr
* Generally consolidate unset logic into apply for version tagging
* Addressed some PR comments
* Auth check and RPC forwarding
* uint64 instead of pointer for job version after api layer and renamed copy
* job tag command split into apply and unset
* latest-version convenience handling moved to CLI command level
* CLI tests for tagging/untagging
* UI parts removed
* Add to job table when unsetting job tag on latest version
* Vestigial no more
* Compare versions by name and version number with the nomad history command (#23889)
* First pass at passing a tagname and/or diff version to plan/versions requests
* versions API now takes compare_to flags
* Job history command output can have tag names and descriptions
* compare_to to diff-tag and diff-version, plus adding flags to history command
* 0th version now shows a diff if a specific diff target is requested
* Addressing some PR comments
* Simplify the diff-appending part of jobVersions and hide None-type diffs from CLI
* Remove the diff-tag and diff-version parts of nomad job plan, with an eye toward making them a new top-level CLI command soon
* Version diff tests
* re-implement JobVersionByTagName
* Test mods and simplification
* Documentation for nomad job history additions
* Prevent pruning and reaping of TaggedVersion jobs (#23983)
tagged versions should not count against JobTrackedVersions
i.e. new job versions being inserted should not evict tagged versions
and GC should not delete a job if any of its versions are tagged
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
---------
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
* [ui] Version Tags on the job versions page (#24013)
* Timeline styles and their buttons modernized, and tags added
* styled but not yet functional version blocks
* Rough pass at edit/unedit UX
* Styles consolidated
* better UX around version tag crud, plus adapter and serializers
* Mirage and acceptance tests
* Modify percy to not show time-based things
---------
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
* Job revert command and API endpoint can take a string version tag name (#24059)
* Job revert command and API endpoint can take a string version tag name
* RevertOpts as a signature-modified alternative to Revert()
* job revert CLI test
* Version pointers in endpoint tests
* Dont copy over the tag when a job is reverted to a version with a tag
* Convert tag name to version number at CLI level
* Client method for version lookup by tag
* No longer double-declaring client
* [ui] Add tag filter to the job versions page (#24064)
* Rough pass at the UI for version diff dropdown
* Cleanup and diff fetching via adapter method
* TaggedVersion now VersionTag (#24066)
---------
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
In #18925 we added a `-json` flag to the `job status` command, but the argument
handling had a bug where it would always set the `-json` flag if either the `-t`
or `-json` flags were set, resulting in a misleading error. Instead, pass the
`-json` flag value into the formatter.
Fixes: https://github.com/hashicorp/nomad/issues/24050
When we introduced change_mode=script to templates, we passed the driver handle
down into the template manager so we could call its `Exec` method directly. But
the lifecycle of the driver handle is managed by the taskrunner and isn't
available when the template manager is first created. This has led to a series
of patches trying to fixup the behavior (#15915, #15192, #23663, #23917). Part
of the challenge in getting this right is using an interface to avoid the
circular import of the driver handle.
But the taskrunner already has a way to deal with this problem using a "lazy
handle". The other template change modes already use this indirectly through the
`Lifecycle` interface. Change the driver handle `Exec` call in the template
manager to a new `Lifecycle.Exec` call that reuses the existing behavior. This
eliminates the need for the template manager to know anything at all about the
handle state.
Fixes: https://github.com/hashicorp/nomad/issues/24051
The main point of this dependency upgrade is to pull in the fixes in
hashicorp/yamux#127 which prevents leaking deadlocked goroutines. It has
been observed to improve the issue in hashicorp/nomad#23305 but does not
seem sufficient to fix it entirely.
Since touching yamux is a rare and scary event, I do **not** intend to
backport this. If we discover the improvements are stable and
significant enough, or if further fixes land in yamux, backporting can
be done at that time.
Fixes#20335
The major change between Raft v1.6 -> v1.7 was the introduction of the
Prevote feature. Before Prevote, when a partitioned node rejoins a
cluster it may cause an election even if the cluster was stable. Prevote
can avoid this useless election so reintroducing partitioned servers to
an otherwise stable cluster becomes seamless.
Full details: https://github.com/hashicorp/raft/pull/530
In #20335 we discussed whether or not to add a configuration option to
disable prevote in case bugs were discovered. While bugs have been found
(hence the v1.7.1 version as opposed to v1.7.0), I'm choosing to follow
Vault's lead of straightfordwardly bumping the raft dependency:
hashicorp/vault#27605 and hashicorp/vault#28218
In #23977 we moved the keyring into Raft, which can expose key material in Raft
snapshots when using the less-secure AEAD keyring instead of KMS. This changeset
adds tools for redacting this material from snapshots:
* The `operator snapshot state` command gains the ability to display key
metadata (only), which respects the `-filter` option.
* The `operator snapshot save` command gains a `-redact` option that removes key
material from the snapshot after it's downloaded.
* A new `operator snapshot redact` command allows removing key material from an
existing snapshot.
When we start the Consul agent in the `consulcompat` test package, we check that
the version matches the version we expect. But Consul agents may omit non-core
parts of the version string (ex. `1.20.0-rc1` displays `1.20.0`). Compare only
the core portions of the version string.
In #23977 we merged a change to how the keyring was stored. Because keyring
initialization takes slightly longer now, this uncovered existing timing bugs in
some of our tests where tests that require the keyring (ex. plan applier tests)
were waiting for the leader but not the keyring initialization. Fix some of the
examples we've seen cause test flakes.
* update changelog for 1.8.4 release
* changelog: add 1.8.4 backport changelog notes
I botched the changelog bits of the checklist, adding the backport notes
to the CE changelog now.
In Nomad 1.4, we implemented a root keyring to support encrypting Variables and
signing Workload Identities. The keyring was originally stored with the
AEAD-wrapped DEKs and the KEK together in a JSON keystore file on disk. We
recently added support for using an external KMS for the KEK to improve the
security model for the keyring. But we've encountered multiple instances of the
keystore files not getting backed up separately from the Raft snapshot,
resulting in failure to restore clusters from backup.
Move Nomad's root keyring into Raft (encrypted with a KMS/Vault where available)
in order to eliminate operational problems with the separate on-disk keystore.
Fixes: https://github.com/hashicorp/nomad/issues/23665
Ref: https://hashicorp.atlassian.net/browse/NET-10523
We're releasing the beta for Nomad 1.9.0 shortly. Bumping the base version now
will make it easier to test out new features that require a version
check. Builds from `main` will show as `1.9.0-dev`.
Resolve scan job runner
Resolve linting alerts
adding EOF on files
adding EOF on gitignore too
add hclfmt and bump action versions
update scan.hcl comments
Co-authored-by: Tim Gross <tgross@hashicorp.com>
fix typo
move scan.hcl file and paths-ignore for scans
change action runner
use org secret to checkout
typo
change runner
use hashicorp/setup-golang@v3
Co-authored-by: Tim Gross <tgross@hashicorp.com>
pin the github action sha
so more than one copy of a program can run
at a time on the same port with SO_REUSEPORT.
requires host network mode.
some task drivers (like docker) may also need
config {
network_mode = "host"
}
but this is not validated prior to placement.
The landlock fingerprint test assumes there's no version of the landlock API
>3. Update the test assertion to allow for the current v4 and any future
versions.
While working on #23655 I found there were a few places in the encrypter/keyring
where we could make modest improvements to performance and reliability of the
existing code.
This changeset allows keyring replication to skip trying to replicate from
itself, switches some of the read-only keyring accesses to use the read lock
instead of a r/w lock, fixes the logging configuration to drop spurious "extra
value" warnings in the logs, drops an unused type, and makes a minor refactoring
to eliminate shadowing of the `keyset` type. Pulling this out to its own PR lets
us backport these changes to the LTS and reduces the size of the PR that
implements #23665.
Ref https://github.com/hashicorp/nomad/issues/23665
when a CNI result includes an IPv6 address,
set it on the alloc's NetworkStatus for reference.
e.g.:
$ nomad alloc status -json 3dca | jq '.NetworkStatus'
{
"Address": "172.26.64.14",
"AddressIPv6": "fd00:a110:c8::b",
"DNS": null,
"InterfaceName": "eth0"
}