In #18925 we added a `-json` flag to the `job status` command, but the argument
handling had a bug where it would always set the `-json` flag if either the `-t`
or `-json` flags were set, resulting in a misleading error. Instead, pass the
`-json` flag value into the formatter.
Fixes: https://github.com/hashicorp/nomad/issues/24050
When we introduced change_mode=script to templates, we passed the driver handle
down into the template manager so we could call its `Exec` method directly. But
the lifecycle of the driver handle is managed by the taskrunner and isn't
available when the template manager is first created. This has led to a series
of patches trying to fixup the behavior (#15915, #15192, #23663, #23917). Part
of the challenge in getting this right is using an interface to avoid the
circular import of the driver handle.
But the taskrunner already has a way to deal with this problem using a "lazy
handle". The other template change modes already use this indirectly through the
`Lifecycle` interface. Change the driver handle `Exec` call in the template
manager to a new `Lifecycle.Exec` call that reuses the existing behavior. This
eliminates the need for the template manager to know anything at all about the
handle state.
Fixes: https://github.com/hashicorp/nomad/issues/24051
The main point of this dependency upgrade is to pull in the fixes in
hashicorp/yamux#127 which prevents leaking deadlocked goroutines. It has
been observed to improve the issue in hashicorp/nomad#23305 but does not
seem sufficient to fix it entirely.
Since touching yamux is a rare and scary event, I do **not** intend to
backport this. If we discover the improvements are stable and
significant enough, or if further fixes land in yamux, backporting can
be done at that time.
Fixes#20335
The major change between Raft v1.6 -> v1.7 was the introduction of the
Prevote feature. Before Prevote, when a partitioned node rejoins a
cluster it may cause an election even if the cluster was stable. Prevote
can avoid this useless election so reintroducing partitioned servers to
an otherwise stable cluster becomes seamless.
Full details: https://github.com/hashicorp/raft/pull/530
In #20335 we discussed whether or not to add a configuration option to
disable prevote in case bugs were discovered. While bugs have been found
(hence the v1.7.1 version as opposed to v1.7.0), I'm choosing to follow
Vault's lead of straightfordwardly bumping the raft dependency:
hashicorp/vault#27605 and hashicorp/vault#28218
In #23977 we moved the keyring into Raft, which can expose key material in Raft
snapshots when using the less-secure AEAD keyring instead of KMS. This changeset
adds tools for redacting this material from snapshots:
* The `operator snapshot state` command gains the ability to display key
metadata (only), which respects the `-filter` option.
* The `operator snapshot save` command gains a `-redact` option that removes key
material from the snapshot after it's downloaded.
* A new `operator snapshot redact` command allows removing key material from an
existing snapshot.
When we start the Consul agent in the `consulcompat` test package, we check that
the version matches the version we expect. But Consul agents may omit non-core
parts of the version string (ex. `1.20.0-rc1` displays `1.20.0`). Compare only
the core portions of the version string.
In #23977 we merged a change to how the keyring was stored. Because keyring
initialization takes slightly longer now, this uncovered existing timing bugs in
some of our tests where tests that require the keyring (ex. plan applier tests)
were waiting for the leader but not the keyring initialization. Fix some of the
examples we've seen cause test flakes.
* update changelog for 1.8.4 release
* changelog: add 1.8.4 backport changelog notes
I botched the changelog bits of the checklist, adding the backport notes
to the CE changelog now.
In Nomad 1.4, we implemented a root keyring to support encrypting Variables and
signing Workload Identities. The keyring was originally stored with the
AEAD-wrapped DEKs and the KEK together in a JSON keystore file on disk. We
recently added support for using an external KMS for the KEK to improve the
security model for the keyring. But we've encountered multiple instances of the
keystore files not getting backed up separately from the Raft snapshot,
resulting in failure to restore clusters from backup.
Move Nomad's root keyring into Raft (encrypted with a KMS/Vault where available)
in order to eliminate operational problems with the separate on-disk keystore.
Fixes: https://github.com/hashicorp/nomad/issues/23665
Ref: https://hashicorp.atlassian.net/browse/NET-10523
We're releasing the beta for Nomad 1.9.0 shortly. Bumping the base version now
will make it easier to test out new features that require a version
check. Builds from `main` will show as `1.9.0-dev`.
Resolve scan job runner
Resolve linting alerts
adding EOF on files
adding EOF on gitignore too
add hclfmt and bump action versions
update scan.hcl comments
Co-authored-by: Tim Gross <tgross@hashicorp.com>
fix typo
move scan.hcl file and paths-ignore for scans
change action runner
use org secret to checkout
typo
change runner
use hashicorp/setup-golang@v3
Co-authored-by: Tim Gross <tgross@hashicorp.com>
pin the github action sha
so more than one copy of a program can run
at a time on the same port with SO_REUSEPORT.
requires host network mode.
some task drivers (like docker) may also need
config {
network_mode = "host"
}
but this is not validated prior to placement.
The landlock fingerprint test assumes there's no version of the landlock API
>3. Update the test assertion to allow for the current v4 and any future
versions.
While working on #23655 I found there were a few places in the encrypter/keyring
where we could make modest improvements to performance and reliability of the
existing code.
This changeset allows keyring replication to skip trying to replicate from
itself, switches some of the read-only keyring accesses to use the read lock
instead of a r/w lock, fixes the logging configuration to drop spurious "extra
value" warnings in the logs, drops an unused type, and makes a minor refactoring
to eliminate shadowing of the `keyset` type. Pulling this out to its own PR lets
us backport these changes to the LTS and reduces the size of the PR that
implements #23665.
Ref https://github.com/hashicorp/nomad/issues/23665
when a CNI result includes an IPv6 address,
set it on the alloc's NetworkStatus for reference.
e.g.:
$ nomad alloc status -json 3dca | jq '.NetworkStatus'
{
"Address": "172.26.64.14",
"AddressIPv6": "fd00:a110:c8::b",
"DNS": null,
"InterfaceName": "eth0"
}
The documentation is referring to a `file` attribute that does not exist on the `vault` block.
This PR changes those references to mention the `disable_file` attribute instead.