This introduces a Docker image based on ubuntu:bionic that can be used to
compile the Nomad binary against glibc 2.27.
The image cannot build JS assets, which must be created before we compile the
Go binary.
* Use core ID when selecting cores
If the available cores are not a contiguous set, the core selector might
panic when trying to select cores.
For example, consider a scenario where the available cores for the selector are the following:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47]
This list contains 46 cores, because cores with IDs 0 and 24 are not
included in the list.
Before this patch, if we requested 46 cores, the selector would panic
trying to access the item with index 46 in `cs.topology.Cores`.
This patch changes the selector to use the core ID instead when looking
for a core inside `cs.topology.Cores`. This prevents an out of bounds
access that was causing the panic.
Note: This patch keeps the change deliberately small. A better
long-term solution might be to restructure the `numalib.Topology.Cores`
field to be a `map[ID]Core`, but that is a much larger change that is
more difficult to land. Also, the number of cores in our case is
small (at most 192), so a linear search won't have any noticeable impact.
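A minimal sketch of the lookup-by-ID idea described above. The `Core` type and both function names are hypothetical stand-ins for illustration, not the actual `numalib` types or the selector code:

```go
package main

// Core is a hypothetical, simplified stand-in for an entry in the topology.
type Core struct {
	ID uint16
}

// findCoreByID does a linear search for the core with the given ID.
// It never uses the ID as a slice index, so gaps in the ID range
// (for example, missing cores 0 and 24) cannot cause an out-of-bounds access.
func findCoreByID(cores []Core, id uint16) (Core, bool) {
	for _, c := range cores {
		if c.ID == id {
			return c, true
		}
	}
	return Core{}, false
}

// selectCores returns the cores matching the requested IDs, silently
// skipping IDs that are not present (an assumption made for this sketch).
func selectCores(cores []Core, ids []uint16) []Core {
	selected := make([]Core, 0, len(ids))
	for _, id := range ids {
		if c, ok := findCoreByID(cores, id); ok {
			selected = append(selected, c)
		}
	}
	return selected
}
```

With the 46-core list from the example, looking up ID 47 succeeds, whereas indexing a 46-element slice at position 46 or 47 would panic.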
* Add changelog entry
* Build list of IDs inline
In #25496 we introduced the ability to have `task.user` set on Windows, so
long as the user ID fits a particular shape. But this uncovered a 7-year-old bug
in the `java` driver introduced in #5143, where we set the `task.user` to the
non-existent Unix user `nobody`, even if we're running on Windows.
Prior to the change in #25496 we always ignored the `task.user`, so this was not
a problem. We don't set the `task.user` in the `raw_exec` driver, and the
otherwise very similar `exec` driver is Linux-only, so we never see the problem
there.
Fix the bug in the `java` driver by gating the change to the `task.user` on not
being Windows. Also add a check to the new code path that the user is non-empty
before parsing it, so that any third party drivers that might be borrowing the
executor code don't hit the same problem on Windows.
Ref: https://github.com/hashicorp/nomad/pull/5143
Ref: https://github.com/hashicorp/nomad/pull/25496
Fixes: https://github.com/hashicorp/nomad/issues/25638
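The two guards described above could be sketched roughly as follows. Both function names are hypothetical, the OS is passed in as a string (for example `runtime.GOOS`) purely to keep the sketch testable, and the assumption that `nobody` is only applied when no user is configured is mine, not confirmed by the driver code:

```go
package main

// defaultJavaTaskUser returns the user a task should run as.
// Assumption: the "nobody" default only applies when no user is configured.
func defaultJavaTaskUser(goos, configured string) string {
	if configured != "" {
		return configured
	}
	if goos == "windows" {
		// There is no Unix "nobody" user on Windows, so leave it unset.
		return ""
	}
	return "nobody"
}

// shouldParseUser reports whether the executor should attempt to parse
// the user at all. An empty user is skipped so that callers that never
// set one do not fail on Windows.
func shouldParseUser(user string) bool {
	return user != ""
}
```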
* Add note to root keyring remove command
This PR updates the documentation for the root keyring remove command to note that the full key ID must be provided for the command to function correctly.
* Move keyID explanation to usage section
---------
Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
* Docs: 1.10 release notes and upgrade factoring
* Update based on code review suggestions
* add CLI for disabling UI URL hints
* fix indentation
* nav: list release notes in reverse order
fix broken link to v1.6.x docs
* Update PKCE section from Daniel's latest PR
* update pkce per daniel's suggestion
* Add dynamic host volumes governance section from blog
The `ui.enabled` parameter is a non-pointer bool, which means the
merge function is unable to differentiate between false and not
set. When e2e introduced the `ui.show_cli_hints` configuration
parameter, the way we merge meant the unset `ui.enabled` was treated
as false and the UI became disabled.
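To illustrate why a plain bool cannot express "not set" during a merge, here is a small sketch. The `UIConfig` struct and its `Merge` method are simplified, hypothetical versions written for this example and do not reproduce Nomad's actual config code:

```go
package main

// UIConfig is a hypothetical, simplified config block.
type UIConfig struct {
	Enabled      bool  // non-pointer: false and "not set" look identical
	ShowCLIHints *bool // pointer: nil means "not set"
}

// Merge overlays other on top of c and returns the result.
func (c UIConfig) Merge(other UIConfig) UIConfig {
	result := c
	// A plain bool carries no "unset" state, so the overlay always wins,
	// even when the overlay never mentioned Enabled.
	result.Enabled = other.Enabled
	// A pointer lets us skip fields the overlay did not set.
	if other.ShowCLIHints != nil {
		result.ShowCLIHints = other.ShowCLIHints
	}
	return result
}
```

Merging a base of `UIConfig{Enabled: true}` with an overlay that only sets `ShowCLIHints` yields `Enabled == false` in this sketch, which is the kind of accidental disabling described above.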
I couldn't find any reason the exec2 HTTP jobs were not being run
with a generated cleanup function, so I added this.
The deletion of the DHV ACL policy does not seem like it would
have any negative impact.
* allow for newline flexibility in client assertion key/cert
* if client assertion, don't send the client secret,
but do keep the client secret in both places in state
(on the parent Config, and within the OIDCClientAssertion)
mainly so that it shows up as "redacted" instead of empty
when inspecting the auth method config via API.
trying not to violate the principle of least astonishment.
we want to only auto-enable PKCE on *new* auth methods,
rather than *new or updated* auth methods, to avoid a
scenario where a Nomad admin updates an auth method
sometime in the future -- something innocent like a new
client secret -- and their OIDC provider doesn't like PKCE.
the main concern is that the provider won't like PKCE
in a totally confusing way. error messages rarely
say PKCE directly, so why the user's auth method
suddenly broke would be a big mystery.
this means that to enable it on existing auth methods,
you would set `OIDCDisablePKCE = false`, and the double-
negative doesn't feel right, so instead, swap the language,
so enabling it on *existing* methods reads sensibly, and to
disable it on *new* methods reads ok-enough:
`OIDCEnablePKCE = false`
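One way the "default on for new methods only" behavior could be sketched is shown below. Only the `OIDCEnablePKCE` name comes from the text above. The struct, the pointer representation of the field, and both helper functions are assumptions made for illustration:

```go
package main

// OIDCConfig is a hypothetical, simplified auth method config.
type OIDCConfig struct {
	// OIDCEnablePKCE is nil when the operator did not specify a value
	// (the pointer type is an assumption of this sketch).
	OIDCEnablePKCE *bool
}

// applyPKCEDefault turns PKCE on only when the auth method is being
// created and the operator left the setting unspecified. Updates to
// existing methods are returned unchanged.
func applyPKCEDefault(cfg OIDCConfig, isNew bool) OIDCConfig {
	if isNew && cfg.OIDCEnablePKCE == nil {
		enabled := true
		cfg.OIDCEnablePKCE = &enabled
	}
	return cfg
}

// pkceEnabled treats an unspecified value as disabled.
func pkceEnabled(cfg OIDCConfig) bool {
	return cfg.OIDCEnablePKCE != nil && *cfg.OIDCEnablePKCE
}
```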
The CSI workload we're using for upgrade testing seems to be flaky on
startup. The plugin jobs don't launch in a timely fashion despite several
attempts. In order to not block running the rest of the upgrade testing, let's
disable this workload temporarily. We'll fix this in NET-12430.
Ref: https://hashicorp.atlassian.net/browse/NET-12430
* modify rawexec TaskConfig and Config to accept envvar denylist
* update rawexec driver docs to include deniedEnvars options
---------
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
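As a rough illustration of what an environment variable denylist does, the sketch below drops denied names from a task environment. The `filterEnv` function is hypothetical, and exact, case-sensitive name matching is an assumption. The real driver option may match differently:

```go
package main

// filterEnv returns a copy of env without any variable whose name
// appears in denied. Matching is exact and case-sensitive here, which
// is an assumption of this sketch.
func filterEnv(env map[string]string, denied []string) map[string]string {
	deny := make(map[string]struct{}, len(denied))
	for _, name := range denied {
		deny[name] = struct{}{}
	}
	out := make(map[string]string, len(env))
	for k, v := range env {
		if _, blocked := deny[k]; !blocked {
			out[k] = v
		}
	}
	return out
}
```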
Testing found that if you create or register a dynamic host volume in a
non-existing namespace, the volume gets created on the client but then we can't
write it to state. Add a check for this in the initial validation.
In #24658 we fixed a bug around client restarts where we would not assert
network namespaces existed and were properly configured when restoring
allocations. We introduced a call to the CNI `Check` method so that the plugins
could report correct config. But when we get an error from this call, we don't
log it unless the error is fatal. This makes it challenging to debug the case
where the initial check fails but we tear down the network and try again (as
described in #25510). Add a noisy log line here.
Ref: https://github.com/hashicorp/nomad/pull/24658
Ref: https://github.com/hashicorp/nomad/issues/25510
All Nomad servers should be running v1.10.0 before the DHV feature
can be used. Without this, it is possible for a write to succeed
and cause an immediate loss of leadership and subsequent failure to
establish leadership.
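A version gate of this kind might look roughly like the sketch below. All function names are hypothetical, and the parser only handles plain `MAJOR.MINOR.PATCH` strings (pre-release and metadata suffixes are out of scope for this sketch). Versions are compared numerically because a string comparison would rank "1.9.7" above "1.10.0":

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion parses "MAJOR.MINOR.PATCH", with an optional "v" prefix.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("expected MAJOR.MINOR.PATCH, got %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("invalid component %q in %q: %w", p, v, err)
		}
		out[i] = n
	}
	return out, nil
}

// atLeast reports whether v >= minimum, comparing component by component.
func atLeast(v, minimum [3]int) bool {
	for i := 0; i < 3; i++ {
		if v[i] != minimum[i] {
			return v[i] > minimum[i]
		}
	}
	return true
}

// allServersMeetMinimum reports whether every server version is at
// least the minimum, so a feature can be refused until the whole
// cluster has been upgraded.
func allServersMeetMinimum(servers []string, minimum string) (bool, error) {
	minV, err := parseVersion(minimum)
	if err != nil {
		return false, err
	}
	for _, s := range servers {
		v, err := parseVersion(s)
		if err != nil {
			return false, err
		}
		if !atLeast(v, minV) {
			return false, nil
		}
	}
	return true, nil
}
```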
* Docs: Fix broken links in main for 1.10 release
* Implement Tim's suggestions
* Remove link to Portworx from ecosystem page
* remove "Portworx" since Portworx 3.2 no longer supports Nomad