Add fallback logic in cniToAllocNet to populate the Address field with the IPv6 address when no IPv4 address is available. This change ensures compatibility with existing code that relies on the Address field for service registration, particularly in scenarios where only IPv6 addresses are present.
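A minimal sketch of the fallback (the struct is trimmed to the two fields named above, and the helper name is illustrative, not Nomad's actual code):

```go
package main

import "fmt"

// AllocNetworkStatus mirrors only the two address fields discussed
// above; the rest of the real struct is omitted.
type AllocNetworkStatus struct {
	Address     string // IPv4
	AddressIPv6 string
}

// applyIPv6Fallback shows the shape of the fallback: when no IPv4
// address was found, reuse the IPv6 address so callers that only read
// Address (e.g. service registration) still get a value.
func applyIPv6Fallback(s *AllocNetworkStatus) {
	if s.Address == "" && s.AddressIPv6 != "" {
		s.Address = s.AddressIPv6
	}
}

func main() {
	s := &AllocNetworkStatus{AddressIPv6: "fd00::1"}
	applyIPv6Fallback(s)
	fmt.Println(s.Address) // prints fd00::1
}
```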
Add multiple test cases to validate the behavior of CNI allocation when handling IPv6-only and dualstack configurations. The new tests ensure that the first address is selected correctly from multiple addresses across interfaces, maintaining consistent behavior with IPv4. This improves coverage for edge cases in CNI result processing and reinforces the recent fixes for IPv6 support.
Fix a regression introduced in Nomad 1.9.0 where CNI bridge networking with
IPv6-only interfaces would fail with 'no interface with an address' error.
The issue was that while the code correctly populated the AddressIPv6 field
for IPv6 addresses, several validation checks only examined the Address field
(IPv4), causing IPv6-only configurations to be rejected.
Changes:
- Update interface selection logic in cniToAllocNet to accept interfaces with
either IPv4 or IPv6 addresses (not just IPv4)
- Update fallback logic to check both Address and AddressIPv6 fields
- Update error check to only fail when both IPv4 and IPv6 are missing
- Update AllocNetworkStatus.IsZero() to check AddressIPv6 field
This allows CNI configurations with IPv6-only interfaces to work correctly,
restoring functionality from Nomad 1.8.x.
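A minimal sketch of the corrected checks, with the struct trimmed to the relevant fields and `validate` standing in for the inline checks in cniToAllocNet:

```go
package structs

import "errors"

// AllocNetworkStatus carries only the fields relevant here.
type AllocNetworkStatus struct {
	InterfaceName string
	Address       string // IPv4
	AddressIPv6   string
}

// IsZero reflects the updated emptiness check: a status holding only
// an IPv6 address is no longer treated as empty.
func (s *AllocNetworkStatus) IsZero() bool {
	return s.InterfaceName == "" && s.Address == "" && s.AddressIPv6 == ""
}

// validate reflects the corrected error check: fail only when both the
// IPv4 and IPv6 addresses are missing, instead of requiring IPv4.
func (s *AllocNetworkStatus) validate() error {
	if s.Address == "" && s.AddressIPv6 == "" {
		return errors.New("no interface with an address")
	}
	return nil
}
```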
Fixes #26905
Add a test case to verify that CNI results containing only IPv6 addresses
are handled correctly. This is a regression test for a bug introduced in
GH-23882 where IPv6-only interfaces would fail with 'no interface with an
address' error.
The test verifies that when a CNI plugin returns only an IPv6 address
(without IPv4), the allocation network status should be properly populated
with the IPv6 address in the AddressIPv6 field.
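A sketch of the behavior under test, with `pickAddress` as a hypothetical stand-in for the selection logic in cniToAllocNet:

```go
package cni

import "testing"

// pickAddress is an illustrative stand-in for the selection logic in
// cniToAllocNet: prefer IPv4, fall back to IPv6.
func pickAddress(ipv4, ipv6 string) (string, bool) {
	switch {
	case ipv4 != "":
		return ipv4, true
	case ipv6 != "":
		return ipv6, true
	default:
		return "", false
	}
}

func TestPickAddress(t *testing.T) {
	cases := []struct {
		name, ipv4, ipv6, want string
		ok                     bool
	}{
		{"ipv4 only", "10.0.0.1", "", "10.0.0.1", true},
		{"ipv6 only", "", "fd00::1", "fd00::1", true},
		{"dualstack prefers ipv4", "10.0.0.1", "fd00::1", "10.0.0.1", true},
		{"no address", "", "", "", false},
	}
	for _, tc := range cases {
		got, ok := pickAddress(tc.ipv4, tc.ipv6)
		if got != tc.want || ok != tc.ok {
			t.Errorf("%s: got (%q, %v), want (%q, %v)",
				tc.name, got, ok, tc.want, tc.ok)
		}
	}
}
```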
In #26831 we're preventing unexpected node RPCs by ensuring that the volume
watcher only unpublishes when allocations are client-terminal.
To mitigate any remaining similar issues, add serialization of node plugin RPCs,
as we did for controller plugin RPCs in #17996 and as recommended ("SHOULD") by
the CSI specification. Here we can do per-volume serialization rather than
per-plugin serialization.
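A minimal sketch of per-volume serialization, assuming a lock table keyed by volume ID (names are illustrative, not the actual Nomad implementation):

```go
package csi

import "sync"

// volumeLocks serializes node plugin RPCs per volume ID.
type volumeLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func (v *volumeLocks) get(volID string) *sync.Mutex {
	v.mu.Lock()
	defer v.mu.Unlock()
	if v.locks == nil {
		v.locks = make(map[string]*sync.Mutex)
	}
	l, ok := v.locks[volID]
	if !ok {
		l = new(sync.Mutex)
		v.locks[volID] = l
	}
	return l
}

// withVolume holds the per-volume lock for the duration of the RPC, so
// RPCs for different volumes on the same plugin still run concurrently.
func (v *volumeLocks) withVolume(volID string, rpc func() error) error {
	l := v.get(volID)
	l.Lock()
	defer l.Unlock()
	return rpc()
}
```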
Reorder the methods of the `volumeManager` in the client so that each interface
method and its directly-associated helper methods read from top-to-bottom,
instead of a mix of directions.
Ref: https://github.com/hashicorp/nomad/pull/17996
Ref: https://github.com/hashicorp/nomad/pull/26831
The test was incorrectly writing to state that registration had
been finished before writing the node identity token. This is the
opposite of what happens in the client code and caused a timing
issue which meant we read registration as completed before we had
the identity available and therefore returned the secret ID.
In Nomad Enterprise we can fingerprint multiple Consul datacenters. If none of
them is `"default"` then we end up with warning logs about adding a "link".
The `Link` field on the `Node` struct is a map of attributes that only
contributes to the node's computed hash. The `"consul"` key's value is derived
from the `unique.consul.name` attribute, which only exists if there's a default
Consul cluster.
Update the fingerprint to skip setting the link field if there's no
`unique.consul.name`, and lower the warning log for malformed fields to debug.
The link is only a minor scheduling optimization, largely captured by existing
Consul fields in the node's computed class; the only reason not to remove it
entirely is to avoid changing computed classes on existing large clusters.
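A minimal sketch of the updated fingerprint logic, using simplified maps in place of the real `Node` fields:

```go
package consul

// setConsulLink sketches the fix: only set the "consul" link when the
// unique.consul.name attribute exists, i.e. a default Consul cluster
// was fingerprinted.
func setConsulLink(attributes, links map[string]string) {
	name, ok := attributes["unique.consul.name"]
	if !ok || name == "" {
		return // no default Consul cluster: skip the link, no warning
	}
	links["consul"] = name
}
```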
Fixes: https://github.com/hashicorp/nomad/issues/26781
Ref: https://hashicorp.atlassian.net/browse/NMD-998
The allocation network hook was not properly restoring network status from state when the network had previously been set up. This led to missing environment variables and a misconfigured hosts file and resolv.conf when a task was restarted after the Nomad agent had restarted.
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
When checking if the target path is within the root path, the
target path is trimmed and then file information is fetched. If
the trimmed path does not exist, then the full target path is
not within the root. In the case of receiving a not exist error,
simply return false.
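An illustrative version of the check; the key change is the os.IsNotExist branch, which returns false instead of propagating an error:

```go
package escapingfs

import (
	"os"
	"path/filepath"
	"strings"
)

// targetWithinRoot sketches the check described above (helper name is
// illustrative). A trimmed path that does not exist means the target
// cannot be within the root, so we simply return false.
func targetWithinRoot(root, target string) (bool, error) {
	rel, err := filepath.Rel(root, target)
	if err != nil || strings.HasPrefix(rel, "..") {
		return false, nil
	}
	if _, err := os.Stat(target); err != nil {
		if os.IsNotExist(err) {
			return false, nil // trimmed path missing: not within root
		}
		return false, err
	}
	return true, nil
}
```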
Don't require "bridge" network mode when using connect{}.
We document this as "at your own risk" because CNI configuration is so
flexible that we can't guarantee a user's network will work, but Nomad's
"bridge" CNI config may be used as a reference.
Currently, every time a client starts, it creates a new Consul token per
service or task. This PR changes that behaviour: the client persists Consul
ACL tokens to its state store and looks up an existing token before creating
a new one.
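A minimal sketch of the lookup-before-create flow, assuming a simplified state store interface (the real client state API differs):

```go
package consul

// stateStore is a simplified stand-in for the client state database.
type stateStore interface {
	GetConsulToken(key string) (string, error)
	PutConsulToken(key, token string) error
}

// getOrCreateToken sketches the new flow: look up a persisted token
// for the service or task first, and only derive a new one when none
// exists, persisting it for the next client restart.
func getOrCreateToken(db stateStore, key string, create func() (string, error)) (string, error) {
	if tok, err := db.GetConsulToken(key); err == nil && tok != "" {
		return tok, nil // reuse the token persisted on a previous start
	}
	tok, err := create()
	if err != nil {
		return "", err
	}
	return tok, db.PutConsulToken(key, tok)
}
```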
Fixes: #20184
Fixes: #20185
This adds artifact inspection after download to detect any issues
with the content fetched. Currently this means checking for any
symlinks within the artifact that resolve outside the task or
allocation directories. On platforms where lockdown is available
(some Linux) this inspection is not performed.
The inspection can be disabled with the DisableArtifactInspection
option. A dedicated option for disabling this behavior allows
the DisableFilesystemIsolation option to be enabled but still
have artifacts inspected after download.
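A minimal sketch of the symlink inspection, assuming a walk over the download destination (helper and parameter names are illustrative):

```go
package artifact

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// inspectSymlinks walks the artifact destination and fails if any
// symlink resolves outside the allowed directories (the task and
// allocation dirs).
func inspectSymlinks(dest string, allowed []string) error {
	return filepath.WalkDir(dest, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.Type()&os.ModeSymlink == 0 {
			return nil
		}
		resolved, err := filepath.EvalSymlinks(path)
		if err != nil {
			return err
		}
		for _, dir := range allowed {
			if resolved == dir || strings.HasPrefix(resolved, dir+string(os.PathSeparator)) {
				return nil
			}
		}
		return fmt.Errorf("artifact symlink %q resolves outside sandbox: %q", path, resolved)
	})
}
```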
Corrects two minor bugs that prevented proper deployment monitoring for system
jobs: populating the new deployment field of the system scheduler object, and
correcting allocrunner health checks that were guarded not to run on system
jobs.
When attempting to clone a git repository within a sandbox that is
configured with landlock, the clone will fail with error messages
related to inability to get random bytes for a temporary file.
Including a read rule for `/dev/urandom` resolves the error
and the git clone works as expected.
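A sketch of the added rule, assuming the shoenig/go-landlock API that Nomad's sandbox uses; the exact signatures here are from memory and should be checked against the library:

```go
package main

import (
	"log"

	landlock "github.com/shoenig/go-landlock"
)

func main() {
	// Assumed shoenig/go-landlock usage: add a read rule for
	// /dev/urandom alongside the sandbox's existing rules so git can
	// seed temporary files.
	l := landlock.New(
		landlock.File("/dev/urandom", "r"),
		// ... existing rules for the repository directory, git binary, etc.
	)
	if err := l.Lock(landlock.Mandatory); err != nil {
		log.Fatalf("failed to apply landlock rules: %v", err)
	}
}
```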
The Nomad clients store their Nomad identity in memory and within
their state store. While active, it is not possible to dump the
state to view the stored identity token, so having a way to view
the current claims while running aids debugging and operations.
This change adds a client identity workflow, allowing operators
to view the current claims of the node's identity. It does not
return any of the signing key material.
This change implements the client -> server workflow for Nomad
node introduction. A Nomad node can optionally be started with an
introduction token, which is a signed JWT containing claims for
the node registration. The server handles this according to the
enforcement configuration.
The introduction token can be provided by env var, cli flag, or
by placing it within a default filesystem location. The latter
option does not override the CLI or env var.
The region claim has been removed from the initial claims set of
the intro identity. This boundary is guarded by mTLS and aligns
with the node identity.
* Add -log-file-export and -log-lookback commands to add historical logs to
debug capture
* use monitor.PrepFile() helper for other historical log tests
* Add MonitorExport command and handlers
* Implement autocomplete
* Require nomad in serviceName
* Fix race in StreamReader.Read
* Add and use framer.Flush() to coordinate function exit
* Add LogFile to client/Server config and read NomadLogPath in rpcHandler instead of HTTPServer
* Parameterize StreamFixed stream size
The Nomad client will have its identity renewed according to the
TTL which defaults to 24h. In certain situations such as root
keyring rotation, operators may want to force clients to renew
their identities before the TTL threshold is met. This change
introduces a client HTTP and RPC endpoint which will instruct the
node to request a new identity at its next heartbeat. This can be
used via the API or a new command.
While this is a manual intervention step on top of any keyring
rotation, it dramatically reduces the initial feature complexity
as it provides an asynchronous and efficient method of renewal that
utilises existing functionality.
Nomad servers, if upgraded, can return node identities as part of
the register and update/heartbeat response objects. The Nomad
client will now handle this and store it as appropriate within its
memory and statedb.
The client will now use any stored identity for RPC authentication
with a fallback to the secretID. This supports upgrade paths where
the Nomad clients are updated before the Nomad servers.
When the client handles an update status response from the server,
it modifies its heartbeat stop tracker with a time set once the
RPC call returns. It optionally also emits a log message, if the
client suspects it has missed a heartbeat.
These times were originally tracked by two different calls to the
time function which were executed 2 microseconds apart. There is
no reason we cannot use a single time variable for both uses which
saves us one whole call to time.Now.
* fix: initialize the topology of the processors to avoid nil pointers
* func: initialize topology to avoid nil pointers
* fix: update the new public method for NodeProcessorResources
The Nomad client will persist its own identity within its state
store for restart persistence. The added benefit of using it over
the filesystem is that it supports transactions. This is useful
when considering the identity will be renewed periodically.
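A sketch of why transactions help, using plain bbolt in place of Nomad's state store wrapper (bucket and key names are illustrative): the stored token is replaced atomically when the identity is renewed.

```go
package state

import bolt "go.etcd.io/bbolt"

// putNodeIdentity stores the identity token inside a single write
// transaction.
func putNodeIdentity(db *bolt.DB, token []byte) error {
	return db.Update(func(tx *bolt.Tx) error {
		b, err := tx.CreateBucketIfNotExists([]byte("node"))
		if err != nil {
			return err
		}
		// The renewal path replaces the old token atomically: either
		// the whole transaction commits or none of it does.
		return b.Put([]byte("identity_token"), token)
	})
}
```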
The mkdir plugin creates the directory and then chowns it. In the
event the chown command fails, we should attempt to remove the
directory. Without this, we leave directories on the client in
partial failure situations.
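A minimal sketch of the cleanup-on-failure behavior (the real plugin's signature differs):

```go
package mkdir

import "os"

// createOwnedDir creates the directory and chowns it; if the chown
// fails, it attempts to remove the directory so a partial failure
// leaves nothing behind on the client.
func createOwnedDir(path string, uid, gid int) error {
	if err := os.Mkdir(path, 0o755); err != nil {
		return err
	}
	if err := os.Chown(path, uid, gid); err != nil {
		// Best-effort cleanup; the chown error is the one to return.
		_ = os.RemoveAll(path)
		return err
	}
	return nil
}
```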
The `killTasks` function will kill all of the alloc runner's
task runners. If a task runner's task has already completed,
killing the task runner can cause confusion, because the task
event shows that the task was signaled even though it had
already completed.
To prevent this, a check is performed when creating the task
event to determine whether the task has completed. If it has,
no task event is created, so killing the task runner adds no
extra task event.
When a Nomad client registers or re-registers, the RPC handler will
generate and return a node identity if required. When an identity
is generated, the signing key ID will be stored within the node
object, to ensure a root key is not deleted while it is still in use.
During normal client operation it will periodically heartbeat to
the Nomad servers to indicate aliveness. The RPC handler that
is used for this action has also been updated to conditionally
perform identity generation. Performing it here means no extra RPC
handlers are required and we inherit the jitter in identity
generation from the heartbeat mechanism.
The identity generation check methods are performed from the RPC
request arguments, so they are scoped to the required behaviour and
can handle the nuance of each RPC. Failure to generate an identity
is considered terminal to the RPC call. The client will include
behaviour to retry this error, which is always caused by the
encrypter not being ready unless the server's keyring has been
corrupted.