The NUMA topology struct field `NodeIDs` is an `idset.Set`, which has no public
members. As a result, this field is never serialized via msgpack and persisted
in state. When `numa.affinity = "prefer"`, the scheduler dereferences this nil
field and panics the scheduler worker.
Ideally we would fix this by adding a msgpack serialization extension, but
because the field already exists and is simply always empty, doing so would
break RPC wire compatibility across upgrades. Instead, create a new field
that's populated at the same time we populate the more useful `idset.Set`, and
repopulate the set on demand.
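A minimal sketch of the shape of the fix, with illustrative types and names
rather than Nomad's real ones:
```
package numa

// IDSet stands in for idset.Set: no exported members, so msgpack drops it.
type IDSet struct{ ids map[uint64]struct{} }

func FromSlice(s []uint64) *IDSet {
	set := &IDSet{ids: make(map[uint64]struct{}, len(s))}
	for _, id := range s {
		set.ids[id] = struct{}{}
	}
	return set
}

func (s *IDSet) Slice() []uint64 {
	out := make([]uint64, 0, len(s.ids))
	for id := range s.ids {
		out = append(out, id)
	}
	return out
}

type Topology struct {
	nodeIDs      *IDSet   // never serialized: no public members
	NodeIDsSlice []uint64 // plain slice msgpack can persist; populated alongside nodeIDs
}

func (t *Topology) SetNodes(ids *IDSet) {
	t.nodeIDs = ids
	t.NodeIDsSlice = ids.Slice() // keep the serializable copy in sync
}

// GetNodes repopulates the set on demand after deserialization, so the
// scheduler never dereferences a nil set.
func (t *Topology) GetNodes() *IDSet {
	if t.nodeIDs == nil {
		t.nodeIDs = FromSlice(t.NodeIDsSlice)
	}
	return t.nodeIDs
}
```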
Fixes: https://hashicorp.atlassian.net/browse/NET-9924
The `testing.go` test helpers file for the driver manager initializes the NUMA
scan as a package-global variable. This causes it to be pulled into production
builds, so even running commands like `nomad version` triggers the NUMA scan.
Move the scan into the test helper setup.
This PR fixes a bug where the Nomad client would leave behind an empty
directory created on behalf of tasks making use of the unveil filesystem
isolation mode (i.e. using the exec2 task driver). Once unmounting is
complete, we should
remember to also delete the directory.
Fixes: #22433
When logging into a JWT auth method, we need to explicitly supply the Consul
admin partition if the local Consul agent is in a partition. We can't derive
this from the Nomad agent's configuration because the Consul agent's own
configuration is canonical, so instead we get the partition from the
fingerprint (if available). This changeset updates the Consul client
constructor so that we close over the partition from the fingerprint.
Ref: https://hashicorp.atlassian.net/browse/NET-9451
This is the CE side of an Enterprise-only feature; a job trying to use this in
CE will fail to validate. To enable daily-scheduled execution entirely
client-side, a job may now contain:
task "name" {
schedule {
cron {
start = "0 12 * * * *" # may not include "," or "/"
end = "0 16" # partial cron, with only {minute} {hour}
timezone = "EST" # anything in your tzdb
}
}
...
Everything about the allocation will be placed as usual, but if outside the
specified schedule, the taskrunner will block on the client, waiting on the
schedule start, before proceeding with the task driver execution. This
includes a taskrunner hook which watches for the end of the schedule, at
which point it will kill the task. Then, if restarts allow, a new task will
start and again block waiting for the start, and so on.
This also includes all the plumbing required to pipe API calls through from
command->api->agent->server->client, so that tasks can be force-run,
force-paused, or made to resume the schedule on demand.
The CSI hook for each allocation that claims a volume runs concurrently. If a
call to `MountVolume` happens at the same time as a call to `UnmountVolume` for
the same volume, it's possible for the second alloc to detect the volume has
already been staged, then for the original alloc to unpublish and unstage it,
only for the second alloc to then attempt to publish a volume that's been
unstaged.
The usage tracker on the volume manager was intended to prevent this behavior,
but the call to claim the volume was made only after staging and publishing
were complete. Move the call to claim the volume for the usage tracker to the
top of the `MountVolume` workflow to prevent the volume from being unstaged
until all consuming allocations have called `UnmountVolume`.
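A condensed sketch of the new ordering, with hypothetical receiver and helper
names:
```
package csi

import "context"

// stub types: the point here is the ordering, not the API
type volume struct{ id string }
type allocation struct{ id string }

type tracker interface {
	Claim(allocID, volID string)
	Free(allocID, volID string)
}

type volumeManager struct{ usageTracker tracker }

func (v *volumeManager) stageVolume(ctx context.Context, vol *volume) error { return nil }
func (v *volumeManager) publishVolume(ctx context.Context, vol *volume, alloc *allocation) error {
	return nil
}

// MountVolume claims the volume for the usage tracker before staging and
// publishing, so a concurrent UnmountVolume for the same volume can't
// unstage it while this allocation is still mounting it.
func (v *volumeManager) MountVolume(ctx context.Context, vol *volume, alloc *allocation) (err error) {
	v.usageTracker.Claim(alloc.id, vol.id)
	defer func() {
		if err != nil {
			v.usageTracker.Free(alloc.id, vol.id) // roll back the claim on failure
		}
	}()

	if err = v.stageVolume(ctx, vol); err != nil {
		return err
	}
	return v.publishVolume(ctx, vol, alloc)
}
```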
Fixes: https://github.com/hashicorp/nomad/issues/20424
When the allocation is stopped, we deregister the service in the alloc runner's
`PreKill` hook. This ensures we delete the service registration and wait for the
shutdown delay before shutting down the tasks, so that workloads can drain their
connections. However, the call to remove the workload only logs errors and never
retries them.
Add a short retry loop to the `RemoveWorkload` method for Nomad services, so
that transient errors give us an extra opportunity to deregister the service
before the tasks are stopped and we need to fall back to the data integrity
improvements implemented in #20590.
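A minimal sketch of what such a retry loop can look like (the helper and the
retry budget are illustrative, not the real implementation):
```
package main

import (
	"log"
	"time"
)

// removeWithRetry gives transient deregistration errors a few more chances
// before tasks stop and we must rely on the #20590 cleanup.
func removeWithRetry(deregister func() error, logger *log.Logger) {
	var err error
	for attempt := 0; attempt < 3; attempt++ {
		if err = deregister(); err == nil {
			return
		}
		// brief backoff; this runs in the PreKill hook ahead of the
		// shutdown delay, so keep the total wait short
		time.Sleep(time.Duration(attempt+1) * 100 * time.Millisecond)
	}
	logger.Printf("failed to deregister service after retries: %v", err)
}
```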
Ref: https://github.com/hashicorp/nomad/issues/16616
In the Unveil filesystem isolation mode we were mounting the shared
alloc dir with the UID/GID of the user of the task dir being mounted
and 0710 filesystem permissions. This was causing the actual task dir
to become inaccessible to other tasks in the allocation (a race where
the last mounter wins). Instead mount the shared alloc dir as nobody
with 0777 filesystem permissions.
This change exposes CNI configuration details of a network namespace as
environment variables. This allows a task to use these values to configure
itself; a potential use case is running a Raft application that binds to the
IP and port configured by the bridge network mode.
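As a purely illustrative example of that use case, a task could configure
itself from such variables; the variable names below are placeholders, not
the names this change actually exposes:
```
package main

import (
	"fmt"
	"net"
	"os"
)

func main() {
	// placeholder names: read the namespace's address details from the
	// environment and bind a listener on them
	ip := os.Getenv("CNI_BRIDGE_IP")     // assumed variable name
	port := os.Getenv("CNI_BRIDGE_PORT") // assumed variable name
	ln, err := net.Listen("tcp", net.JoinHostPort(ip, port))
	if err != nil {
		fmt.Fprintln(os.Stderr, "listen:", err)
		os.Exit(1)
	}
	defer ln.Close()
	fmt.Println("raft application listening on", ln.Addr())
}
```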
Whenever the "exec" task driver is being used, nomad runs a plug in that in time runs the task on a container under the hood. If by any circumstance the executor is killed, the task is reparented to the init service and wont be stopped by Nomad in case of a job updated or stop.
This commit introduces two mechanisms to avoid this behaviour:
* Adds signal catching and handling to the executor, so in case of a SIGTERM, the signal will also be passed on to the task.
* Adds a pre start clean up of the processes in the container, ensuring only the ones the executor runs are present at any given time.
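A condensed sketch of the first mechanism, the signal catching and forwarding
(function and channel wiring are illustrative, not the executor's real API):
```
package main

import (
	"os"
	"os/exec"
	"os/signal"
	"syscall"
)

// runWithSignalForwarding starts the task process and relays SIGTERM/SIGINT
// to it, so killing the executor also stops the task instead of leaving it
// reparented to init.
func runWithSignalForwarding(cmd *exec.Cmd) error {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGTERM, syscall.SIGINT)
	defer signal.Stop(sigs)

	if err := cmd.Start(); err != nil {
		return err
	}
	done := make(chan error, 1)
	go func() { done <- cmd.Wait() }()

	for {
		select {
		case sig := <-sigs:
			// pass the signal on to the task
			_ = cmd.Process.Signal(sig)
		case err := <-done:
			return err
		}
	}
}
```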
CSI volumes are namespaced, but the client does not include the namespace in
the staging mount path. This causes CSI volumes with the same volume ID but
different namespaces to collide if they happen to be placed on the same host.
The per-allocation paths don't need to be namespaced, because an allocation
can only mount volumes from its job's own namespace.
Rework the CSI hook tests to have more fine-grained control over the mock
on-disk state. Add tests covering upgrades from staging paths missing
namespaces.
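For illustration, a sketch of the namespaced staging path construction (the
directory layout is simplified, not the client's exact on-disk schema):
```
package main

import "path/filepath"

// stagingDirForVolume includes the namespace so two volumes with the same
// ID in different namespaces no longer collide on the same host.
func stagingDirForVolume(mountRoot, namespace, volID, usage string) string {
	return filepath.Join(mountRoot, "staging", namespace, volID, usage)
}

// allocDirForVolume stays un-namespaced: an allocation can only mount
// volumes from its job's own namespace.
func allocDirForVolume(mountRoot, allocID, volID string) string {
	return filepath.Join(mountRoot, "per-alloc", allocID, volID)
}
```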
Fixes: https://github.com/hashicorp/nomad/issues/18741
We bring in `containernetworking/plugins` for the contents of a single file,
which we use in a few places for running a goroutine in a specific network
namespace. This code hasn't needed an update in a couple of years, and a good
chunk of what we need was previously vendored into `client/lib/nsutil`
already.
Updating the library via dependabot is causing errors in Docker driver tests
because it updates a lot of transitive dependencies, and it's bringing in a
pile of new transitive dependencies like opentelemetry. Avoid this problem
going forward by vendoring the remaining code we hadn't already.
Ref: https://github.com/hashicorp/nomad/pull/20146
* drivers/raw_exec: enable setting cgroup override values
This PR enables configuration of cgroup override values on the `raw_exec`
task driver. WARNING: setting cgroup override values eliminates any
guarantee Nomad can make about resource availability for *any* task on
the client node.
For cgroup v2 systems, set a single unified cgroup path using `cgroup_v2_override`.
The path may be either absolute or relative to the cgroup root.
```
config {
  cgroup_v2_override = "custom.slice/app.scope"
}
```
or
```
config {
  cgroup_v2_override = "/sys/fs/cgroup/custom.slice/app.scope"
}
```
For cgroup v1 systems, set a per-controller path for each controller using
`cgroup_v1_override`. The path(s) may be either absolute or relative to
the controller root.
```
config {
  cgroup_v1_override = {
    "pids":   "custom/app",
    "cpuset": "custom/app",
  }
}
```
or
```
config {
  cgroup_v1_override = {
    "pids":   "/sys/fs/cgroup/pids/custom/app",
    "cpuset": "/sys/fs/cgroup/cpuset/custom/app",
  }
}
```
* drivers/rawexec: ensure only one of v1/v2 cgroup override is set
* drivers/raw_exec: executor should error if setting cgroup does not work
* drivers/raw_exec: create cgroups in raw_exec tests
* drivers/raw_exec: ensure we fail to start if custom cgroup set and non-root
* move custom cgroup func into shared file
---------
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
When available, we provide an environment variable `CONSUL_TOKEN` to tasks,
but this isn't the environment variable expected by the Consul CLI. Job
specifications like deploying an API Gateway become noticeably nicer if we can
instead provide the expected `CONSUL_HTTP_TOKEN` environment variable.
The `consul_hook` in the allocrunner gets a separate Consul token for each
task, even if the tasks' identities have the same name, but uses the identity
name as the key to the alloc hook resources map. This means the last task in
the group overwrites the Consul tokens of all other tasks.
Fix this by adding the task name to the key in the allocrunner's
`consul_hook`, and update the taskrunner's `consul_hook` to expect the task
name in the key.
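A tiny sketch of the keying fix (the helper name is illustrative):
```
package main

import "fmt"

// consulTokenKey folds the task name into the hook resources key, so two
// tasks whose identities share a name no longer clobber each other's tokens.
func consulTokenKey(identityName, taskName string) string {
	return fmt.Sprintf("%s/%s", identityName, taskName)
}
```
The taskrunner's `consul_hook` then looks tokens up using the same composite
key.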
Fixes: https://github.com/hashicorp/nomad/issues/20374
Fixes: https://hashicorp.atlassian.net/browse/NOMAD-614
The `mock_driver` is an internal task driver used mostly for testing and
simulating workloads. During the allocrunner v2 work (#4792) its name
changed from `mock_driver` to just `mock` and then back to
`mock_driver`, but the fingerprint key was kept as `driver.mock`.
This results in tasks configured with `driver = "mock"` being scheduled
(because Nomad thinks the client has a task driver called `mock`) but
failing to actually run (because the Nomad client can't find a driver
called `mock` in its catalog).
Fingerprinting the right name prevents the job from being scheduled in
the first place.
Also removes mentions of the mock driver from documentation since it's an
internal driver and not available in any production release.
Ports fit in a uint16 at most, but we have a few places in the recent tproxy
code where we were parsing them as 64-bit wide integers and then downcasting
them to `int`, which is technically unsafe and triggers code scanning alerts.
In practice we've validated the range elsewhere and don't build for 32-bit
platforms. This changeset fixes the parsing to make everything a bit more
robust and silence the alert.
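For example, parsing with an explicit 16-bit size makes the range check part
of the parse and removes the unchecked downcast:
```
package main

import (
	"fmt"
	"strconv"
)

func parsePort(s string) (uint16, error) {
	p, err := strconv.ParseUint(s, 10, 16) // rejects anything above 65535
	if err != nil {
		return 0, fmt.Errorf("invalid port %q: %w", s, err)
	}
	return uint16(p), nil
}
```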
Fixes: https://github.com/hashicorp/nomad-enterprise/security/code-scanning/444
Services can have some of their string fields interpolated. The new Workload
Identity flow doesn't interpolate the services before requesting signed
identities or using those identities to get Consul tokens.
Add support for interpolation to the WID manager and the Consul tokens hook by
providing both with a taskenv builder. Add an "interpolate workload" field to
the WI handle to allow passing the original workload name to the server so the
server can find the correct service to sign.
This changeset also makes two related test improvements:
* Remove the mock WID manager, which was only used in the Consul hook tests and
isn't necessary so long as we provide the real WID manager with the mock
signer and never call `Run` on it. It wasn't feasible to exercise the correct
behavior without this refactor, as the mocks were bypassing the new code.
* Fix swapped expect-vs-actual assertions in the `consul_hook` tests.
Fixes: https://github.com/hashicorp/nomad/issues/20025
The `getPortMapping` method forces callers to handle two different data
structures, but only one caller cares about it. We don't want to return a
single map or slice because the `cni.PortMapping` object doesn't include a
label field that we need for tproxy. Return a new data structure that closes
over both a slice of `cni.PortMapping` and a map of label to index in that
slice.
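A sketch of what such a data structure can look like (type and method names
are illustrative):
```
package main

// PortMapping mirrors the relevant cni.PortMapping fields for this sketch.
type PortMapping struct {
	HostPort      int
	ContainerPort int
	Protocol      string
}

// Ports closes over both the slice handed to CNI and a label index for
// the one caller (tproxy) that needs to look mappings up by label.
type Ports struct {
	mappings []PortMapping
	labelIdx map[string]int // label -> index into mappings
}

func (p *Ports) Slice() []PortMapping { return p.mappings }

func (p *Ports) ByLabel(label string) (PortMapping, bool) {
	i, ok := p.labelIdx[label]
	if !ok {
		return PortMapping{}, false
	}
	return p.mappings[i], true
}
```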
When the `client.servers` block is parsed, we split the port from the
address. This does not correctly handle IPv6 addresses when they are in URL
format (wrapped in brackets), which we require to disambiguate the port and
address.
Fix the parser to correctly split out the port and handle a missing port value
for IPv6. Update the documentation to make the URL format requirement clear.
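A sketch of the splitting logic using the standard library (error handling
condensed):
```
package main

import (
	"fmt"
	"net"
	"strings"
)

// splitServerAddr handles both "host:port" and bracketed IPv6 URL format,
// falling back to a default port when none is given.
func splitServerAddr(addr, defaultPort string) (host, port string, err error) {
	host, port, err = net.SplitHostPort(addr)
	if err == nil {
		return host, port, nil
	}
	// no port present: accept "example.com" or "[::1]", stripping brackets
	host = strings.TrimSuffix(strings.TrimPrefix(addr, "["), "]")
	if host == "" {
		return "", "", fmt.Errorf("invalid server address %q", addr)
	}
	return host, defaultPort, nil
}
```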
Fixes: https://github.com/hashicorp/nomad/issues/20310
When `transparent_proxy` block is present and the network mode is `bridge`, use
a different CNI configuration that includes the `consul-cni` plugin. Before
invoking the CNI plugins, create a Consul SDK `iptables.Config` struct for the
allocation. This includes:
* All the `transparent_proxy` block fields are used
* The reserved ports are added to the inbound exclusion list so the alloc is
reachable from outside the mesh
* The `expose` blocks and `check` blocks with `expose=true` are added to the
inbound exclusion list so health checks work.
The `iptables.Config` is then passed as a CNI argument to the `consul-cni`
plugin.
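A hedged sketch of assembling that config: the struct is the Consul SDK's
`iptables.Config`, but which fields Nomad sets and how the values are derived
are simplified assumptions here:
```
package main

import "github.com/hashicorp/consul/sdk/iptables"

// iptablesConfigForAlloc builds the config passed to the consul-cni plugin.
// reservedPorts and exposePorts land on the inbound exclusion list so the
// alloc stays reachable from outside the mesh and health checks keep working.
func iptablesConfigForAlloc(proxyUID string, inbound, outbound int, reservedPorts, exposePorts []string) iptables.Config {
	cfg := iptables.Config{
		ProxyUserID:       proxyUID,
		ProxyInboundPort:  inbound,
		ProxyOutboundPort: outbound,
	}
	cfg.ExcludeInboundPorts = append(cfg.ExcludeInboundPorts, reservedPorts...)
	cfg.ExcludeInboundPorts = append(cfg.ExcludeInboundPorts, exposePorts...)
	return cfg
}
```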
Ref: https://github.com/hashicorp/nomad/issues/10628
This PR adds a job mutator which injects constraints on the job taskgroups
that make use of bridge networking. Creating a bridge network makes use of the
CNI plugins: bridge, firewall, host-local, loopback, and portmap. Starting
with Nomad 1.5 these plugins are fingerprinted on each node, so we can ensure
jobs that need them are correctly scheduled only on nodes where they are
available, as sketched below.
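A sketch of what the injected constraints can look like; the
`${attr.plugins.cni.version.<name>}` attributes follow the Nomad 1.5
fingerprints mentioned above, and the stub `Constraint` type stands in for
Nomad's:
```
package main

type Constraint struct {
	LTarget string // attribute to test
	RTarget string // value to compare against; unused for is_set
	Operand string
}

// bridgeNetworkConstraints requires each CNI plugin used by bridge
// networking to have been fingerprinted on the candidate node.
func bridgeNetworkConstraints() []*Constraint {
	plugins := []string{"bridge", "firewall", "host-local", "loopback", "portmap"}
	cs := make([]*Constraint, 0, len(plugins))
	for _, p := range plugins {
		cs = append(cs, &Constraint{
			LTarget: "${attr.plugins.cni.version." + p + "}",
			Operand: "is_set",
		})
	}
	return cs
}
```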
In #20007 we fixed a bug where the DNS configuration set by CNI plugins was not
threaded through to the task configuration. This resulted in a regression where
a DNS override set by `dockerd` was not respected for `bridge` mode
networking. Our existing handling of CNI DNS incorrectly assumed that the DNS
field would be empty, when in fact it contains a single empty DNS struct.
Handle this case correctly by checking whether the DNS struct we get back from
CNI has any nameservers, and ignore it if it doesn't. Expand test coverage of
this case.
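A minimal sketch of the check (the type is a simplified stand-in for the CNI
result's DNS field):
```
package main

// DNSConfig mirrors the nameserver-related fields of the CNI DNS result.
type DNSConfig struct {
	Nameservers []string
	Searches    []string
	Options     []string
}

// dnsFromCNI only treats the CNI result as a real DNS configuration when it
// actually names a nameserver; CNI hands back a single empty struct rather
// than an empty list, so a length check alone is not enough.
func dnsFromCNI(dns []DNSConfig) *DNSConfig {
	if len(dns) > 0 && len(dns[0].Nameservers) > 0 {
		d := dns[0]
		return &d
	}
	return nil // no CNI-provided DNS; defer to other sources
}
```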
Fixes: https://github.com/hashicorp/nomad/issues/20174
Replaces #18812
Upgraded with:
```
find . -name '*.go' -exec sed -i s/"github.com\/hashicorp\/go-msgpack\/codec"/"github.com\/hashicorp\/go-msgpack\/v2\/codec/" '{}' ';'
find . -name '*.go' -exec sed -i s/"github.com\/hashicorp\/net-rpc-msgpackrpc"/"github.com\/hashicorp\/net-rpc-msgpackrpc\/v2/" '{}' ';'
go get
go get -v -u github.com/hashicorp/raft-boltdb/v2
go get -v github.com/hashicorp/serf@5d32001edfaa18d1c010af65db707cdb38141e80
```
See https://github.com/hashicorp/go-msgpack/releases/tag/v2.1.0 for details.
When loading the client configuration, the user-specified `client.template`
block was not properly merged with the default values. As a result, if the
user set any `client.template` field, all the other fields defaulted to their
zero values instead of the documented defaults.
This changeset:
* Adds the missing `Merge` method for the client template config (sketched
below) and ensures it's called.
* Makes a single source of truth for the default template configuration,
instead of two different constructors.
* Extends the tests to cover the merge of a partial block better.
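A minimal sketch of the merge pattern, with illustrative fields standing in
for the real client template config:
```
package main

// ClientTemplateConfig sketch: pointer fields distinguish "unset" from a
// zero value, which is what makes a partial user block mergeable.
type ClientTemplateConfig struct {
	MaxStale           *string
	BlockQueryWaitTime *string
	DisableSandbox     bool
}

// Merge overlays non-nil user values onto the receiver (the defaults),
// so setting one field no longer zeroes all the others.
func (c *ClientTemplateConfig) Merge(o *ClientTemplateConfig) *ClientTemplateConfig {
	if o == nil {
		return c
	}
	result := *c
	if o.MaxStale != nil {
		result.MaxStale = o.MaxStale
	}
	if o.BlockQueryWaitTime != nil {
		result.BlockQueryWaitTime = o.BlockQueryWaitTime
	}
	if o.DisableSandbox {
		result.DisableSandbox = true
	}
	return &result
}
```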
Fixes: https://github.com/hashicorp/nomad/issues/20164
* Move task directory destroy logic from alloc_dir to task_dir
* Update errors to wrap error cause
* Use constants for file permissions
* Make multierror handling consistent.
* Make helpers for directory creation
* Move mount dir unlink to task_dir Unlink method
* Make constant for file mode 710
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
While investigating a report around possible consul-template shutdown issues,
which didn't bear fruit, I found that some of the logic around template runner
shutdown is unintuitive.
* Add some doc strings to the places where someone might think we should be
obviously stopping the runner or returning early.
* Mark context argument for `Poststart`, `Stop`, and `Update` hooks as unused.
No functional code changes.
* exec2: add client support for unveil filesystem isolation mode
This PR adds support for a new filesystem isolation mode, "Unveil". The
mode introduces an "alloc_mounts" directory where tasks have a user-owned
directory structure which is bind-mounted into the real alloc directory
structure. This enables a task driver to use landlock (and maybe the
real unveil on OpenBSD one day) to isolate a task to the task-owned
directory structure, providing sandboxing.
* actually create alloc-mounts-dir directory
* fix doc strings about alloc mount dir paths
* exec: add a client.users configuration block
For now just add min/max dynamic user values; soon we can also absorb
the "user.denylist" and "user.checked_drivers" options from the
deprecated client.options map.
* give the no-op pool implementation a better name
* use explicit error types to make referencing them cleaner in tests
* use import alias to not shadow package name
* exec2: implement dynamic workload users taskrunner hook
This PR impelements a TR hook for allocating dynamic workload users from
a pool managed by the Nomad client. This adds a new task driver Capability,
DynamicWorkloadUsers - which a task driver must indicate in order to make
use of this feature.
The client config plumbing is coming in a followup PR - in the RFC we
realized having a client.users block would be nice to have, with some
additional unrelated options being moved from the deprecated client.options
config.
* learn to spell
CNI plugins may set DNS configuration, but this isn't threaded through to the
task configuration so that we can write it to the `/etc/resolv.conf` file as
needed. Add the `AllocNetworkStatus` to the alloc hook resources so they're
accessible from the taskrunner. Any DNS entries provided by the user will
override these values.
Fixes: https://github.com/hashicorp/nomad/issues/11102
This PR is the first of two that will implement the new Disconnect block. In this PR the new block is introduced to be backwards compatible with the fields it will replace. For more information refer to this RFC and this ticket.
In order to provide a DNS address and port to Connect tasks configured for
transparent proxy, we need to fingerprint the Consul DNS address and port. The
client will pass this address/port to the iptables configuration provided to the
`consul-cni` plugin.
Ref: https://github.com/hashicorp/nomad/issues/10628
* Revert "vault: always renew tokens using the renewal loop (#18998)"
This reverts commit 7054fe1a8c.
* test: add case for concurrent Vault token renewal
The Nomad client expects certain cgroup paths to exist in order to
manage tasks. These paths are created when the agent first starts, but
if this process fails the agent would just log the error and proceed with its
initialization, despite not being able to run tasks.
This commit surfaces the errors back to the client initialization so the
process can stop early and make clear to operators that something went
wrong.
Support for fingerprinting the Consul admin partition was added in #19485. But
when the client fingerprints Consul CE, it gets a valid fingerprint and a
working Consul, yet emits a warn-level log. Return "ok" from the partition
extractor, but also ensure that we only add the Consul attribute if it
actually has a value.
Fixes: https://github.com/hashicorp/nomad/issues/19756