19 Commits

Author SHA1 Message Date
James Rasell
5989d5862a ci: Update golangci-lint to v2 and fix highlighted issues. (#26334) 2025-07-25 10:44:08 +01:00
Martijn Vegter
736103aa54 client: fix JSON formatted logs when failing to reserve cores (#25523)
Fixed a bug where JSON formatted logs would not show the requested and overlapping 
cores when failing to reserve cores
2025-03-27 08:52:32 -04:00
Martijn Vegter
997da25cdb scheduler: take all assigned cpu cores into account instead of only those part of the largest lifecycle (#24304)
Fixes a bug in the AllocatedResources.Comparable method, where the scheduler
would only take into account the cpusets of the tasks in the largest lifecycle.
This could result in overlapping cgroup cpusets. Now we make the distinction
between reserved and fungible resources throughout the lifespan of the alloc.
In addition, added logging in case of future regressions thus not requiring
manual inspection of cgroup files.
2024-11-21 13:21:48 -05:00
Seth Hoenig
51215bf102 deps: update to go-set/v3 and refactor to use custom iterators (#23971)
* deps: update to go-set/v3

* deps: use custom set iterators for looping
2024-09-16 13:40:10 -05:00
Juana De La Cuesta
9c5f962940 Update client/lib/cgroupslib/partition_linux.go
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2024-09-06 10:56:47 +02:00
Juana De La Cuesta
426c225dc2 Update client/lib/cgroupslib/partition_linux.go
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2024-09-06 10:56:41 +02:00
Juana De La Cuesta
8e6d85b66f Update client/lib/cgroupslib/partition_linux.go
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2024-09-06 10:56:36 +02:00
Juanadelacuesta
a65d05ff51 fix: keep a register of the usable cores to avoid using more than that 2024-09-05 17:02:54 +02:00
Tim Gross
682c8c0c81 cgroupslib: allow initial controller check with delegated cgroups v2 (#23803)
During Nomad client initialization with cgroups v2, we assert that the required
cgroup controllers are available in the root `cgroup.subtree_control` file by
idempotently writing to the file. But if Nomad is running with delegated
cgroups, this will fail file permissions checks even if the subtree control file
already has the controllers we need.

Update the initialization to first check if the controllers are missing before
attempting to write to them. This allows cgroup delegation so long as the
cluster administrator has pre-created a Nomad owned cgroups tree and set the
`Delegate` option in a systemd override. If not, initialization fails in the
existing way.

Although this is one small step along the way to supporting a rootless Nomad
client, running Nomad as non-root is still unsupported. I've intentionally not
documented setting up cgroup delegation in this PR, as this PR is insufficient
by itself to have a secure and properly-working rootless Nomad client.

Ref: https://github.com/hashicorp/nomad/issues/18211
Ref: https://github.com/hashicorp/nomad/issues/13669
2024-08-14 16:58:21 -04:00
Juana De La Cuesta
169818b1bd [gh-6980] Client: clean up old allocs before running new ones using the exec task driver. (#20500)
Whenever the "exec" task driver is being used, nomad runs a plug in that in time runs the task on a container under the hood. If by any circumstance the executor is killed, the task is reparented to the init service and wont be stopped by Nomad in case of a job updated or stop.

This commit introduces two mechanisms to avoid this behaviour:

* Adds signal catching and handling to the executor, so in case of a SIGTERM, the signal will also be passed on to the task.
* Adds a pre start clean up of the processes in the container, ensuring only the ones the executor runs are present at any given time.
2024-05-14 09:51:27 +02:00
Seth Hoenig
14a022cbc0 drivers/raw_exec: enable setting cgroup override values (#20481)
* drivers/raw_exec: enable setting cgroup override values

This PR enables configuration of cgroup override values on the `raw_exec`
task driver. WARNING: setting cgroup override values eliminates any
gauruntee Nomad can make about resource availability for *any* task on
the client node.

For cgroup v2 systems, set a single unified cgroup path using `cgroup_v2_override`.
The path may be either absolute or relative to the cgroup root.

config {
  cgroup_v2_override = "custom.slice/app.scope"
}

or

config {
  cgroup_v2_override = "/sys/fs/cgroup/custom.slice/app.scope"
}

For cgroup v1 systems, set a per-controller path for each controller using
`cgroup_v1_override`. The path(s) may be either absolute or relative to
the controller root.

config {
  cgroup_v1_override = {
    "pids": "custom/app",
    "cpuset": "custom/app",
  }
}

or

config {
  cgroup_v1_override = {
    "pids": "/sys/fs/cgroup/pids/custom/app",
    "cpuset": "/sys/fs/cgroup/cpuset/custom/app",
  }
}

* drivers/rawexec: ensure only one of v1/v2 cgroup override is set

* drivers/raw_exec: executor should error if setting cgroup does not work

* drivers/raw_exec: create cgroups in raw_exec tests

* drivers/raw_exec: ensure we fail to start if custom cgroup set and non-root

* move custom cgroup func into shared file

---------

Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2024-05-07 16:46:27 -07:00
Luiz Aoqui
db5ffde2b7 client: prevent start on cgroups init error (#19915)
The Nomad client expects certain cgroups paths to exist in order to
manage tasks. These paths are created when the agent first starts, but
if process fails the agent would just log the error and proceed with its
initialization, despite not being able to run tasks.

This commit surfaces the errors back to the client initialization so the
process can stop early and make clear to operators that something went
wrong.
2024-02-09 13:45:29 -05:00
David Ventura
fb43b14fb0 Mark CGroups as off when missing essential controllers (#19176) 2023-12-15 11:20:52 -05:00
Seth Hoenig
8ed82416e3 client: fix detection of cpuset.mems on cgroups v1 systems (#18868) 2023-10-26 09:42:10 -05:00
Seth Hoenig
e3c8700ded deps: upgrade to go-set/v2 (#18638)
No functional changes, just cleaning up deprecated usages that are
removed in v2 and replace one call of .Slice with .ForEach to avoid
making the intermediate copy.
2023-10-05 11:56:17 -05:00
Seth Hoenig
2e1974a574 client: refactor cpuset partitioning (#18371)
* client: refactor cpuset partitioning

This PR updates the way Nomad client manages the split between tasks
that make use of resources.cpus vs. resources.cores.

Previously, each task was explicitly assigned which CPU cores they were
able to run on. Every time a task was started or destroyed, all other
tasks' cpusets would need to be updated. This was inefficient and would
crush the Linux kernel when a client would try to run ~400 or so tasks.

Now, we make use of cgroup heirarchy and cpuset inheritence to efficiently
manage cpusets.

* cr: tweaks for feedback
2023-09-12 09:11:11 -05:00
Piotr Kazmierczak
53ef6391a5 drivers/docker: fix a hostConfigMemorySwappiness panic (#18238)
cgroupslib.MaybeDisableMemorySwappiness returned an incorrect type, and was
incorrectly typecast to int64 causing a panic on non-linux and non-windows hosts.
2023-08-17 14:45:31 +02:00
hashicorp-copywrite[bot]
2d35e32ec9 Update copyright file headers to BUSL-1.1 2023-08-10 17:27:15 -05:00
Seth Hoenig
a4cc76bd3e numa: enable numa topology detection (#18146)
* client: refactor cgroups management in client

* client: fingerprint numa topology

* client: plumb numa and cgroups changes to drivers

* client: cleanup task resource accounting

* client: numa client and config plumbing

* lib: add a stack implementation

* tools: remove ec2info tool

* plugins: fixup testing for cgroups / numa changes

* build: update makefile and package tests and cl
2023-08-10 17:05:30 -05:00