nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-03 00:45:43 +03:00

Author	SHA1	Message	Date
Seth Hoenig	1d2e2c0d3c	raw_exec: fixup review comments	2022-04-05 15:21:28 -05:00
Seth Hoenig	be7ec8de3e	raw_exec: make raw exec driver work with cgroups v2 This PR adds support for the raw_exec driver on systems with only cgroups v2. The raw exec driver is able to use cgroups to manage processes. This happens only on Linux, when exec_driver is enabled, and the no_cgroups option is not set. The driver uses the freezer controller to freeze processes of a task, issue a sigkill, then unfreeze. Previously the implementation assumed cgroups v1, and now it also supports cgroups v2. There is a bit of refactoring in this PR, but the fundamental design remains the same. Closes #12351 #12348	2022-04-04 16:11:38 -05:00
Seth Hoenig	c27af79add	client: cgroups v2 code review followup	2022-03-24 13:40:42 -05:00
Seth Hoenig	5da1a31e94	client: enable support for cgroups v2 This PR introduces support for using Nomad on systems with cgroups v2 [1] enabled as the cgroups controller mounted on /sys/fs/cgroup. Newer Linux distros like Ubuntu 21.10 are shipping with cgroups v2 only, causing problems for Nomad users. Nomad mostly "just works" with cgroups v2 due to the indirection via libcontainer, but not so for managing cpuset cgroups. Before, Nomad has been making use of a feature in v1 where a PID could be a member of more than one cgroup. In v2 this is no longer possible, and so the logic around computing cpuset values must be modified. When Nomad detects v2, it manages cpuset values in-process, rather than making use of cgroup hierarchy inheritance via shared/reserved parents. Nomad will only activate the v2 logic when it detects cgroups2 is mounted at /sys/fs/cgroup. This means on systems running in hybrid mode with cgroups2 mounted at /sys/fs/cgroup/unified (as is typical) Nomad will continue to use the v1 logic, and should operate as before. Systems that do not support cgroups v2 are also not affected. When v2 is activated, Nomad will create a parent called nomad.slice (unless otherwise configured in Client config), and create cgroups for tasks using naming convention <allocID>-<task>.scope. These follow the naming convention set by systemd and also used by Docker when cgroups v2 is detected. Client nodes now export a new fingerprint attribute, unique.cgroups.version which will be set to 'v1' or 'v2' to indicate the cgroups regime in use by Nomad. The new cpuset management strategy fixes #11705, where docker tasks that spawned processes on startup would "leak". In cgroups v2, the PIDs are started in the cgroup they will always live in, and thus the cause of the leak is eliminated. [1] https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html Closes #11289 Fixes #11705 #11773 #11933	2022-03-23 11:35:27 -05:00
Seth Hoenig	87d54b8c21	client: change test to not poke cgroupv2 edge case This PR tweaks the TestCpusetManager_AddAlloc unit test to not break when being run on a machine using cgroupsv2. The behavior of writing an empty cpuset.cpus changes in cgroupv2, where such a group now inherits the value of its parent group, rather than remaining empty. The test in question was written such that a task would consume all available cores shared on an alloc, causing the empty set to be written to the shared group, which works fine on cgroupsv1 but breaks on cgroupsv2. By adjusting the test to consume only 1 core instead of all cores, it no longer triggers that edge case. The actual fix for the new cgroupsv2 behavior will be in #11933	2022-01-27 08:27:40 -06:00
Mahmood Ali	6c414cd5f9	gofmt all the files mostly to handle build directives in 1.17.	2021-10-01 10:14:28 -04:00
Nick Ethier	e0a599ed9c	nit: code cleanup/organization	2021-04-16 15:14:29 -04:00
Nick Ethier	e834a60de1	plugins/drivers: fix deprecated fields	2021-04-16 14:13:29 -04:00
Nick Ethier	1925f6b893	cgutil: set reserved mems on init even if already exist	2021-04-15 10:24:31 -04:00
Nick Ethier	4a25ec9410	testing fixes	2021-04-14 10:17:28 -04:00
Nick Ethier	355212c30c	cgutil: add nil check on AddAlloc	2021-04-13 13:28:36 -04:00
Nick Ethier	f897ac79e8	client/ar: thread through cpuset manager	2021-04-13 13:28:36 -04:00
Nick Ethier	5e7b411dec	cgutil: implement cpuset management as separate package	2021-04-13 13:28:36 -04:00
Nick Ethier	84e44d53d0	Apply suggestions from code review Co-authored-by: Drew Bailey <drewbailey5@gmail.com>	2021-04-13 13:28:15 -04:00
Nick Ethier	cd8fb2d3e3	cgutil: fix lint errors	2021-04-13 13:28:15 -04:00
Nick Ethier	b8397a712d	fingerprint: implement client fingerprinting of reservable cores. On Linux systems this is derived from the configured cpuset cgroup parent (defaults to /nomad). For non-Linux systems and Linux systems where cgroups are not enabled, the client defaults to using all cores.	2021-04-13 13:28:15 -04:00
Nick Ethier	dc08ec8783	ar: plumb client config for networking into the network hook	2019-07-31 01:04:06 -04:00
Nick Ethier	e15005bdcb	networking: Add new bridge networking mode implementation	2019-07-31 01:04:06 -04:00
Nick Ethier	56d5fe704a	ar: rearrange network hook to support building on windows	2019-07-31 01:03:19 -04:00
Danielle Lancashire	c712fdcbd9	fifo: Safer access to Conn	2019-07-02 13:12:54 +02:00
Danielle Lancashire	8148466da6	fifo: Close connections and cleanup lock handling	2019-07-01 14:14:29 +02:00
Danielle Lancashire	aff554deec	appveyor: Run logmon tests	2019-06-28 16:01:41 +02:00
Danielle Lancashire	e6daf3b5bd	fifo: Require that fifos do not exist for create Although this operation is safe on linux, it is not safe on Windows when using the named pipe interface. To provide a ~reasonable common api abstraction, here we switch to returning File exists errors on the unix api.	2019-06-28 13:47:18 +02:00
Danielle Lancashire	76f72fe4bd	vendor: Use dani fork of go-winio	2019-06-28 13:47:18 +02:00
Danielle Lancashire	efda81cbbb	logmon: Refactor fifo access for windows safety On unix platforms, it is safe to re-open fifos for reading after the first creation if the file is already a fifo, however this is not possible on windows where this triggers a permissions error on the socket path, as you cannot recreate it. We can't transparently handle this in the CreateAndRead handle, because the Access Is Denied error is too generic to reliably be an IO error. Instead, we add an explicit API for opening a reader to an existing FIFO, and check to see if the fifo already exists inside the calling package (e.g. logmon)	2019-06-28 13:41:54 +02:00
Mahmood Ali	ea2f96e585	tests: fix fifo lib race Accidentally accessed outer `err` variable inside a goroutine	2019-05-21 09:49:56 -04:00
Mahmood Ali	714c41185c	rename fifo methods for clarity	2019-04-01 16:52:58 -04:00
Mahmood Ali	3c68c946c4	no requires in a test goroutine	2019-04-01 15:38:39 -04:00
Mahmood Ali	5ca9b6eb37	fifo: Use plain fifo file in Unix This PR switches to using plain fifo files instead of golang structs managed by containerd/fifo library. The library's main benefit is managing the opening of fifo files. In Linux, a reader `open()` request would block until a writer opens the file (and vice-versa). The library uses goroutines so that it's the first IO operation that blocks. This benefit isn't really useful for us: Given that logmon simply streams output in a separate process, blocking of opening or first read is effectively the same. The library also adds complications around managing state and tracking read/write permissions that seem like overhead for our use, compared to using a file directly. While here, I made the following incidental changes: * document that we do handle fifo files that are already created, as we rely on that behavior for logmon restarts * use type system to lock read vs write: currently, fifo library returns `io.ReadWriteCloser` even if fifo is opened for writing only!	2019-04-01 13:18:03 -04:00
Nick Ethier	a203689bbc	fifo: add new fifo package for named pipes (#4665)	2018-10-16 16:53:30 -07:00
Michael Schurter	6858c520b2	framer: fix early exit/truncation in framer	2018-05-02 10:46:16 -07:00
Michael Schurter	361db269c2	framer: fix race and remove unused error var In the old code `sending` in the `send()` method shared the Data slice's underlying backing array with its caller. Clearing StreamFrame.Data didn't break the reference from the sent frame to the StreamFramer's data slice.	2018-05-02 10:46:16 -07:00
Alex Dadgar	b0d0359b59	clarify force	2018-02-15 13:59:02 -08:00
Alex Dadgar	d77b36698c	HTTP and tests	2018-02-15 13:59:02 -08:00
Alex Dadgar	9d479f3d80	test stream framer	2018-02-15 13:59:01 -08:00
Alex Dadgar	5e7a1a44a2	Logs over RPC w/ lots to touch up	2018-02-15 13:59:01 -08:00
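
The race described in commit 361db269c2 comes down to two slices sharing one backing array. The following is a minimal Go sketch of that general hazard, not the framer code itself: the `frame` type and the `aliasFrame` and `copyFrame` helpers are hypothetical names introduced only for illustration.

```go
package main

import "fmt"

// frame is a hypothetical stand-in for a stream frame carrying a payload.
type frame struct {
	Data []byte
}

// aliasFrame returns a frame whose Data shares buf's backing array.
func aliasFrame(buf []byte) frame {
	return frame{Data: buf}
}

// copyFrame returns a frame whose Data is an independent copy of buf.
func copyFrame(buf []byte) frame {
	data := make([]byte, len(buf))
	copy(data, buf)
	return frame{Data: data}
}

func main() {
	buf := []byte("hello")
	aliased := aliasFrame(buf)
	copied := copyFrame(buf)

	// Reuse the buffer in place, as a producer might for its next batch.
	buf = append(buf[:0], "WORLD"...)

	fmt.Println(string(aliased.Data)) // WORLD: the frame saw the later write
	fmt.Println(string(copied.Data))  // hello: the copy is unaffected
}
```

Copying costs an allocation per frame, but it means a frame handed to another goroutine can no longer be changed by later writes to the producer's buffer.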

36 Commits
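
Commit 5da1a31e94 describes the cgroups v2 naming convention: a parent called nomad.slice (unless otherwise configured) with one `<allocID>-<task>.scope` cgroup per task. The sketch below only illustrates how such a name could be assembled. The `scopeName` and `scopePath` helpers and the sample allocation ID are hypothetical and are not taken from Nomad's code.

```go
package main

import (
	"fmt"
	"path"
)

// defaultParent is the parent cgroup name given in the commit message.
const defaultParent = "nomad.slice"

// scopeName builds the per-task cgroup name, <allocID>-<task>.scope.
func scopeName(allocID, task string) string {
	return fmt.Sprintf("%s-%s.scope", allocID, task)
}

// scopePath joins the parent and scope name into a path relative to the
// cgroup mount point. An empty parent falls back to defaultParent.
func scopePath(parent, allocID, task string) string {
	if parent == "" {
		parent = defaultParent
	}
	return path.Join(parent, scopeName(allocID, task))
}

func main() {
	// The allocation ID here is made up for the example.
	fmt.Println(scopePath("", "8f2c1a", "web")) // nomad.slice/8f2c1a-web.scope
}
```

The path is kept relative so the example makes no assumption about where the cgroup filesystem is mounted.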