Commit Graph

2026 Commits

Author SHA1 Message Date
Preetha Appan
4c97c5ea0c Comments 2018-11-08 09:48:43 -06:00
Preetha Appan
6bb8e5aa58 Show preemption output in plan CLI 2018-11-08 09:48:43 -06:00
Alex Dadgar
57f40c7e3e Device manager
Introduce a device manager that manages the lifecycle of device plugins
on the client. It fingerprints, collects stats, and forwards Reserve
requests to the correct plugin. The manager, also handles device plugins
failing and validates their output.
2018-11-07 10:43:15 -08:00
Michael Schurter
8122c76cd6 Merge pull request #4828 from hashicorp/b-restore
Implement client agent restarting
2018-11-05 18:50:15 -06:00
Michael Schurter
d2e48e35c0 tests: get consul integration tests building 2018-11-05 12:32:05 -08:00
Preetha
6aa0c7ffa9 Merge pull request #4794 from hashicorp/f-preemption-systemjobs
Preemption for system jobs
2018-11-02 16:28:06 -05:00
Preetha Appan
4a35d62887 Fix formatting of allocation score metrics 2018-10-30 12:03:23 -05:00
Preetha Appan
f2b027797b Fix return type in tests after refactor 2018-10-30 11:10:46 -05:00
Preetha Appan
88005852e3 Introduce a response object for scheduler configuration 2018-10-30 11:06:32 -05:00
Preetha Appan
6966e3c3e8 Make preemption config a struct to allow for enabling based on scheduler type 2018-10-30 11:06:32 -05:00
Preetha Appan
784b96c104 Support for new scheduler config API, first use case is to disable preemption 2018-10-30 11:06:32 -05:00
Michael Schurter
0b4e15c366 tests: more fixes due to api changes 2018-10-29 15:25:22 -07:00
Michael Schurter
2361c1904b tests: get tests building if not yet passing 2018-10-16 16:56:57 -07:00
Michael Schurter
7848acbea4 register drivers by default
Do not register mock_driver on release builds.
2018-10-16 16:56:56 -07:00
Nick Ethier
4f9522dd54 client: review comments and fixup/skip tests 2018-10-16 16:56:56 -07:00
Nick Ethier
ea9ed2282e client: refactor post allocrunnerv2 finalization 2018-10-16 16:56:56 -07:00
Nick Ethier
d335a82859 client: begin driver plugin integration
client: fingerprint driver plugins
2018-10-16 16:56:56 -07:00
Alex Dadgar
627e20801d Fix lints 2018-10-16 16:56:56 -07:00
Alex Dadgar
3a492bb33f allocrunnerv2 -> allocrunner 2018-10-16 16:56:56 -07:00
Alex Dadgar
2e535aefcc move files around 2018-10-16 16:56:55 -07:00
Michael Schurter
4d1a1ac5bb tests: test logs endpoint against pending task
Although the really exciting change is making WaitForRunning return the
allocations that it started. This should cut down test boilerplate
significantly.
2018-10-16 16:56:55 -07:00
Michael Schurter
62e90cd2fa tests: test via ServeMux so http codes are set 2018-10-16 16:56:55 -07:00
Michael Schurter
d29d613c02 client: expose task state to client
The interesting decision in this commit was to expose AR's state and not
a fully materialized Allocation struct. AR.clientAlloc builds an Alloc
that contains the task state, so I considered simply memoizing and
exposing that method.

However, that would lead to AR having two awkwardly similar methods:
 - Alloc() - which returns the server-sent alloc
 - ClientAlloc() - which returns the fully materialized client alloc

Since ClientAlloc() could be memoized it would be just as cheap to call
as Alloc(), so why not replace Alloc() entirely?

Replacing Alloc() entirely would require Update() to immediately
materialize the task states on server-sent Allocs as there may have been
local task state changes since the server received an Alloc update.

This quickly becomes difficult to reason about: should Update hooks use
the TaskStates? Are state changes caused by TR Update hooks immediately
reflected in the Alloc? Should AR persist its copy of the Alloc? If so,
are its TaskStates canonical or the TaskStates on TR?

So! Forget that. Let's separate the static Allocation from the dynamic
AR & TR state!

 - AR.Alloc() is for static Allocation access (often for the Job)
 - AR.AllocState() is for the dynamic AR & TR runtime state (deployment
   status, task states, etc).

If code needs to know the status of a task: AllocState()
If code needs to know the names of tasks: Alloc()

It should be very easy for a developer to reason about which method they
should call and what they can do with the return values.
2018-10-16 16:56:55 -07:00
Michael Schurter
334f2b496e tests: fix races caused by sharing a buffer
httptest.ResponseRecorder exposes a bytes.Buffer which we were reading
and writing concurrently to test streaming log APIs. This is a race, so
I wrapped the struct in a lock with some helpers.
2018-10-16 16:56:55 -07:00
Alex Dadgar
14cc4f7337 extra logging 2018-10-16 16:56:55 -07:00
Alex Dadgar
e2553a13d4 Fix client reloading and pass the plugin loaders to server and client 2018-10-16 16:56:55 -07:00
Alex Dadgar
7882ae4a1f Plugin loader initialization 2018-10-16 16:54:12 -07:00
Nick Ethier
5b14d24bf4 executor v2 (#4656)
* client/executor: refactor client to remove interpolation

* executor: POC libcontainer based executor

* vendor: use hashicorp libcontainer fork

* vendor: add libcontainer/nsenter dep

* executor: updated executor interface to simplify operations

* executor: implement logging pipe

* logmon: new logmon plugin to manage task logs

* driver/executor: use logmon for log management

* executor: fix tests and windows build

* executor: fix logging key names

* executor: fix test failures

* executor: add config field to toggle between using libcontainer and standard executors

* logmon: use discover utility to discover nomad executable

* executor: only call libcontainer-shim on main in linux

* logmon: use seperate path configs for stdout/stderr fifos

* executor: windows fixes

* executor: created reusable pid stats collection utility that can be used in an executor

* executor: update fifo.Open calls

* executor: fix build

* remove executor from docker driver

* executor: Shutdown func to kill and cleanup executor and its children

* executor: move linux specific universal executor funcs to seperate file

* move logmon initialization to a task runner hook

* client: doc fixes and renaming from code review


* taskrunner: use shared config struct for logmon fifo fields

* taskrunner: logmon only needs to be started once per task
2018-10-16 16:53:31 -07:00
Michael Schurter
76194c7414 consul service hook
Deregistration works but difficult to test due to terminal updates not
being fully implemented in the new client/ar/tr.
2018-10-16 16:53:29 -07:00
Alex Dadgar
5e67b37aad use int64 2018-10-16 15:34:32 -07:00
Preetha Appan
3ca71ae935 Change CPU/Disk/MemoryMB to int everywhere in new resource structs 2018-10-16 16:21:42 -05:00
Alex Dadgar
e9ddf2c533 parse affinities and constraints on devices 2018-10-11 14:05:19 -07:00
Alex Dadgar
e47ddbdd9f parse devices 2018-10-08 16:09:41 -07:00
Alex Dadgar
8b9b77dd59 Define device request structs 2018-10-08 15:38:03 -07:00
Alex Dadgar
e30b20e65e renames 2018-10-04 14:57:25 -07:00
Alex Dadgar
0f2f4797cb fixing tests 2018-10-04 14:26:19 -07:00
Alex Dadgar
49c2d4f775 Scheduler uses allocated resources 2018-10-02 17:08:25 -07:00
Alex Dadgar
f969298854 Node reserved resources 2018-09-29 18:44:55 -07:00
Alex Dadgar
9688161a54 Fix autopilot set enable custom upgrades flag 2018-09-25 13:49:35 -07:00
Alex Dadgar
32f9da9e07 small fixes 2018-09-15 16:42:38 -07:00
Alex Dadgar
260b566c91 server 2018-09-15 16:23:13 -07:00
Alex Dadgar
40d095fd1a agent + consul 2018-09-13 10:43:40 -07:00
Alex Dadgar
45a88acf0a Merge pull request #4631 from hashicorp/f-plugin-config
Parse plugin configs
2018-09-04 17:04:13 -07:00
Alex Dadgar
a5a26db04a Merge pull request #4642 from hashicorp/b-vet
Fix vet errors and use newer go version in travis
2018-09-04 17:04:02 -07:00
Alex Dadgar
da0bec03c1 Fix make check errors 2018-09-04 16:03:52 -07:00
Preetha Appan
b8facd8756 Fix linting 2018-09-04 16:10:11 -05:00
Preetha Appan
d40338c8b9 Move topk and delay heap to separate packages under lib 2018-09-04 16:10:11 -05:00
Preetha Appan
f6cbfbfef6 Track top k nodes by norm score rather than top k nodes per scorer 2018-09-04 16:10:11 -05:00
Preetha Appan
aa2b632a44 Fix linting 2018-09-04 16:10:11 -05:00
Preetha Appan
4d68d935e4 Use heap to store top K scoring nodes.
Scoring metadata is now aggregated by scorer type to make it easier
to parse when reading it in the CLI.
2018-09-04 16:10:11 -05:00