12850 Commits

Author SHA1 Message Date
Preetha Appan
be57b3e84d Switch back to using map[string]string for port map 2018-10-16 16:56:56 -07:00
Michael Schurter
4175e908dd fixup comments, logging, and missing method impls
from #4777 comments
2018-10-16 16:56:56 -07:00
Michael Schurter
7848acbea4 register drivers by default
Do not register mock_driver on release builds.
2018-10-16 16:56:56 -07:00
Michael Schurter
089bce5ab4 drivers/mock: complete plugin impl 2018-10-16 16:56:56 -07:00
Nick Ethier
9bd696e2de drivers/mock: start mock driver implementation 2018-10-16 16:56:56 -07:00
Michael Schurter
2256917936 Port client portion of #4392 to new taskrunner
PR #4392 was merged to master *after* allocrunnerv2 was branched, so the
client-specific portions must be ported from master to arv2.
2018-10-16 16:56:56 -07:00
Michael Schurter
a44e82f326 tr: implement dispatch payload hook
Now passing the TaskDir struct to prestart hooks instead of just the
root task dir itself as dispatch needs local/.
2018-10-16 16:56:56 -07:00
Preetha Appan
f13a0943a3 make port map a slice of maps to match existing rkt driver 2018-10-16 16:56:56 -07:00
Preetha Appan
3c6d6b9377 Review comments 2018-10-16 16:56:56 -07:00
Preetha Appan
04a2aad209 Stats collection test 2018-10-16 16:56:56 -07:00
Preetha Appan
0ebc3bdd2f RKT driver plugin and unit tests 2018-10-16 16:56:56 -07:00
Nick Ethier
19b222b127 client: log retry during driver fingerprint redispense 2018-10-16 16:56:56 -07:00
Nick Ethier
2e055fe18a client: add test for driver failure during fingerprinting 2018-10-16 16:56:56 -07:00
Nick Ethier
b751030765 rkt: start rkt driver plugin 2018-10-16 16:56:56 -07:00
Preetha Appan
53381035db Get raw exec tests compiling and passing again 2018-10-16 16:56:56 -07:00
Nick Ethier
993e045ff9 taskrunner: return error on waitCh 2018-10-16 16:56:56 -07:00
Nick Ethier
44cc52a0d4 client: simplify driver plugin logic from review comments 2018-10-16 16:56:56 -07:00
Nick Ethier
b016b2b5b0 plugin/driver: add Copy funcs 2018-10-16 16:56:56 -07:00
Nick Ethier
d68f2f0819 client: fix broken tests from refactoring 2018-10-16 16:56:56 -07:00
Nick Ethier
4f9522dd54 client: review comments and fixup/skip tests 2018-10-16 16:56:56 -07:00
Nick Ethier
dd3b2ef91c docklog: add go-plugin for forwarding of docker logs 2018-10-16 16:56:56 -07:00
Nick Ethier
ea9ed2282e client: refactor post allocrunnerv2 finalization 2018-10-16 16:56:56 -07:00
Nick Ethier
d335a82859 client: begin driver plugin integration
client: fingerprint driver plugins
2018-10-16 16:56:56 -07:00
Alex Dadgar
627e20801d Fix lints 2018-10-16 16:56:56 -07:00
Alex Dadgar
3c0b073513 compile on windows 2018-10-16 16:56:56 -07:00
Alex Dadgar
7b7cb382dc more test fixes 2018-10-16 16:56:56 -07:00
Alex Dadgar
3a492bb33f allocrunnerv2 -> allocrunner 2018-10-16 16:56:56 -07:00
Alex Dadgar
f91b269b2a fix test compiling 2018-10-16 16:56:55 -07:00
Alex Dadgar
31d49c72ab skip building deprecated files 2018-10-16 16:56:55 -07:00
Alex Dadgar
2e535aefcc move files around 2018-10-16 16:56:55 -07:00
Nick Ethier
3f7c14c7f5 drivers/shared: added func comment to eventer 2018-10-16 16:56:55 -07:00
Nick Ethier
db981de8e4 drivers/shared: move eventer to subpackage under drivers shared package 2018-10-16 16:56:55 -07:00
Nick Ethier
ca27a0254b drivers/utils: better handling of consumer cleanup in eventer 2018-10-16 16:56:55 -07:00
Nick Ethier
a8d50e83b1 plugins/drivers: remove bool to track if eventLoop shutdown and use context instead 2018-10-16 16:56:55 -07:00
Nick Ethier
207522be55 drivers/rawexec: PR comments and feedback 2018-10-16 16:56:55 -07:00
Nick Ethier
0d7bf53c57 plugin/drivers: rework eventer and change naming stream -> consumer 2018-10-16 16:56:55 -07:00
Michael Schurter
37387bbf6f tests: fix missing logger caused by bad merge 2018-10-16 16:56:55 -07:00
Michael Schurter
3b8da3065e tr: properly comment handle fields 2018-10-16 16:56:55 -07:00
Michael Schurter
8e9289676b ar: AllocState should not mutate ar.state
If ar.state.TaskStates has not been set, set it on the copy of ar.state.
That keeps ar.state manipulations in one location and allows AllocState
to only acquire read-locks.
2018-10-16 16:56:55 -07:00
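The idea in the commit above (fill in a missing TaskStates map on the returned copy, so the getter only needs a read lock) might look roughly like the sketch below. The `runner` and `runnerState` types are hypothetical stand-ins, not the actual structs from the repository.

```go
package main

import (
	"fmt"
	"sync"
)

// runnerState is a hypothetical stand-in for the alloc runner's state struct.
type runnerState struct {
	TaskStates map[string]string // task name -> status
}

// runner is a hypothetical stand-in for the alloc runner.
type runner struct {
	mu    sync.RWMutex
	state runnerState
}

// AllocState returns a copy of the state. If TaskStates was never set on the
// runner, an empty map is created on the copy only, so a read lock suffices.
func (r *runner) AllocState() runnerState {
	r.mu.RLock()
	defer r.mu.RUnlock()

	out := runnerState{TaskStates: make(map[string]string, len(r.state.TaskStates))}
	for name, status := range r.state.TaskStates {
		out.TaskStates[name] = status
	}
	return out
}

func main() {
	r := &runner{} // TaskStates is nil here
	snapshot := r.AllocState()
	snapshot.TaskStates["web"] = "pending" // writes only to the copy

	// The runner's own state was never touched by the getter or the caller.
	fmt.Println(r.state.TaskStates == nil)
}
```

Keeping every write to the runner's state behind the write lock in one place is what allows the getter to stay read-only.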
Michael Schurter
4d1a1ac5bb tests: test logs endpoint against pending task
Although the really exciting change is making WaitForRunning return the
allocations that it started. This should cut down test boilerplate
significantly.
2018-10-16 16:56:55 -07:00
Michael Schurter
01f057e35d tests: make a test client/config easier to generate
Sadly can't move the fingerprint timeout tweak into the helper due to
circular imports.
2018-10-16 16:56:55 -07:00
Michael Schurter
e495a0444b tests: ensure task state is initialized in NewAR
Also expose NoopDB for use in tests.
2018-10-16 16:56:55 -07:00
Michael Schurter
62e90cd2fa tests: test via ServeMux so http codes are set 2018-10-16 16:56:55 -07:00
Michael Schurter
d29d613c02 client: expose task state to client
The interesting decision in this commit was to expose AR's state and not
a fully materialized Allocation struct. AR.clientAlloc builds an Alloc
that contains the task state, so I considered simply memoizing and
exposing that method.

However, that would lead to AR having two awkwardly similar methods:
 - Alloc() - which returns the server-sent alloc
 - ClientAlloc() - which returns the fully materialized client alloc

Since ClientAlloc() could be memoized it would be just as cheap to call
as Alloc(), so why not replace Alloc() entirely?

Replacing Alloc() entirely would require Update() to immediately
materialize the task states on server-sent Allocs as there may have been
local task state changes since the server received an Alloc update.

This quickly becomes difficult to reason about: should Update hooks use
the TaskStates? Are state changes caused by TR Update hooks immediately
reflected in the Alloc? Should AR persist its copy of the Alloc? If so,
are its TaskStates canonical or the TaskStates on TR?

So! Forget that. Let's separate the static Allocation from the dynamic
AR & TR state!

 - AR.Alloc() is for static Allocation access (often for the Job)
 - AR.AllocState() is for the dynamic AR & TR runtime state (deployment
   status, task states, etc).

If code needs to know the status of a task: AllocState()
If code needs to know the names of tasks: Alloc()

It should be very easy for a developer to reason about which method they
should call and what they can do with the return values.
2018-10-16 16:56:55 -07:00
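As a rough illustration of the split described in the commit above, the sketch below keeps the static allocation and the dynamic runtime state behind two separate accessors. The `Allocation`, `State`, and `allocRunner` types and the `setTaskState` helper are simplified, hypothetical stand-ins; only the method names `Alloc()` and `AllocState()` come from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// Allocation is a hypothetical, trimmed-down server-sent allocation.
type Allocation struct {
	ID    string
	Tasks []string
}

// State is a hypothetical, trimmed-down runtime state.
type State struct {
	TaskStates map[string]string // task name -> status
}

type allocRunner struct {
	mu    sync.RWMutex
	alloc *Allocation
	state State
}

// Alloc returns the static allocation as last sent by the server.
func (ar *allocRunner) Alloc() *Allocation {
	ar.mu.RLock()
	defer ar.mu.RUnlock()
	return ar.alloc
}

// AllocState returns a snapshot of the dynamic runtime state.
func (ar *allocRunner) AllocState() State {
	ar.mu.RLock()
	defer ar.mu.RUnlock()
	out := State{TaskStates: make(map[string]string, len(ar.state.TaskStates))}
	for name, status := range ar.state.TaskStates {
		out.TaskStates[name] = status
	}
	return out
}

// setTaskState records a local task state change. The Allocation is not
// modified, so the two views never need to be reconciled.
func (ar *allocRunner) setTaskState(task, status string) {
	ar.mu.Lock()
	defer ar.mu.Unlock()
	ar.state.TaskStates[task] = status
}

func main() {
	ar := &allocRunner{
		alloc: &Allocation{ID: "a1", Tasks: []string{"web"}},
		state: State{TaskStates: map[string]string{"web": "pending"}},
	}
	ar.setTaskState("web", "running")

	fmt.Println(ar.Alloc().Tasks)                  // names of tasks: Alloc()
	fmt.Println(ar.AllocState().TaskStates["web"]) // status of a task: AllocState()
}
```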
Michael Schurter
737b1d82d2 client: add comment 2018-10-16 16:56:55 -07:00
Michael Schurter
99e2953e23 client: fix potentially dropped streaming errors 2018-10-16 16:56:55 -07:00
Michael Schurter
981acf3f95 tr: remove unneeded lock; chan synchronizes access 2018-10-16 16:56:55 -07:00
Michael Schurter
334f2b496e tests: fix races caused by sharing a buffer
httptest.ResponseRecorder exposes a bytes.Buffer which we were reading
and writing concurrently to test streaming log APIs. This is a race, so
I wrapped the struct in a lock with some helpers.
2018-10-16 16:56:55 -07:00
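A minimal sketch of the kind of wrapper the commit above describes, assuming a plain mutex around `httptest.ResponseRecorder`. The `lockedRecorder` type and its helper names are hypothetical and are not taken from the repository.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// lockedRecorder is a hypothetical wrapper that serializes access to the
// recorder's underlying bytes.Buffer.
type lockedRecorder struct {
	mu  sync.Mutex
	rec *httptest.ResponseRecorder
}

func newLockedRecorder() *lockedRecorder {
	return &lockedRecorder{rec: httptest.NewRecorder()}
}

// Header, Write, and WriteHeader satisfy http.ResponseWriter.
func (l *lockedRecorder) Header() http.Header {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.rec.Header()
}

func (l *lockedRecorder) Write(p []byte) (int, error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.rec.Write(p)
}

func (l *lockedRecorder) WriteHeader(code int) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.rec.WriteHeader(code)
}

// BodyString returns a copy of everything written so far.
func (l *lockedRecorder) BodyString() string {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.rec.Body.String()
}

func main() {
	rec := newLockedRecorder()
	var w http.ResponseWriter = rec

	done := make(chan struct{})
	go func() { // stands in for a streaming handler
		defer close(done)
		for i := 0; i < 3; i++ {
			fmt.Fprintf(w, "line %d\n", i)
		}
	}()

	_ = rec.BodyString() // reading while the writer runs is now synchronized
	<-done
	fmt.Print(rec.BodyString())
}
```

Running a test like this under `go test -race` is the usual way to confirm the concurrent read and write no longer trip the race detector.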
Michael Schurter
9f64add14c tr: fix shutdown/destroy/WaitResult handling
Multiple receivers raced for the WaitResult when killing tasks which
could lead to a deadlock if the "wrong" receiver won.

Wrap handlers in an ugly little proxy to avoid this. At first I wanted
to push this into drivers, but the result is tied to the TR's handle
lifecycle -- not the lifecycle of an alloc or task.
2018-10-16 16:56:55 -07:00
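One way to avoid several receivers racing for a single result, sketched under the assumption that the driver handle delivers exactly one value on a channel: a small proxy becomes the only receiver, stores the result, and closes a second channel to release every waiter. The `waitResult` and `resultProxy` names are hypothetical, and this is not necessarily how the commit above implements its proxy.

```go
package main

import (
	"fmt"
	"sync"
)

// waitResult is a hypothetical stand-in for a driver's exit result.
type waitResult struct {
	ExitCode int
}

// resultProxy receives the single result from a handle's wait channel and
// makes it readable by any number of callers.
type resultProxy struct {
	done   chan struct{}
	result waitResult
}

func newResultProxy(waitCh <-chan waitResult) *resultProxy {
	p := &resultProxy{done: make(chan struct{})}
	go func() {
		p.result = <-waitCh // the only receiver on waitCh
		close(p.done)       // closing releases every current and future waiter
	}()
	return p
}

// Wait blocks until the result is available. It is safe to call from many
// goroutines, and every caller sees the same result.
func (p *resultProxy) Wait() waitResult {
	<-p.done
	return p.result
}

func main() {
	waitCh := make(chan waitResult, 1)
	proxy := newResultProxy(waitCh)

	var wg sync.WaitGroup
	codes := make([]int, 3)
	for i := range codes {
		wg.Add(1)
		go func(i int) { // e.g. shutdown, destroy, and the run loop all waiting
			defer wg.Done()
			codes[i] = proxy.Wait().ExitCode
		}(i)
	}

	waitCh <- waitResult{ExitCode: 137}
	wg.Wait()
	fmt.Println(codes) // every waiter observed the same exit code
}
```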
Michael Schurter
13f47aa521 client: do not inspect task state to follow logs
"Ask forgiveness, not permission."

Instead of peeking at TaskStates (which are no longer updated on the
AR.Alloc() view of the world) to only read logs for running tasks, just
try to read the logs and improve the error handling if they don't exist.

This should make log streaming less dependent on AR/TR behavior.

Also fixed a race where the log streamer could exit before reading an
error. This caused no logs or errors to be displayed sometimes when an
error occurred.
2018-10-16 16:56:55 -07:00