Commit Graph

3358 Commits

Author SHA1 Message Date
Nick Ethier
6352052211 client/plugin: remove println from plugin group func 2018-11-27 22:45:09 -05:00
Nick Ethier
6ead1a8f72 client/plugin: lint/spelling errors 2018-11-27 22:45:09 -05:00
Nick Ethier
5b11495b31 client/plugin: add generic plugin mananger interface and orchestration 2018-11-27 22:45:03 -05:00
Mahmood Ali
90daa8b2d3 Fixes in old lxc driver 2018-11-27 21:40:43 -05:00
Michael Schurter
e660046f06 Merge pull request #4896 from hashicorp/b-prevalloc-deadlock
Fix deadlock in previous alloc watcher by emitting last alloc update
2018-11-27 19:07:16 -06:00
Michael Schurter
3a61c2bb37 fix test breakage caused by rebase 2018-11-27 16:34:01 -08:00
Michael Schurter
518d03d7cd fix mispelings 2018-11-27 16:33:55 -08:00
Chris Baker
e3108507c5 Merge pull request #4891 from hashicorp/b-1150-rkt-volume-names
drivers/rkt: fix invalid volumes
2018-11-27 18:55:00 -05:00
Danielle Tomlinson
d361015562 Merge pull request #4909 from hashicorp/b-restart-delay
taskrunner: Return the restart delay correctly
2018-11-27 23:55:54 +01:00
Michael Schurter
748448970b client: comment on importance of chan ops ordering 2018-11-27 14:11:32 -08:00
Mahmood Ali
9e1d51659b Update client/structs/broadcaster.go
Co-Authored-By: schmichael <michael.schurter@gmail.com>
2018-11-27 14:06:08 -08:00
Michael Schurter
ff7df70f32 client: fix send-after-close in broadcaster 2018-11-27 14:06:08 -08:00
Michael Schurter
17a611bf98 client: check if prev alloc is already terminated
This is a defensive fast-path as 7c6aa0be already fixed the deadlock.
2018-11-27 14:06:08 -08:00
Michael Schurter
6d49163b12 client: emit last sent alloc to new listeners
Fixes a deadlock where the allocwatcher would block forever waiting for
an update from a terminal alloc.

Made the broadcaster easier to debug as well.
2018-11-27 14:06:08 -08:00
Michael Schurter
5d6d4bf290 Merge pull request #4883 from hashicorp/f-graceful-shutdown
Support graceful shutdowns in agent
2018-11-27 15:55:15 -06:00
Michael Schurter
f06b2a872d client: fix races in use of goroutine group
The group utility struct does not support asynchronously launched
goroutines (goroutines-inside-of-goroutines), so switch those uses to a
normal go call.

This means watchNodeUpdates and watchNodeEvents may not be shutdown when
Shutdown() exits. During nomad agent shutdown this does not matter.

During tests this means a test may leak those goroutines or be unable to
know when those goroutines have exited.

Since there's no runtime impact and these goroutines do not affect alloc
state syncing it seems ok to risk leaking them.
2018-11-26 12:52:55 -08:00
Michael Schurter
0efa2d24e4 client: reuse group instead of diy'ing it 2018-11-26 12:52:31 -08:00
Michael Schurter
134c04744e client/ar: remove useless wait ch from runTasks
Arguably this makes task.WaitCh() useless, but I think exposing a wait
chan from TaskRunners is a generically useful API.
2018-11-26 12:51:18 -08:00
Michael Schurter
021c0cc4bf client: document how AR/TR Run methods behave 2018-11-26 12:50:35 -08:00
Chris Baker
790fe0b1db modified TaskConfig to include AllocID
use this for volume names in drivers/rkt to address #1150
2018-11-26 18:54:26 +00:00
Nick Ethier
66674116dc Merge pull request #4844 from hashicorp/f-docker-plugin
Docker driver plugin
2018-11-20 20:43:03 -05:00
Mahmood Ali
0fc84f4cfb address review comments 2018-11-20 17:10:54 -05:00
Mahmood Ali
88c1698ef5 Emit metric counters for Vault token and renewal failures 2018-11-20 17:10:54 -05:00
Mahmood Ali
feaf6214f9 Set User-Agent header when hitting Vault API 2018-11-20 17:10:54 -05:00
Danielle Tomlinson
d73c2f3c8d taskrunner: Return the restart delay correctly
We were incorrectly returning a 0 duration to the taskrunner when
determining when a task should restart. This would cause tasks to be
restarted immediately, ignoring the restart {} stanza in a users
configuration.

This commit causes us to return the restart duration to the task runner
so it may correctly delay further execution.
2018-11-20 21:52:23 +01:00
Nick Ethier
6dfa2309c4 task_runner: use NodeResources instead of deprecated struct 2018-11-20 13:46:39 -05:00
Nick Ethier
4dfcc80a0b task_runner: use task and alloc copies instead of referencing the original pointer 2018-11-20 13:34:46 -05:00
Nick Ethier
cf69895d31 task_runner: emit event on task exit with exit result details 2018-11-19 22:59:17 -05:00
Nick Ethier
3601e4241d plugins/driver: remove NodeResources from task Resources and use PercentTicks field for docker driver 2018-11-19 22:59:17 -05:00
Nick Ethier
9c154d7135 drivers: added NodeResources to drivers.TaskConfig 2018-11-19 22:59:16 -05:00
Nick Ethier
98b295d617 docker: started work on porting docker driver to new plugin framework 2018-11-19 22:59:15 -05:00
Michael Schurter
4b236cc4ce client.rpc: don't log errors on shutdown 2018-11-19 16:39:30 -08:00
Michael Schurter
31f113ba4d client: support graceful shutdowns
Client.Shutdown now blocks until all AllocRunners and TaskRunners have
exited their Run loops. Tasks are left running.
2018-11-19 16:39:30 -08:00
Mahmood Ali
6b7953af21 Merge pull request #4884 from hashicorp/f-alloc-devices-cli
Report alloc device statistics in API and CLI
2018-11-16 18:04:54 -05:00
Mahmood Ali
fd49039f09 address review comments 2018-11-16 17:13:01 -05:00
Mahmood Ali
58cbafe913 Populate alloc stats API with device stats
This change makes few compromises:

* Looks up the devices associated with tasks at look up time.  Given
that `nomad alloc status` is called rarely generally (compared to stats
telemetry and general job reporting), it seems fine.  However, the
lookup overhead grows bounded by number of `tasks x total-host-devices`,
which can be significant.

* `client.Client` performs the task devices->statistics lookup.  It
passes self to alloc/task runners so they can look up the device statistics
allocated to them.
  * Currently alloc/task runners are responsible for constructing the
entire RPC response for stats
  * The alternatives for making task runners device statistics aware
don't seem appealing (e.g. having task runners contain reference to hostStats)

* On the alloc aggregation resource usage, I did a naive merging of task device statistics.
  * Personally, I question the value of such aggregation, compared to
costs of struct duplication and bloating the response - but opted to be
consistent in the API.
  * With naive concatination, device instances from a single device group used by separate tasks in the alloc, would be aggregated in two separate device group statistics.
2018-11-16 10:26:32 -05:00
Michael Schurter
9189173649 tests: fix tests post-rebase 2018-11-15 17:40:56 -08:00
Michael Schurter
a552a0c42c client/tr: add a bit of context to envbuilder errors 2018-11-15 16:26:25 -08:00
Michael Schurter
2283c8fff3 client: remove old proxy references from comments 2018-11-15 16:26:25 -08:00
Michael Schurter
926e3dc706 client: test more env key variations 2018-11-15 16:26:25 -08:00
Michael Schurter
b878bbf3f7 client: add new nested variables to task's hcl ctx
The error messages are really bad, but it's extremely difficult to
produce good error messages without the original HCL.
2018-11-15 16:26:25 -08:00
Michael Schurter
26211408a6 client: turn env into nested objects for task configs 2018-11-15 16:25:57 -08:00
Michael Schurter
43b359914b client: interpolate driver configurations
Also add missing SetDriverNetwork calls.
2018-11-15 16:25:57 -08:00
Mahmood Ali
04ecb5c72a Track Node Device attributes and serve them in API 2018-11-14 14:42:29 -05:00
Mahmood Ali
ba3fe15f7e Add Client Device Stats structs in api package 2018-11-14 14:41:19 -05:00
Mahmood Ali
5af9296bb4 Expose Device Stats in /client/stats API endpoint 2018-11-14 14:41:19 -05:00
Mahmood Ali
dd47c590f0 Allow nullable fields in StatValues
In state values, we need to be able to distinguish between zero values
(e.g. `false`) and unset values (e.g. `nil`).

We can alternatively use protobuf `oneOf` and nested map to ensure
consistency of fields that are set together, but the golang
representation does not represent that well and introducing a mismatch
between representations.  Thus, I opted not to use it.
2018-11-14 14:41:19 -05:00
Mahmood Ali
2f4c510cb7 Move Stat{Object|Value} to plugins/shared/structs
Moving them as they may be useful for other packages/plugins besides
devices.
2018-11-14 09:01:26 -05:00
Mahmood Ali
df694eb3be Regenerate proto files with protoc-gen-go@v1.2.0 2018-11-14 09:01:26 -05:00
Danielle Tomlinson
f16d96bdd8 Merge pull request #4869 from hashicorp/b-executor-stdout
executor: Fix stdout stderr copy/paste
2018-11-13 19:22:37 -08:00