Commit Graph

132 Commits

Author SHA1 Message Date
Nick Ethier
9c154d7135 drivers: added NodeResources to drivers.TaskConfig 2018-11-19 22:59:16 -05:00
Nick Ethier
98b295d617 docker: started work on porting docker driver to new plugin framework 2018-11-19 22:59:15 -05:00
Michael Schurter
31f113ba4d client: support graceful shutdowns
Client.Shutdown now blocks until all AllocRunners and TaskRunners have
exited their Run loops. Tasks are left running.
2018-11-19 16:39:30 -08:00
Mahmood Ali
6b7953af21 Merge pull request #4884 from hashicorp/f-alloc-devices-cli
Report alloc device statistics in API and CLI
2018-11-16 18:04:54 -05:00
Mahmood Ali
fd49039f09 address review comments 2018-11-16 17:13:01 -05:00
Mahmood Ali
58cbafe913 Populate alloc stats API with device stats
This change makes few compromises:

* Looks up the devices associated with tasks at look up time.  Given
that `nomad alloc status` is called rarely generally (compared to stats
telemetry and general job reporting), it seems fine.  However, the
lookup overhead grows bounded by number of `tasks x total-host-devices`,
which can be significant.

* `client.Client` performs the task devices->statistics lookup.  It
passes self to alloc/task runners so they can look up the device statistics
allocated to them.
  * Currently alloc/task runners are responsible for constructing the
entire RPC response for stats
  * The alternatives for making task runners device statistics aware
don't seem appealing (e.g. having task runners contain reference to hostStats)

* On the alloc aggregation resource usage, I did a naive merging of task device statistics.
  * Personally, I question the value of such aggregation, compared to
costs of struct duplication and bloating the response - but opted to be
consistent in the API.
  * With naive concatination, device instances from a single device group used by separate tasks in the alloc, would be aggregated in two separate device group statistics.
2018-11-16 10:26:32 -05:00
Michael Schurter
a552a0c42c client/tr: add a bit of context to envbuilder errors 2018-11-15 16:26:25 -08:00
Michael Schurter
b878bbf3f7 client: add new nested variables to task's hcl ctx
The error messages are really bad, but it's extremely difficult to
produce good error messages without the original HCL.
2018-11-15 16:26:25 -08:00
Michael Schurter
43b359914b client: interpolate driver configurations
Also add missing SetDriverNetwork calls.
2018-11-15 16:25:57 -08:00
Michael Schurter
740ca8e6ca client: fix tr lifecycle logic and shutdown delay
ShutdownDelay must be honored whenever the task is killed or restarted.
Services were not being deregistered prior to restarting.
2018-11-05 12:32:05 -08:00
Michael Schurter
fdbe446ea6 client: first pass at implementing task restoring
Task restoring works but dead tasks may be restarted
2018-11-05 12:32:05 -08:00
Nick Ethier
da7563b8c3 Merge pull request #4795 from hashicorp/f-plugin-config
Pass client configuration to plugins through loader
2018-10-29 18:42:27 -07:00
Michael Schurter
d71e7666bd ar: fix leader handling, state restoring, and destroying unrun ARs
* Migrated all of the old leader task tests and got them passing
* Refactor and consolidate task killing code in AR to always kill leader
  tasks first
* Fixed lots of issues with state restoring
* Fixed deadlock in AR.Destroy if AR.Run had never been called
* Added a new in memory statedb for testing
2018-10-19 09:45:45 -07:00
Michael Schurter
2aed3e8527 ar: refactor task killing into 1 method
Update comments and address some PR comments from #4775
2018-10-17 10:06:59 -07:00
Michael Schurter
2417ec5621 ar: fix task leader, update, and stop handling 2018-10-17 10:06:59 -07:00
Nick Ethier
3244a4cc57 plumb NomadConfig into plugins 2018-10-16 22:47:22 -04:00
Michael Schurter
cf42289c8b fix linter errors 2018-10-16 16:56:57 -07:00
Michael Schurter
e026d6e80a tr: remove unused DriverHandle interface
was causing typed nil interface panics and served no purpose
2018-10-16 16:56:56 -07:00
Michael Schurter
2256917936 Port client portion of #4392 to new taskrunner
PR #4392 was merged to master *after* allocrunnerv2 was branched, so the
client-specific portions must be ported from master to arv2.
2018-10-16 16:56:56 -07:00
Nick Ethier
44cc52a0d4 client: simplify driver plugin logic from review comments 2018-10-16 16:56:56 -07:00
Nick Ethier
4f9522dd54 client: review comments and fixup/skip tests 2018-10-16 16:56:56 -07:00
Nick Ethier
ea9ed2282e client: refactor post allocrunnerv2 finalization 2018-10-16 16:56:56 -07:00
Alex Dadgar
627e20801d Fix lints 2018-10-16 16:56:56 -07:00
Alex Dadgar
3a492bb33f allocrunnerv2 -> allocrunner 2018-10-16 16:56:56 -07:00
Alex Dadgar
2e535aefcc move files around 2018-10-16 16:56:55 -07:00
Michael Schurter
c0d1b63b75 wrap boltdb in a write deduplicator
Saves a tiny bit of cpu and some IO. Sadly doesn't prevent all IO on
duplicate writes as the transactions are still created and committed.

$ go test -bench=. -benchmem
goos: linux
goarch: amd64
pkg: github.com/hashicorp/nomad/helper/boltdd
BenchmarkWriteDeduplication_On-4             500           4059591 ns/op           23736 B/op         56 allocs/op
BenchmarkWriteDeduplication_Off-4            300           4115319 ns/op           25942 B/op         55 allocs/op
2018-10-16 16:53:30 -07:00
Michael Schurter
8958919a5f removing old restoration path before api change 2018-10-16 16:53:30 -07:00
Michael Schurter
97dbf99d3f call handle.Network() instead of storing it 2018-10-16 16:53:30 -07:00
Michael Schurter
76194c7414 consul service hook
Deregistration works but difficult to test due to terminal updates not
being fully implemented in the new client/ar/tr.
2018-10-16 16:53:29 -07:00
Michael Schurter
15c0731096 artifact task hook 2018-10-16 16:53:29 -07:00
Andrei Burd
4e352f3814 Parametrized/periodic jobs per child tagged metric emmision 2018-06-21 10:40:56 +03:00
Alex Dadgar
a62e412b88 Refactor - wip 2018-06-12 10:23:45 -07:00