Commit Graph

55 Commits

Author SHA1 Message Date
Nick Ethier
7d80fe286f drivers: use consts for task handle version 2019-01-16 21:52:31 -05:00
Nick Ethier
9ce0347e59 driver: add pre09 migration logic 2019-01-15 16:57:09 -05:00
Nick Ethier
12ae83bb79 executor: add pre 0.9 client and wrapper 2019-01-15 16:55:12 -05:00
Danielle Tomlinson
9cf6e1ae27 Merge pull request #5192 from hashicorp/dani/executor-close
executor: Always close stdout/stderr fifos
2019-01-15 17:49:04 +01:00
Danielle Tomlinson
17dbace46b executor: Always close stdout/stderr fifos 2019-01-15 16:47:27 +01:00
Mahmood Ali
2cc4466573 propagate logs to executor plugin 2019-01-15 08:25:03 -05:00
Alex Dadgar
109c5ef650 Merge pull request #5173 from hashicorp/b-log-levels
Plugins use parent loggers
2019-01-14 16:14:30 -08:00
Nick Ethier
72a4685534 drivers: plumb grpc client logger 2019-01-12 12:18:23 -05:00
Nick Ethier
9904463da2 executor: fix failing stats related test 2019-01-12 12:18:23 -05:00
Nick Ethier
fbf9a4c772 executor: implement streaming stats API
plugins/driver: update driver interface to support streaming stats

client/tr: use streaming stats api

TODO:
 * how to handle errors and closed channel during stats streaming
 * prevent tight loop if Stats(ctx) returns an error

drivers: update drivers TaskStats RPC to handle streaming results

executor: better error handling in stats rpc

docker: better control and error handling of stats rpc

driver: allow stats to return a recoverable error
2019-01-12 12:18:22 -05:00
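The two TODO items above (closed channels and the tight loop on errors) can be illustrated with a minimal sketch. It is not the Nomad implementation: `usage`, `statsFunc`, `collect`, and `demo` are hypothetical names, and the fixed retry delay is an assumption standing in for whatever backoff policy the real code uses.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// usage is a hypothetical stand-in for a resource usage sample.
type usage struct{ CPU float64 }

// statsFunc is a hypothetical streaming stats call: it returns a channel of
// samples, or an error if the stream could not be opened.
type statsFunc func(ctx context.Context) (<-chan usage, error)

// collect consumes the stream until ctx is cancelled. After an error or a
// closed channel it waits for retry before subscribing again, so a failing
// stats call cannot spin in a tight loop.
func collect(ctx context.Context, stats statsFunc, retry time.Duration, handle func(usage)) {
	for {
		ch, err := stats(ctx)
		if err == nil {
		recv:
			for {
				select {
				case <-ctx.Done():
					return
				case u, ok := <-ch:
					if !ok {
						break recv // stream closed; resubscribe after the delay
					}
					handle(u)
				}
			}
		}
		select {
		case <-ctx.Done():
			return
		case <-time.After(retry):
		}
	}
}

// demo fails the first call, then streams two samples and closes the channel.
func demo() ([]float64, int) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	calls := 0
	stats := func(ctx context.Context) (<-chan usage, error) {
		calls++
		if calls == 1 {
			return nil, errors.New("stats not ready")
		}
		ch := make(chan usage, 2)
		ch <- usage{CPU: 1}
		ch <- usage{CPU: 2}
		close(ch)
		return ch, nil
	}
	var got []float64
	collect(ctx, stats, 10*time.Millisecond, func(u usage) {
		got = append(got, u.CPU)
		if len(got) == 2 {
			cancel() // stop once both samples have arrived
		}
	})
	return got, calls
}

func main() {
	got, calls := demo()
	fmt.Println(got, calls)
}
```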
Alex Dadgar
270ae48b82 Plugins use parent loggers
This PR fixes various instances of plugins being launched without using
the parent loggers. As a result, logs did not all go to the same output,
formatting broke, etc.
2019-01-11 11:36:37 -08:00
Mahmood Ali
d1fbd735f3 Merge pull request #5157 from hashicorp/r-drivers-no-cstructs
drivers: avoid referencing client/structs package
2019-01-09 13:06:46 -05:00
Mahmood Ali
76d40947d7 Merge pull request #5159 from hashicorp/r-macos-tests
Fix Travis MacOS job
2019-01-09 08:22:30 -05:00
Mahmood Ali
f8f285248d Merge pull request #5154 from hashicorp/f-revert-exec-devs
drivers/exec: restrict devices exposed to tasks
2019-01-08 12:43:06 -05:00
Mahmood Ali
060588da31 executor: add a comment detailing isolation 2019-01-08 12:10:26 -05:00
Mahmood Ali
34ee0ba6b9 Remove some dead code 2019-01-08 09:11:48 -05:00
Mahmood Ali
800a3522e3 drivers: re-export ResourceUsage structs
Re-export the ResourceUsage structs in the drivers package to avoid drivers
depending directly on the internal client/structs package.

I attempted moving the structs to drivers, but that caused some import
cycles that were a bit hard to disentangle.  Instead, I added an
alias here that's sufficient for our purposes of avoiding external
drivers depending on internal packages, while allowing us to restructure
packages in future without breaking source compatibility.
2019-01-08 09:11:47 -05:00
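The alias approach described above relies on Go type aliases. The sketch below is a single-file illustration only: `internalResourceUsage` is a hypothetical stand-in for a struct that would live in an internal package, and the field name is assumed.

```go
package main

import "fmt"

// internalResourceUsage stands in for a struct defined in an internal
// package (hypothetical name and field, for illustration only).
type internalResourceUsage struct {
	MemoryRSS uint64
}

// ResourceUsage is a type alias (note the "="), not a new named type.
// Both names refer to the same type, so callers can use the re-exported
// name without importing the internal package.
type ResourceUsage = internalResourceUsage

// describe accepts the internal type; values declared through the alias
// can be passed without any conversion.
func describe(u internalResourceUsage) string {
	return fmt.Sprintf("rss=%d", u.MemoryRSS)
}

func main() {
	u := ResourceUsage{MemoryRSS: 42}
	fmt.Println(describe(u))
}
```

Because an alias does not create a distinct type, the struct can later be moved to another package and the alias re-pointed without breaking code that only uses the exported name.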
Mahmood Ali
694e3010c2 use drivers.FSIsolation 2019-01-08 09:11:47 -05:00
Alex Dadgar
19e67a0916 Test recovery 2019-01-07 14:49:41 -08:00
Mahmood Ali
d28f7ecca7 tests: busybox only depends on arch
Busybox is compiled for Linux only.  Include the file used in executor
tests even for non-Linux targets, since having the file present has no
side effects.
2019-01-07 08:36:32 -05:00
Mahmood Ali
c26dfb000c drivers/exec: restrict devices exposed to tasks
We ultimately decided to provide a limited set of devices in exec/java
drivers instead of all host devices.  Pre-0.9, we made all host devices
available to exec tasks accidentally, yet most applications only use a
small subset, and exposing everything limits our ability to restrict/isolate
GPU and other devices.

Starting with 0.9, by default, we only provide the same subset of
devices Docker provides, and allow users to provide more devices as
needed on a case-by-case basis.

This reverts commit 5805c64a9f.
This reverts commit ff9a4a17e5.
2019-01-06 17:03:19 -05:00
Mahmood Ali
5805c64a9f driver/exec: use dedicated /dev mount (#5147)
Use a dedicated /dev mount so we can inject more devices if necessary,
and avoid allowing a container to contaminate host /dev.

Follow up to https://github.com/hashicorp/nomad/pull/5143 - and fixes master.
2019-01-04 10:36:05 -05:00
Mahmood Ali
ff9a4a17e5 drivers/exec: bind mount /dev into rootfs
Restores pre-0.9 behavior, where Nomad makes /dev available to exec
tasks.  Switching to libcontainer, we accidentally made only a small
subset available.

Here, we err on the side of preserving the behavior of 0.8, instead of going
for the sensible route, where only a reasonable subset of devices is
mounted by default and the user can opt to request more.
2019-01-03 14:29:18 -05:00
Alex Dadgar
d5512c39f0 Lint 2018-12-18 15:50:44 -08:00
Alex Dadgar
cd6879409c Drivers 2018-12-18 15:50:11 -08:00
Alex Dadgar
e1cf3ac69e protos 2018-12-18 15:48:52 -08:00
Nick Ethier
81ba18d74a executor: encode mounts and devices correctly when using grpc 2018-12-15 00:08:23 -05:00
Nick Ethier
c8a3c0e96e executor: use int when encoding signal in RPC 2018-12-14 22:20:01 -05:00
Nick Ethier
8a344412e8 Merge branch 'master' into f-grpc-executor
* master: (71 commits)
  Fix output of 'nomad deployment fail' with no arg
  Always create a running allocation when testing task state
  tests: ensure exec tests pass valid task resources (#4992)
  some changes for more idiomatic code
  fix iops related tests
  fixed bug in loop delay
  gofmt
  improved code for readability
  client: updateAlloc release lock after read
  fixup! device attributes in `nomad node status -verbose`
  drivers/exec: support device binds and mounts
  fix iops bug and increase test matrix coverage
  tests: tag image explicitly
  changelog
  ci: install lxc-templates explicitly
  tests: skip checking rdma cgroup
  ci: use Ubuntu 16.04 (Xenial) in TravisCI
  client: update driver info on new fingerprint
  drivers/docker: enforce volumes.enabled (#4983)
  client: Style: use fluent style for building loggers
  ...
2018-12-13 14:41:09 -05:00
Mahmood Ali
97f33bb153 drivers/exec: support device binds and mounts 2018-12-11 18:35:21 -05:00
Mahmood Ali
51707199a6 Merge pull request #4975 from hashicorp/fix-master-20181209
Some test fixes and remedies
2018-12-11 18:00:21 -05:00
Alex Dadgar
f42c060d35 Merge pull request #4970 from hashicorp/f-no-iops
Deprecate IOPS
2018-12-11 12:51:22 -08:00
Mahmood Ali
c02dbc7f67 add a note about busybox license 2018-12-11 09:35:26 -05:00
Mahmood Ali
06a4b4add2 tests: prevent indefinite blocking in some tests
Noticed a few places where tests seem to block indefinitely and panic
after the test run reaches the test package timeout.

I intend to follow up with the proper fix later, but timing out is much
better than indefinitely blocking.
2018-12-11 09:35:26 -05:00
Mahmood Ali
d6e708fe2d tests: setup libcontainer rootfs
Use a statically linked busybox binary to set up a basic rootfs for
testing, by symlinking it to provide the basic commands used in tests.

I considered using a proper rootfs tarball, but the overhead of managing
a tarfile and expanding it seems significant enough that I went with this
implementation.
2018-12-11 09:35:26 -05:00
Mahmood Ali
2fb5e35012 Merge pull request #4950 from hashicorp/b-exc-libcontainer-kill
executor: kill all container processes
2018-12-08 09:52:42 -05:00
Nick Ethier
4243b7d5f3 executor: misspell 2018-12-08 01:52:06 -05:00
Nick Ethier
a6cb63a964 executor: don't drop errors when configuring libcontainer cfg, add nil check on resources 2018-12-07 14:03:42 -05:00
Nick Ethier
6b39bb33b6 Merge branch 'master' into f-grpc-executor 2018-12-06 21:42:38 -05:00
Nick Ethier
9e3c2492e5 executor: fix tests 2018-12-06 21:39:53 -05:00
Nick Ethier
ff1990064b executor: fix broken non-linux build 2018-12-06 21:33:20 -05:00
Nick Ethier
224c68860d executor: use drivers.Resources as resource model 2018-12-06 21:22:02 -05:00
Nick Ethier
13b582fcfb executor: merge plugin shim with executor package 2018-12-06 21:13:45 -05:00
Nick Ethier
0087a51a7a executor: remove structs package 2018-12-06 20:54:14 -05:00
Alex Dadgar
0953d913ed Deprecate IOPS
IOPS has been modelled as a resource since Nomad 0.1 but has never
actually been detected and there is no plan in the short term to add
detection. This is because IOPS is a bit simplistic of a unit to define
the performance requirements from the underlying storage system. In its
current state it adds unnecessary confusion and can be removed without
impacting any users. This PR leaves IOPS defined at the jobspec parsing
level and in the api/ resources since these are the two public uses of
the field. These should be considered deprecated and only exist to allow
users to stop using them during the Nomad 0.9.x release. In the future,
there should be no expectation that the field will exist.
2018-12-06 15:09:26 -08:00
Nick Ethier
8690d0ae75 executor: update test references 2018-12-05 11:07:48 -05:00
Nick Ethier
467930f650 executor: use grpc instead of netrpc as plugin protocol
* Added protobuf spec for executor
 * Separated executor structs into their own package
2018-12-05 11:03:56 -05:00
Mahmood Ali
e9dc31c68f executor: Keep 0.8.6 exit code for wait() failures
0.8.6 uses exit code 1 when `proc.Wait()` fails: https://github.com/hashicorp/nomad/blob/v0.8.6/client/driver/executor/executor.go#L442
2018-12-04 19:38:25 -05:00
Mahmood Ali
f6efac6c12 no t.Parallel() in executor table driven tests (#4948)
When `t.Parallel()` is used inside a `t.Run()` sub-test, the closure
doesn't behave as expected, and some cases effectively get skipped.
More details can be found in
https://gist.github.com/posener/92a55c4cd441fc5e5e85f27bca008721
2018-12-04 09:04:04 -05:00
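The closure problem comes from the loop variable being shared across iterations in Go versions before 1.22: a parallel subtest body runs after the loop has finished, so every closure may see the last case. The sketch below imitates that deferred execution with goroutines instead of the `testing` package, so it can run standalone; `runAll` is a hypothetical helper. It shows the conventional per-iteration copy, which is an alternative to removing `t.Parallel()` as this commit does.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// runAll mimics parallel subtests: each case body runs later, in its own
// goroutine, much like a t.Run closure that calls t.Parallel() is paused
// and resumed only after the enclosing loop has finished.
func runAll(cases []string) []string {
	var (
		mu   sync.Mutex
		seen []string
		wg   sync.WaitGroup
	)
	for _, tc := range cases {
		tc := tc // per-iteration copy; required before Go 1.22, harmless after
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			defer mu.Unlock()
			seen = append(seen, tc)
		}()
	}
	wg.Wait()
	sort.Strings(seen) // goroutine order is not deterministic
	return seen
}

func main() {
	fmt.Println(runAll([]string{"a", "b", "c"}))
}
```

Without the `tc := tc` line, older Go toolchains could record the final case several times and never run the earlier ones, which matches the "effectively get skipped" symptom described above.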
Mahmood Ali
781d8558f2 Kill all container processes on shutdown
Currently, the libcontainer-based executor, upon shutdown, kills only the
container's initial process.  The children of the killed process remain
running, and the executor is never marked as terminated until they exit.

Also, fix a case where we treat processes as successful when
`proc.Wait()` fails.  In some attempts, I was getting "waitid no child
processes" errors, and such errors shouldn't cause the process to be
considered successful.
2018-12-03 20:40:49 -05:00