Commit Graph

44 Commits

Author SHA1 Message Date
Nick Ethier
bed9efae44 Merge branch 'master' into f-driver-upgradepath-test
* master: (23 commits)
  tests: avoid assertion in goroutine
  spell check
  ci: run checkscripts
  tests: deflake TestRktDriver_StartWaitRecoverWaitStop
  drivers/rkt: Remove unused github.com/rkt/rkt
  drivers/rkt: allow development on non-linux
  cli: Hide `nomad docker_logger` from help output
  api: test api and structs are in sync
  goimports until make check is happy
  nil check node resources to prevent panic
  tr: use context in a select statement
  move pluginutils -> helper/pluginutils
  vet
  goimports
  gofmt
  Split hclspec
  move hclutils
  Driver tests do not use hcl2/hcl, hclspec, or hclutils
  move reattach config
  loader and singleton
  ...
2019-01-23 21:01:24 -05:00
Nick Ethier
a9060f44eb drivers: add docker upgrade path and e2e test 2019-01-23 14:44:42 -05:00
Alex Dadgar
2d23f4a038 move reattach config 2019-01-22 15:11:58 -08:00
Nick Ethier
994c66f7d7 drivers: use consts for task handle version 2019-01-18 18:31:01 -05:00
Nick Ethier
2f91ac88f7 cleanup code comments and small fixes from refactor 2019-01-18 18:31:01 -05:00
Mahmood Ali
9f7619344e Merge pull request #5190 from hashicorp/f-memory-usage
Track Basic Memory Usage as reported by cgroups
2019-01-18 16:46:02 -05:00
Preetha Appan
3e16b52361 clean up read access 2019-01-16 11:04:11 -06:00
Preetha Appan
80c00fc268 Refactor logging in drivers to use a tri-state boolean
Log warnings/errors only when the state changes
from healthy to unhealthy
2019-01-16 10:19:31 -06:00
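The tri-state idea in the commit above can be illustrated with a minimal sketch. The names `healthState`, `tracker`, and `observe` are hypothetical and not taken from the Nomad source. The sketch assumes the third state is "unknown" (no check has completed yet) and that a warning is emitted only on a healthy to unhealthy transition, as the message describes.

```go
package main

import "fmt"

// healthState is a tri-state value: unknown until the first check completes.
// (Hypothetical type for illustration only.)
type healthState int

const (
	stateUnknown healthState = iota
	stateHealthy
	stateUnhealthy
)

// tracker remembers the last observed state so that repeated failures
// are not logged again on every check.
type tracker struct {
	last healthState
}

// observe records the latest check result and reports whether a warning
// should be logged, which in this sketch is only on a transition from
// healthy to unhealthy.
func (t *tracker) observe(healthy bool) bool {
	next := stateUnhealthy
	if healthy {
		next = stateHealthy
	}
	shouldLog := t.last == stateHealthy && next == stateUnhealthy
	t.last = next
	return shouldLog
}

func main() {
	t := &tracker{}
	for _, healthy := range []bool{true, false, false, true, false} {
		if t.observe(healthy) {
			fmt.Println("warn: driver became unhealthy")
		}
	}
}
```

Whether an unknown to unhealthy transition should also log is a design choice; this sketch stays silent in that case.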
Preetha Appan
35ab26658c Make docker driver logging less redundant 2019-01-16 10:16:57 -06:00
oleksii.shyman
cc98f282d4 Add support for docker runtimes
- docker fingerprint issues a docker api system info call to get the
  list of supported OCI runtimes.
  - OCI runtimes are reported as comma separated list of names
  - docker driver is aware of GPU runtime presence
  - docker driver throws an error when user tries to run container with
  GPU, when GPU runtime is not present
  - docker GPU runtime name is configurable
2019-01-15 11:34:47 -08:00
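A rough sketch of the runtime reporting and GPU check listed in the commit above. The function names and the runtime names used in `main` are illustrative assumptions; no real Docker client call is made, and the slice of names stands in for whatever the system info call returns.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// runtimeList renders runtime names as a sorted, comma separated string.
// (Hypothetical helper; the input would come from a Docker system info call.)
func runtimeList(names []string) string {
	sorted := append([]string(nil), names...)
	sort.Strings(sorted)
	return strings.Join(sorted, ",")
}

// checkGPURuntime returns an error when a task asks for a GPU but the
// configured GPU runtime is not among the available runtimes.
func checkGPURuntime(wantsGPU bool, names []string, gpuRuntime string) error {
	if !wantsGPU {
		return nil
	}
	for _, n := range names {
		if n == gpuRuntime {
			return nil
		}
	}
	return fmt.Errorf("task requests a GPU but runtime %q is not available", gpuRuntime)
}

func main() {
	// "nvidia" is only an example value for the configurable GPU runtime name.
	fmt.Println(runtimeList([]string{"runc", "nvidia"}))
	fmt.Println(checkGPURuntime(true, []string{"runc"}, "nvidia"))
}
```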
Danielle Tomlinson
f9a4594095 docker: Terminate dockerlogger
Previously, we did not attempt to stop Docker Logger processes until
DestroyTask, which meant that under many circumstances, we would never
successfully close the plugin client.

This commit terminates the plugin process when `run` terminates, or when
`DestroyTask` is called.

Steps to repro:

```
$ nomad agent -dev
$ nomad init
$ nomad run example.nomad
$ nomad stop example
$ ps aux | grep nomad # See docker logger process running
$ # signal the dev agent
$ ps aux | grep nomad # See docker logger process running
```
2019-01-15 14:58:05 +01:00
Mahmood Ali
b5c20aa50b Track Basic Memory Usage as reported by cgroups
Track current memory usage, `memory.usage_in_bytes`, in addition to
`memory.max_usage_in_bytes` and friends.  This number is closer
to what Docker reports.

Related to https://github.com/hashicorp/nomad/issues/5165 .
2019-01-14 18:47:52 -05:00
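For context on the commit above, cgroup v1 exposes `memory.usage_in_bytes` as a file holding a single decimal number. A minimal sketch of reading it follows; the helper names and the path in `main` are assumptions for illustration, since the real location depends on the cgroup mount point and the task's cgroup.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseCgroupBytes parses the contents of a single-value cgroup file,
// which holds a decimal number followed by a newline.
func parseCgroupBytes(contents string) (uint64, error) {
	return strconv.ParseUint(strings.TrimSpace(contents), 10, 64)
}

// readUsage reads and parses a single-value cgroup file at path.
func readUsage(path string) (uint64, error) {
	raw, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return parseCgroupBytes(string(raw))
}

func main() {
	// Hypothetical path; adjust for the actual cgroup hierarchy.
	usage, err := readUsage("/sys/fs/cgroup/memory/memory.usage_in_bytes")
	if err != nil {
		fmt.Println("could not read usage:", err)
		return
	}
	fmt.Println("current usage in bytes:", usage)
}
```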
Nick Ethier
fbf9a4c772 executor: implement streaming stats API
plugins/driver: update driver interface to support streaming stats

client/tr: use streaming stats api

TODO:
 * how to handle errors and closed channel during stats streaming
 * prevent tight loop if Stats(ctx) returns an error

drivers: update drivers TaskStats RPC to handle streaming results

executor: better error handling in stats rpc

docker: better control and error handling of stats rpc

driver: allow stats to return a recoverable error
2019-01-12 12:18:22 -05:00
Mahmood Ali
800a3522e3 drivers: re-export ResourceUsage structs
Re-export the ResourceUsage structs in the drivers package so that drivers
do not depend directly on the internal client/structs package.

I attempted moving the structs to drivers, but that caused some import
cycles that were a bit hard to disentangle.  Instead, I added an
alias here that's sufficient for our purposes of keeping external
drivers from depending on internal packages, while allowing us to restructure
packages in future without breaking source compatibility.
2019-01-08 09:11:47 -05:00
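The alias technique described in the commit above relies on Go type aliases (`type A = B`), which make both names refer to the same type. A single-file sketch follows; in the real layout the two declarations would live in different packages, and `internalResourceUsage` and `report` are hypothetical stand-ins.

```go
package main

import "fmt"

// internalResourceUsage stands in for a struct that lives in an
// internal package (hypothetical name for illustration).
type internalResourceUsage struct {
	MemoryRSS uint64
}

// ResourceUsage is an alias, not a new type: values of either name are
// interchangeable without conversion.
type ResourceUsage = internalResourceUsage

// report accepts the internal type, as existing internal code would.
func report(u *internalResourceUsage) string {
	return fmt.Sprintf("rss=%d", u.MemoryRSS)
}

func main() {
	// Code written against the exported alias works with internal code as-is.
	u := &ResourceUsage{MemoryRSS: 4096}
	fmt.Println(report(u)) // prints "rss=4096"
}
```

Because the alias introduces no new type, the underlying struct can later move to another package while the exported name stays put.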
Mahmood Ali
c0162fab35 move cstructs.DeviceNetwork to drivers pkg 2019-01-08 09:11:47 -05:00
Danielle Tomlinson
476e44b4e4 drivers: Implement InternalPluginDriver interface
This implements the InternalPluginDriver interface in each driver, and
calls the cancellation fn for their respective eventers.

This fixes a per task goroutine leak during test suite execution.
2019-01-08 13:49:31 +01:00
Nick Ethier
6951ca487d drivermanager: use allocID and task name to route task events 2018-12-18 23:01:51 -05:00
Danielle Tomlinson
ad4bac8d77 docker: Delete Task on Destroy
Currently the docker driver does not remove tasks from its state map
when destroying the task, which leads to issues when restarting tasks in
place, and leaks expired handles over time.
2018-12-18 15:53:31 +01:00
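The fix described in the commit above amounts to removing the handle from the driver's state map once the task is destroyed. A minimal sketch, assuming a mutex-guarded map keyed by task ID; `taskStore`, `taskHandle`, and `destroyTask` are illustrative names rather than the driver's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// taskHandle is a stand-in for per-task driver state (hypothetical).
type taskHandle struct {
	containerID string
}

// taskStore is a concurrency-safe map of task ID to handle.
type taskStore struct {
	mu    sync.RWMutex
	store map[string]*taskHandle
}

func newTaskStore() *taskStore {
	return &taskStore{store: make(map[string]*taskHandle)}
}

func (ts *taskStore) Set(id string, h *taskHandle) {
	ts.mu.Lock()
	defer ts.mu.Unlock()
	ts.store[id] = h
}

func (ts *taskStore) Get(id string) (*taskHandle, bool) {
	ts.mu.RLock()
	defer ts.mu.RUnlock()
	h, ok := ts.store[id]
	return h, ok
}

func (ts *taskStore) Delete(id string) {
	ts.mu.Lock()
	defer ts.mu.Unlock()
	delete(ts.store, id)
}

// destroyTask cleans up a task and then drops its handle, so a restart
// in place starts from a clean slate and stale handles do not pile up.
func destroyTask(ts *taskStore, id string) error {
	if _, ok := ts.Get(id); !ok {
		return fmt.Errorf("task %q not found", id)
	}
	// Container cleanup would happen here.
	ts.Delete(id)
	return nil
}

func main() {
	ts := newTaskStore()
	ts.Set("task-1", &taskHandle{containerID: "abc123"})
	fmt.Println(destroyTask(ts, "task-1"))
	_, ok := ts.Get("task-1")
	fmt.Println("still tracked:", ok)
}
```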
Mahmood Ali
1678a8499b drivers/docker: enforce volumes.enabled (#4983)
When the volumes.enabled flag is off in the Docker driver, disable all mounts of
paths outside the alloc dir.
2018-12-11 14:22:50 -05:00
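The rule in the commit above can be sketched as a path containment check. This is an assumption-laden illustration: `isInside` and `validateMount` are hypothetical names, relative host paths are assumed to be resolved against the alloc dir, and symlinks are not resolved here.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isInside reports whether target is dir itself or a path below it.
// It compares cleaned paths only and does not resolve symlinks.
func isInside(dir, target string) bool {
	rel, err := filepath.Rel(dir, target)
	if err != nil {
		return false
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// validateMount rejects host paths outside allocDir unless volumes are enabled.
func validateMount(allocDir, hostPath string, volumesEnabled bool) error {
	if !filepath.IsAbs(hostPath) {
		// Assumption: relative paths are taken relative to the alloc dir.
		hostPath = filepath.Join(allocDir, hostPath)
	}
	if volumesEnabled || isInside(allocDir, filepath.Clean(hostPath)) {
		return nil
	}
	return fmt.Errorf("volumes are not enabled; cannot mount host path %q", hostPath)
}

func main() {
	allocDir := "/var/nomad/alloc/1234" // example path only
	fmt.Println(validateMount(allocDir, "local/data", false))
	fmt.Println(validateMount(allocDir, "/etc/passwd", false))
	fmt.Println(validateMount(allocDir, "../../etc", false))
}
```

Comparing the relative path rather than a string prefix avoids treating a sibling such as `/var/nomad/alloc/12345` as being inside `/var/nomad/alloc/1234`.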
Mahmood Ali
04af097008 driver/docker: honor plugin devices 2018-12-04 21:31:28 -05:00
Mahmood Ali
65e9b3b8be refactor device manipulation 2018-12-04 20:55:59 -05:00
Danielle Tomlinson
b8b4ce2fa6 Merge pull request #4936 from hashicorp/f-legacy-refactor
Refactor and repackage client/driver
2018-11-30 13:38:06 +01:00
Mahmood Ali
ef424132d0 Merge pull request #4926 from hashicorp/f-docker-image-ref
Use user provided image name to launch container
2018-11-30 07:27:39 -05:00
Danielle Tomlinson
03db4cf82d client: Rename drivers/shared/env => client/taskenv 2018-11-30 12:18:39 +01:00
Danielle Tomlinson
6756ffd052 drivers: Move client/drivers/env to drivers/shared/env
As part of deprecating legacy drivers, we're moving the env package to a
new drivers/shared tree, as it is used by the modern docker and rkt
driver packages, and is useful for 3rd party plugins.
2018-11-30 10:46:13 +01:00
Mahmood Ali
2310600d91 Use user provided image name to launch container
This allows the container to be tagged with a user friendly image name
(e.g. `redis:3.2`) rather than the image ID (e.g.
`sha256:87856cc39862cec77541d68382e4867d7ccb29a85a17221446c857ddaebca916`).

Useful for human debugging, as well as some debugging and image scanning
tools.

This risks two bad changes:
1. Discrepancy in image resolution between docker and Nomad's image
loader.
  * I checked the image creation paths in Nomad, and noticed that we
either pull the image or inspect the image with the user provided
name.

2. A race in image tagging where the tag is modified between image
loading and container creation.
  * I, personally, don't think this case is cause for concern, as it is
analogous to the task running a bit later.  As long as the image is
still present, creating the container should be good.
2018-11-27 16:12:15 -05:00
Mahmood Ali
9c89ea4e08 Support docker bind mounts 2018-11-27 07:20:17 -05:00
Mahmood Ali
2aa034e174 Merge pull request #4908 from hashicorp/f-docker-opts-storageopt
Add support for docker storage options
2018-11-20 21:08:27 -05:00
Nick Ethier
fc53f5f635 docker: sync access to exit result within a handle 2018-11-20 20:41:32 -05:00
Michael Schurter
ace09b3a84 Apply suggestions from code review
Co-Authored-By: nickethier <ncethier@gmail.com>
2018-11-20 20:33:31 -05:00
Mahmood Ali
2c7bea7190 Add support for storage opt 2018-11-20 16:11:02 -05:00
Nick Ethier
249dbfffd2 docker: move config RPCs to config.go 2018-11-19 22:59:18 -05:00
Nick Ethier
0462c8a7f8 docker: remove container pointer from task handle 2018-11-19 22:59:18 -05:00
Nick Ethier
d7631e3b23 docker: move volume driver options to separate block 2018-11-19 22:59:18 -05:00
Nick Ethier
577d4a2ea8 docker: group common config into blocks 2018-11-19 22:59:17 -05:00
Michael Schurter
455e75492c Apply suggestions from code review
Co-Authored-By: nickethier <ncethier@gmail.com>
2018-11-19 22:59:17 -05:00
Nick Ethier
3468880f50 docker: remove global pull coordinator 2018-11-19 22:59:17 -05:00
Nick Ethier
0c62b9adf4 docker: moved fingerprint code to its own file 2018-11-19 22:59:17 -05:00
Nick Ethier
3601e4241d plugins/driver: remove NodeResources from task Resources and use PercentTicks field for docker driver 2018-11-19 22:59:17 -05:00
Nick Ethier
37ed75502e docker: move recoverable error proto to shared structs 2018-11-19 22:59:16 -05:00
Nick Ethier
396f6ab1fb docker: implement recover task logic 2018-11-19 22:59:16 -05:00
Nick Ethier
902eb9475c docker: finished porting tests 2018-11-19 22:59:16 -05:00
Nick Ethier
5c777a37de drivers/docker: more work porting tests from old driver plugin 2018-11-19 22:59:16 -05:00
Nick Ethier
98b295d617 docker: started work on porting docker driver to new plugin framework 2018-11-19 22:59:15 -05:00