62 Commits

Author SHA1 Message Date
Chris Baker
7b6d233617 docker/driver: downgraded log level for error in DestroyTask 2019-06-03 21:21:32 +00:00
Chris Baker
262c863a8b drivers/docker: modify container/image cleanup to be robust to containers removed out of band 2019-06-03 19:52:28 +00:00
Chris Baker
3a96683131 docker: DestroyTask was not cleaning up Docker images because it was erroring early due to an attempt to inspect an image that had already been removed 2019-06-03 19:04:27 +00:00
Mahmood Ali
67160a6302 drivers/docker: implement streaming exec 2019-05-09 16:49:08 -04:00
Mahmood Ali
fce6564ce4 driver/docker: Support volumes field in Windows
Support Docker `volumes` field in Windows.  Previously, volumes parser
assumed some Unix-ism (e.g. didn't expect `:` in mount paths).
Here, we use the Docker parser to identify host and container paths.

Docker parsers use different validation logic from our previous unix
implementation: Docker parser accepts single path as a volume entry
(parsing it as a container path with auto-created volume) and enforces
additional checks (e.g. validity of mode).  Therefore, I opted to use
Docker parser only for Windows, and keep Nomad's linux parser to
preserve current behavior.
2019-04-25 09:02:44 -04:00
Mahmood Ali
ce8a8a5326 driver/docker: collect tty container logs
Fixes https://github.com/hashicorp/nomad/issues/5475

When container is a tty container, we need to get raw terminal output
without any additional processing.
2019-04-24 22:01:51 -04:00
Danielle Lancashire
ccce364cbd Switch to pre-0.9 behaviour for handling volumes
In Nomad 0.9, we made volume driver handling the same for `""`, and
`"local"` volumes. Prior to Nomad 0.9 however these had slightly different
behaviour for relative paths and named volumes.

Prior to 0.9 the empty string would expand relative paths within the task
dir, and `"local"` volumes that are not absolute paths would be treated
as docker named volumes.

This commit reverts to the previous behaviour as follows:

| Nomad Version | Driver  |   Volume Spec    | Behaviour                 |
|---------------|---------|------------------|---------------------------|
| all           | ""      | testing:/testing | allocdir/testing          |
| 0.8.7         | "local" | testing:/testing | "testing" as named volume |
| 0.9.0         | "local" | testing:/testing | allocdir/testing          |
| 0.9.1         | "local" | testing:/testing | "testing" as named volume |
2019-04-18 14:28:45 +02:00
Mahmood Ali
a09e3bf1a1 Allow compiling without nvidia integration
nvidia library use of dynamic library seems to conflict with alpine and
musl based OSes.  This adds a `nonvidia` tag to allow compiling nomad
for alpine images.

The nomad releases currently only support glibc based OS environments,
so we default to compiling with nvidia.
2019-04-10 09:19:12 -04:00
Nick Ethier
a936f2575a drivers/docker: fix image name handling when prefixed with https:// 2019-04-04 22:10:18 -04:00
Michael Schurter
010a1973a9 docker: restore pre-0.9 container names
As far as I can tell Nomad itself does not use the container name after
container creation, so this should be safe.

OP: https://groups.google.com/d/topic/nomad-tool/kYkyERfVRXE/discussion
v0.8.7 code: https://github.com/hashicorp/nomad/blob/v0.8.7/client/driver/docker.go#L1530-L1531
2019-03-29 13:55:43 -07:00
Mahmood Ali
011315ba4c logging.Type over logging.Driver 2019-02-28 16:40:18 -05:00
Mahmood Ali
314d7a0f41 drivers/docker: rename logging type to driver
Docker uses the term logging `driver` in its public documentation: in the
`docker` daemon config [1], `docker run` arguments [2], and the Docker Compose file [3].
Interestingly, Docker uses `type` in its API [4], unlike everywhere
else.

It's unfortunate that Nomad used `type` modeling after the Docker API
rather than the user facing documents.  Nomad using `type` feels very
non-user friendly as it's disconnected from how Docker markets the flag
and shows internal representation instead.

Here, we rectify the situation by introducing a `driver` field and
preferring it over `type` in logging.

[1] https://docs.docker.com/config/containers/logging/configure/
[2] https://docs.docker.com/engine/reference/run/#logging-drivers---log-driver
[3] https://docs.docker.com/compose/compose-file/#logging
[4] https://docs.docker.com/engine/api/v1.39/#operation/ContainerCreate
2019-02-28 16:04:03 -05:00
Danielle Tomlinson
2657bf02c0 Merge pull request #5355 from hashicorp/dani/windows-dockerstats
docker: Support Stats on Windows
2019-02-26 16:39:48 +01:00
Danielle Tomlinson
6c774e7b46 docker: Return undetected before first detection
This commit causes the docker driver to return undetected before it
first establishes a connection to the docker daemon.

This fixes a bug where hosts without docker installed would return as
unhealthy, rather than undetected.
2019-02-25 11:02:42 +01:00
Danielle Tomlinson
6624d3667b docker: Support stats on Windows 2019-02-22 14:19:58 +01:00
Danielle Tomlinson
df57099e6f docker: Avoid leaking containers during Reattach
Currently, if a docker_logger cannot be reattached to, we will leak the
container that was being used. This is problematic if, e.g., using static
ports, as it means you can never recover your task, or if a service is
expensive to run and will then be running without supervision.
2019-02-20 17:47:06 +01:00
Danielle Tomlinson
274a3485b2 docker: Respawn docker logger during recovery
Sometimes the nomad docker_logger may be killed by a service manager
when restarting the client for upgrades or reliability reasons.

Currently, if this happens, we leak the user's container and try to
reschedule over it.

This commit adds a new step to the recovery process that will spawn a
new docker logger process that will fetch logs from _the current
timestamp_. This is to avoid restarting users' tasks because our logging
sidecar has failed.
2019-02-20 17:12:56 +01:00
Danielle Tomlinson
e2244cd0d4 drivers/docker: SIGTERM to stop containers
Windows Docker daemon does not support SIGINT, SIGTERM is the semantic
equivalent that allows for graceful shutdown before being followed up by
a SIGKILL.
2019-02-14 15:38:54 +00:00
Nick Ethier
bed9efae44 Merge branch 'master' into f-driver-upgradepath-test
* master: (23 commits)
  tests: avoid assertion in goroutine
  spell check
  ci: run checkscripts
  tests: deflake TestRktDriver_StartWaitRecoverWaitStop
  drivers/rkt: Remove unused github.com/rkt/rkt
  drivers/rkt: allow development on non-linux
  cli: Hide `nomad docker_logger` from help output
  api: test api and structs are in sync
  goimports until make check is happy
  nil check node resources to prevent panic
  tr: use context in as select statement
  move pluginutils -> helper/pluginutils
  vet
  goimports
  gofmt
  Split hclspec
  move hclutils
  Driver tests do not use hcl2/hcl, hclspec, or hclutils
  move reattach config
  loader and singleton
  ...
2019-01-23 21:01:24 -05:00
Nick Ethier
a9060f44eb drivers: add docker upgrade path and e2e test 2019-01-23 14:44:42 -05:00
Alex Dadgar
2d23f4a038 move reattach config 2019-01-22 15:11:58 -08:00
Nick Ethier
994c66f7d7 drivers: use consts for task handle version 2019-01-18 18:31:01 -05:00
Nick Ethier
2f91ac88f7 cleanup code comments and small fixes from refactor 2019-01-18 18:31:01 -05:00
Mahmood Ali
9f7619344e Merge pull request #5190 from hashicorp/f-memory-usage
Track Basic Memory Usage as reported by cgroups
2019-01-18 16:46:02 -05:00
Preetha Appan
3e16b52361 clean up read access 2019-01-16 11:04:11 -06:00
Preetha Appan
80c00fc268 Refactor logging in drivers to use a tri-state boolean
Changes logging to emit warnings/errors only if the state changes
from healthy to unhealthy
2019-01-16 10:19:31 -06:00
Preetha Appan
35ab26658c Make docker driver logging less redundant 2019-01-16 10:16:57 -06:00
oleksii.shyman
cc98f282d4 Add support for docker runtimes
- docker fingerprint issues a docker api system info call to get the
  list of supported OCI runtimes.
- OCI runtimes are reported as comma separated list of names
- docker driver is aware of GPU runtime presence
- docker driver throws an error when user tries to run container with
  GPU, when GPU runtime is not present
- docker GPU runtime name is configurable
2019-01-15 11:34:47 -08:00
Danielle Tomlinson
f9a4594095 docker: Terminate dockerlogger
Previously, we did not attempt to stop Docker Logger processes until
DestroyTask, which means that under many circumstances, we will never
successfully close the plugin client.

This commit terminates the plugin process when `run` terminates, or when
`DestroyTask` is called.

Steps to repro:

```
$ nomad agent -dev
$ nomad init
$ nomad run example.nomad
$ nomad stop example
$ ps aux | grep nomad # See docker logger process running
# signal the dev agent
$ ps aux | grep nomad # See docker logger process running
```
2019-01-15 14:58:05 +01:00
Mahmood Ali
b5c20aa50b Track Basic Memory Usage as reported by cgroups
Track current memory usage, `memory.usage_in_bytes`, in addition to
`memory.max_usage_in_bytes` and friends.  This number is closer
to what Docker reports.

Related to https://github.com/hashicorp/nomad/issues/5165 .
2019-01-14 18:47:52 -05:00
Nick Ethier
fbf9a4c772 executor: implement streaming stats API
plugins/driver: update driver interface to support streaming stats

client/tr: use streaming stats api

TODO:
 * how to handle errors and closed channel during stats streaming
 * prevent tight loop if Stats(ctx) returns an error

drivers: update drivers TaskStats RPC to handle streaming results

executor: better error handling in stats rpc

docker: better control and error handling of stats rpc

driver: allow stats to return a recoverable error
2019-01-12 12:18:22 -05:00
Mahmood Ali
800a3522e3 drivers: re-export ResourceUsage structs
Re-export the ResourceUsage structs in drivers package to avoid drivers
depending directly on the internal client/structs package.

I attempted moving the structs to drivers, but that caused some import
cycles that were a bit hard to disentangle.  Alternatively, I added an
alias here that's sufficient for our purposes of avoiding external
drivers depending on internal packages, while allowing us to restructure
packages in future without breaking source compatibility.
2019-01-08 09:11:47 -05:00
Mahmood Ali
c0162fab35 move cstructs.DeviceNetwork to drivers pkg 2019-01-08 09:11:47 -05:00
Danielle Tomlinson
476e44b4e4 drivers: Implement InternalPluginDriver interface
This implements the InternalPluginDriver interface in each driver, and
calls the cancellation fn for their respective eventers.

This fixes a per task goroutine leak during test suite execution.
2019-01-08 13:49:31 +01:00
Nick Ethier
6951ca487d drivermanager: use allocID and task name to route task events 2018-12-18 23:01:51 -05:00
Danielle Tomlinson
ad4bac8d77 docker: Delete Task on Destroy
Currently the docker driver does not remove tasks from its state map
when destroying the task, which leads to issues when restarting tasks in
place, and leaks expired handles over time.
2018-12-18 15:53:31 +01:00
Mahmood Ali
1678a8499b drivers/docker: enforce volumes.enabled (#4983)
When the volumes.enabled flag is off in the Docker driver, disable all mounts of
paths outside the alloc dir.
2018-12-11 14:22:50 -05:00
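The rule in this commit (no mounts outside the alloc dir unless volumes are enabled) can be sketched as a path containment check. This is an illustration only, not the driver's code: `isInsideDir` and `validateMount` are hypothetical names, and the check is purely lexical, so it does not resolve symlinks.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isInsideDir reports whether path, after cleaning, is dir itself or lies
// below it. Lexical only: symlinks are not resolved (an assumption of this
// sketch).
func isInsideDir(dir, path string) bool {
	rel, err := filepath.Rel(filepath.Clean(dir), filepath.Clean(path))
	if err != nil {
		return false
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// validateMount rejects host paths outside allocDir when volumes are
// disabled, and allows everything when they are enabled.
func validateMount(volumesEnabled bool, allocDir, hostPath string) error {
	if volumesEnabled || isInsideDir(allocDir, hostPath) {
		return nil
	}
	return fmt.Errorf("volumes are not enabled; cannot mount host path %q outside alloc dir", hostPath)
}

func main() {
	for _, p := range []string{"/alloc/data", "/alloc/../etc", "/etc"} {
		fmt.Printf("%-16s -> %v\n", p, validateMount(false, "/alloc", p))
	}
}
```

Cleaning both paths before comparing them is what catches `..` traversal such as `/alloc/../etc`, which a plain string-prefix check would accept.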
Mahmood Ali
04af097008 driver/docker: honor plugin devices 2018-12-04 21:31:28 -05:00
Mahmood Ali
65e9b3b8be refactor device manipulation 2018-12-04 20:55:59 -05:00
Danielle Tomlinson
b8b4ce2fa6 Merge pull request #4936 from hashicorp/f-legacy-refactor
Refactor and repackage client/driver
2018-11-30 13:38:06 +01:00
Mahmood Ali
ef424132d0 Merge pull request #4926 from hashicorp/f-docker-image-ref
Use user provided image name to launch container
2018-11-30 07:27:39 -05:00
Danielle Tomlinson
03db4cf82d client: Rename drivers/shared/env => client/taskenv 2018-11-30 12:18:39 +01:00
Danielle Tomlinson
6756ffd052 drivers: Move client/drivers/env to drivers/shared/env
As part of deprecating legacy drivers, we're moving the env package to a
new drivers/shared tree, as it is used by the modern docker and rkt
driver packages, and is useful for 3rd party plugins.
2018-11-30 10:46:13 +01:00
Mahmood Ali
2310600d91 Use user provided image name to launch container
This allows the container to be tagged with a user friendly image name
(e.g. `redis:3.2`) rather than the image ID (e.g.
`sha256:87856cc39862cec77541d68382e4867d7ccb29a85a17221446c857ddaebca916`).

Useful for human debugging, as well as some debugging and image scanning
tools.

This risks two bad changes:
1. Discrepancy in image resolution between docker and Nomad's image
loader.
  * I checked the image creation paths in Nomad, and noticed that we
either pull the image or inspect the image with the user-provided
name.

2. A race in image tagging where the tag is modified between image
loading and container creation.
  * I, personally, don't think this case is cause for concern, as it is
analogous to the task running a bit later.  As long as the image is
still present, creating the container should be good.
2018-11-27 16:12:15 -05:00
Mahmood Ali
9c89ea4e08 Support docker bind mounts 2018-11-27 07:20:17 -05:00
Mahmood Ali
2aa034e174 Merge pull request #4908 from hashicorp/f-docker-opts-storageopt
Add support for docker storage options
2018-11-20 21:08:27 -05:00
Nick Ethier
fc53f5f635 docker: sync access to exit result within a handle 2018-11-20 20:41:32 -05:00
Michael Schurter
ace09b3a84 Apply suggestions from code review
Co-Authored-By: nickethier <ncethier@gmail.com>
2018-11-20 20:33:31 -05:00
Mahmood Ali
2c7bea7190 Add support for storage opt 2018-11-20 16:11:02 -05:00
Nick Ethier
249dbfffd2 docker: move config RPCs to config.go 2018-11-19 22:59:18 -05:00