Commit Graph

23 Commits

Author SHA1 Message Date
Mahmood Ali
943854469d driver: allow disabling log collection
Operators commonly have docker logs aggregated using various tools and
don't need nomad to manage their docker logs.  Worse, Nomad uses a
somewhat heavy docker api call to collect them and it seems to cause
problems when a client runs hundreds of log collections.

Here we add a knob to disable log aggregation completely for nomad.
When log collection is disabled, we avoid running logmon and
docker_logger for the docker tasks in this implementation.

The downside here is once disabled, `nomad logs ...` commands and API
no longer return logs and operators must corrolate alloc-ids with their
aggregated log info.

This is meant as a stop gap measure.  Ideally, we'd follow up with at
least two changes:

First, we should optimize behavior when we can such that operators don't
need to disable docker log collection.  Potentially by reverting to
using pre-0.9 syslog aggregation in linux environments, though with
different trade-offs.

Second, when/if logs are disabled, nomad logs endpoints should lookup
docker logs api on demand.  This ensures that the cost of log collection
is paid sparingly.
2019-12-08 14:15:03 -05:00
Chris Baker
3a96683131 docker: DestroyTask was not cleaning up Docker images because it was erroring early due to an attempt to inspect an image that had already been removed 2019-06-03 19:04:27 +00:00
Alex Dadgar
4acaf27ef5 Don't fall through 2019-01-28 09:53:19 -08:00
Alex Dadgar
ec5712345f comment 2019-01-28 09:47:53 -08:00
Alex Dadgar
65aa214687 Fix killing non-existant container with a kill timeout 2019-01-25 16:21:51 -08:00
Alex Dadgar
2d23f4a038 move reattach config 2019-01-22 15:11:58 -08:00
Danielle Tomlinson
f9a4594095 docker: Terminate dockerlogger
Previously, we did not attempt to stop Docker Logger processes until
DestroyTask, which means that under many circumstances, we will never
successfully close the plugin client.

This commit terminates the plugin process when `run` terminates, or when
`DestroyTask` is called.

Steps to repro:

```
$ nomad agent -dev
$ nomad init
$ nomad run example.nomad
$ nomad stop example
$ ps aux | grep nomad # See docker logger process running
$ signal the dev agent
$ ps aux | grep nomad # See docker logger process running
```
2019-01-15 14:58:05 +01:00
Nick Ethier
f6af1d4d04 docker: add test for stats collection 2019-01-12 12:18:22 -05:00
Nick Ethier
fbf9a4c772 executor: implement streaming stats API
plugins/driver: update driver interface to support streaming stats

client/tr: use streaming stats api

TODO:
 * how to handle errors and closed channel during stats streaming
 * prevent tight loop if Stats(ctx) returns an error

drivers: update drivers TaskStats RPC to handle streaming results

executor: better error handling in stats rpc

docker: better control and error handling of stats rpc

driver: allow stats to return a recoverable error
2019-01-12 12:18:22 -05:00
Mahmood Ali
800a3522e3 drivers: re-export ResourceUsage structs
Re-export the ResourceUsage structs in drivers package to avoid drivers
directly depending on the internal client/structs package directly.

I attempted moving the structs to drivers, but that caused some import
cycles that was a bit hard to disentagle.  Alternatively, I added an
alias here that's sufficient for our purposes of avoiding external
drivers depend on internal packages, while allowing us to restructure
packages in future without breaking source compatibility.
2019-01-08 09:11:47 -05:00
Mahmood Ali
c0162fab35 move cstructs.DeviceNetwork to drivers pkg 2019-01-08 09:11:47 -05:00
Mahmood Ali
f248fefdbf driver/docker: stopping a dead container not error 2018-12-15 15:03:56 -05:00
Danielle Tomlinson
756325bcbd client: Merge driver/shared/structs and client/structs 2018-11-30 10:56:45 +01:00
Danielle Tomlinson
23197ec6b4 drivers: Create drivers/shared/structs
This creates a drivers/shared/structs package and moves the buffer size
checks into it.
2018-11-30 10:46:13 +01:00
Mahmood Ali
ef52080b0a Formatting and typo fixes 2018-11-25 11:53:21 -05:00
Nick Ethier
fc53f5f635 docker: sync access to exit result within a handle 2018-11-20 20:41:32 -05:00
Nick Ethier
0462c8a7f8 docker: remove container pointer from task handle 2018-11-19 22:59:18 -05:00
Nick Ethier
7d09ae57cc docker: remove call to global metrics instance 2018-11-19 22:59:17 -05:00
Nick Ethier
0c62b9adf4 docker: moved fingerprint code to it's own file 2018-11-19 22:59:17 -05:00
Nick Ethier
37ed75502e docker: move recoverable error proto to shared structs 2018-11-19 22:59:16 -05:00
Nick Ethier
396f6ab1fb docker: implement recover task logic 2018-11-19 22:59:16 -05:00
Nick Ethier
5c777a37de drivers/docker: more work porting tests from old driver plugin 2018-11-19 22:59:16 -05:00
Nick Ethier
98b295d617 docker: started work on porting docker driver to new plugin framework 2018-11-19 22:59:15 -05:00