13474 Commits

Author SHA1 Message Date
Alex Dadgar
aa59ea6ac7 fix iops bug and increase test matrix coverage 2018-12-11 15:28:21 -08:00
Mahmood Ali
926428fe0f Merge pull request #4984 from hashicorp/b-client-update-driver
client: update driver info on new driver fingerprint
2018-12-11 18:01:03 -05:00
Mahmood Ali
5f11000714 Merge pull request #4985 from hashicorp/test-with-xenial
ci: Test with Ubuntu 16.04 in TravisCI
2018-12-11 18:00:39 -05:00
Mahmood Ali
51707199a6 Merge pull request #4975 from hashicorp/fix-master-20181209
Some test fixes and remedies
2018-12-11 18:00:21 -05:00
Mahmood Ali
e716c451a9 tests: tag image explicitly 2018-12-11 17:59:45 -05:00
Alex Dadgar
ce01603e04 changelog 2018-12-11 12:52:45 -08:00
Alex Dadgar
f42c060d35 Merge pull request #4970 from hashicorp/f-no-iops
Deprecate IOPS
2018-12-11 12:51:22 -08:00
Mahmood Ali
d0215f4230 ci: install lxc-templates explicitly
The LXC package on Ubuntu 16.04 doesn't depend on lxc-templates, but we
require it in our tests.
2018-12-11 15:49:11 -05:00
Mahmood Ali
8f2454029a tests: skip checking rdma cgroup
The rdma cgroup was added in recent kernels, and libcontainer/docker
don't isolate it by default.
2018-12-11 15:49:11 -05:00
Mahmood Ali
338683a4db ci: use Ubuntu 16.04 (Xenial) in TravisCI 2018-12-11 15:49:11 -05:00
Mahmood Ali
cae36e49a6 client: update driver info on new fingerprint
Fixes a bug where a driver's health and attributes are never updated from
their initial status.  If a driver started unhealthy, it might never
become healthy.
2018-12-11 14:25:10 -05:00
Mahmood Ali
1678a8499b drivers/docker: enforce volumes.enabled (#4983)
When the volumes.enabled flag is off in the Docker driver, disable all
mounts of paths outside the alloc dir.
2018-12-11 14:22:50 -05:00
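The rule described in this commit can be sketched as a path check. This is a minimal illustration, not the Docker driver's actual code: `validateMount` and `isInside` are hypothetical names, the paths are made up, and symlinks are not resolved.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isInside reports whether hostPath is allocDir itself or a path below it.
// It compares cleaned paths only; symlinks are not resolved in this sketch.
func isInside(allocDir, hostPath string) bool {
	rel, err := filepath.Rel(allocDir, hostPath)
	if err != nil {
		return false
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// validateMount (hypothetical) rejects host paths outside the alloc dir
// when volumes are disabled, and allows everything when they are enabled.
func validateMount(volumesEnabled bool, allocDir, hostPath string) error {
	if volumesEnabled || isInside(allocDir, hostPath) {
		return nil
	}
	return fmt.Errorf("volumes are not enabled; cannot mount host path %q outside alloc dir", hostPath)
}

func main() {
	allocDir := "/var/nomad/alloc/abc123" // assumed example path
	for _, p := range []string{allocDir + "/data", "/etc", allocDir + "/../other"} {
		fmt.Printf("%s -> %v\n", p, validateMount(false, allocDir, p))
	}
}
```

Using `filepath.Rel` rather than a plain string prefix test avoids treating a sibling such as `/var/nomad/alloc/abc123-other` as being inside the alloc dir.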
Danielle Tomlinson
0a2fba0d0c Merge pull request #4963 from hashicorp/dani/f-preempt-alloc-wait
client: Wait for preemptions to terminate
2018-12-11 18:06:34 +01:00
Danielle Tomlinson
971586d73c client: Style: use fluent style for building loggers 2018-12-11 18:03:45 +01:00
Danielle Tomlinson
cbdc8f4c32 client: Correctly pass a noop PrevAllocMigrator when restoring 2018-12-11 15:46:58 +01:00
Mahmood Ali
c02dbc7f67 add a note about busybox license 2018-12-11 09:35:26 -05:00
Mahmood Ali
8a752066f8 tests: no need for buffer channel 2018-12-11 09:35:26 -05:00
Mahmood Ali
06a4b4add2 tests: prevent indefinite blocking in some tests
Noticed a few places where tests seem to block indefinitely and panic
after the test run reaches the test package timeout.

I intend to follow up with the proper fix later, but timing out is much
better than indefinitely blocking.
2018-12-11 09:35:26 -05:00
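The "time out rather than block" approach from this commit can be sketched with a `select` on a timer. This is a generic illustration, not the code from the commit; `waitOrTimeout`, `simulate`, and `errTimedOut` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errTimedOut = errors.New("timed out waiting for task")

// waitOrTimeout returns nil once done is closed, or errTimedOut if that
// does not happen within timeout. A bare <-done would block forever.
func waitOrTimeout(done <-chan struct{}, timeout time.Duration) error {
	select {
	case <-done:
		return nil
	case <-time.After(timeout):
		return errTimedOut
	}
}

// simulate starts a fake task that finishes after taskMillis and waits
// for it for at most timeoutMillis.
func simulate(taskMillis, timeoutMillis int) error {
	done := make(chan struct{})
	go func() {
		time.Sleep(time.Duration(taskMillis) * time.Millisecond)
		close(done)
	}()
	return waitOrTimeout(done, time.Duration(timeoutMillis)*time.Millisecond)
}

func main() {
	fmt.Println("fast task:", simulate(10, 1000))
	fmt.Println("stuck task:", simulate(1000, 10))
}
```

In a test, the timeout branch would typically fail that one test with a clear message instead of letting the whole package hit its timeout.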
Mahmood Ali
d31e52b1f6 tests: update stop/kill tests with new pattern
Update rawexec and rkt stop/kill tests with the patterns introduced in
7a49e9b68e.  This implementation should be more resilient to a
discrepancy between a task stopping and the task being marked as exited.
2018-12-11 09:35:26 -05:00
Mahmood Ali
da070a58b7 test: fix TestFingerprintManager_Run_Combination
Let's use a fingerprinter that doesn't have values prepopulated in test
fixtures.
2018-12-11 09:35:26 -05:00
Mahmood Ali
d6e708fe2d tests: setup libcontainer rootfs
Use a statically linked busybox binary to set up a basic rootfs for
testing, symlinking it to provide the basic commands used in tests.

I considered using a proper rootfs tarball, but the overhead of managing
a tarfile and expanding it seems significant enough that I went with this
implementation.
2018-12-11 09:35:26 -05:00
Mahmood Ali
744aab5751 tests: Lower package runtime
Lowering the runtime here to pre 7ca535aa90 expectations.

The longest package at the time, `client/driver`, shrank significantly,
and now the longest packages take less than 5 minutes.

We do have some long running timed out projects due to a stuck shutdown,
but in completed jobs (though they failed), the longest packages took
less than 5 minutes.  The longest running packages in
https://travis-ci.org/hashicorp/nomad/jobs/464640776 were:

```
FAIL  github.com/hashicorp/nomad/nomad                                   268.089s
ok    github.com/hashicorp/nomad/drivers/docker                          203.903s  coverage:  68.8%   of  statements
ok    github.com/hashicorp/nomad/drivers/rkt                             132.104s  coverage:  65.0%   of  statements
ok    github.com/hashicorp/nomad/api                                     123.193s  coverage:  62.9%   of  statements
ok    github.com/hashicorp/nomad/command/agent                           74.657s   coverage:  72.3%   of  statements
ok    github.com/hashicorp/nomad/command                                 63.592s   coverage:  42.7%   of  statements
```
2018-12-11 09:35:26 -05:00
Danielle Tomlinson
419743f165 allocrunner: Test alloc runners should include a noop migrator 2018-12-11 13:12:35 +01:00
Danielle Tomlinson
d9e9265e8a allocwatcher: Cleanup new migrator/watcher interface 2018-12-11 13:12:35 +01:00
Danielle Tomlinson
d44d4b57de client: Unify handling of previous and preempted allocs 2018-12-11 13:12:35 +01:00
Michael Schurter
5306b1e953 Merge pull request #4953 from hashicorp/b-script-context-wrapper
consul: add ScriptExecutor context wrapper
2018-12-10 17:22:53 -08:00
Michael Schurter
961f7fae15 Merge pull request #4952 from hashicorp/b-script-context
consul: fix script checks exiting after 1 run
2018-12-10 17:22:15 -08:00
Danielle Tomlinson
a4cf83d00c client: Wait for preempted allocs to terminate
When starting an allocation that is preempting other allocs, we create a
new group allocation watcher, and then wait for the allocations to
terminate in the allocation PreRun hooks.

If there are no preempted allocations, then we simply provide a
NoopAllocWatcher.
2018-12-11 00:59:18 +01:00
Danielle Tomlinson
c6d1981955 allocwatcher: Add Group AllocWatcher
The Group Alloc watcher is an implementation of a PrevAllocWatcher that
can wait for multiple previous allocs before terminating.

This is to be used when running an allocation that is preempting upstream
allocations, and thus only supports being run with a local alloc watcher.

It also currently requires all of its child watchers to correctly handle
context cancellation. Should this be a problem, it should be fairly easy
to implement a replacement using channels rather than a waitgroup.

It obeys the PrevAllocWatcher interface for convenience, but it may be
better to extract Migration capabilities into a separate interface for
greater clarity.
2018-12-11 00:58:27 +01:00
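The waitgroup-based design described in this commit can be sketched roughly as follows. This is a reduced illustration under stated assumptions, not Nomad's PrevAllocWatcher interface: `waiter`, `sleepWaiter`, `groupWaiter`, and `waitFor` are hypothetical names, and only the waiting behaviour is modelled.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// waiter is a hypothetical, reduced stand-in for a previous-alloc watcher.
type waiter interface {
	Wait(ctx context.Context) error
}

// sleepWaiter simulates an allocation that terminates after d and that
// honours context cancellation.
type sleepWaiter struct{ d time.Duration }

func (s sleepWaiter) Wait(ctx context.Context) error {
	select {
	case <-time.After(s.d):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// groupWaiter waits for every child and returns the first recorded error.
type groupWaiter struct{ children []waiter }

func (g groupWaiter) Wait(ctx context.Context) error {
	var wg sync.WaitGroup
	errs := make(chan error, len(g.children))
	for _, c := range g.children {
		wg.Add(1)
		go func(c waiter) {
			defer wg.Done()
			if err := c.Wait(ctx); err != nil {
				errs <- err
			}
		}(c)
	}
	// wg.Wait has no cancellation of its own: a child that ignores ctx
	// would keep the whole group blocked.
	wg.Wait()
	close(errs)
	return <-errs // nil when no child reported an error
}

// waitFor builds a group from child durations (ms) and waits with a deadline.
func waitFor(childMillis []int, timeoutMillis int) error {
	var g groupWaiter
	for _, ms := range childMillis {
		g.children = append(g.children, sleepWaiter{time.Duration(ms) * time.Millisecond})
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMillis)*time.Millisecond)
	defer cancel()
	return g.Wait(ctx)
}

func main() {
	fmt.Println("all children finish:", waitFor([]int{5, 10}, 1000))
	fmt.Println("deadline hits first:", waitFor([]int{5, 1000}, 20))
}
```

The comment on `wg.Wait` is the limitation the commit message mentions: the group can only be cancelled if every child returns when its context is done.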
Alex Dadgar
aeb9995daa typo 2018-12-10 15:35:26 -08:00
Alex Dadgar
ff1f007f91 merge 087 and 090 changelog 2018-12-10 15:34:21 -08:00
Mahmood Ali
070ed8b654 fix dtestutil.NewDriverHarness ref 2018-12-08 09:58:23 -05:00
Mahmood Ali
2fb5e35012 Merge pull request #4950 from hashicorp/b-exc-libcontainer-kill
executor: kill all container processes
2018-12-08 09:52:42 -05:00
Nick Ethier
3fe18c2f2d Merge pull request #4973 from emate/recover-filerotator-from-io-errors
Recover from any possible io error when invoking Write on FileRotator
2018-12-08 00:05:42 -05:00
Alex Dadgar
9a220c8d40 Merge pull request #4965 from hashicorp/b-gc-running
Don't GC running but desired stop allocations
2018-12-07 13:36:33 -08:00
Marcin Matlaszek
b91fa87d31 Recover from any possible io error when invoking Write on FileRotator
As of now, FileRotator uses bufio.Writer under the hood to write data to
the configured output file. bufio saves any io error it encounters into
its `err` field and never resets it automatically, so every later
operation such as `Write` or `Flush` becomes a no-op that returns the very
same saved error (e.g. out of disk space), even after the problem is fixed
(e.g. disk space is available again).

That automatically means that FileRotator will stop writing any logs,
reporting the same error over and over again, even if it's no longer
valid.

This PR fixes it by resetting the bufio Writer, which resets any errors
and tries to write requested data.
2018-12-07 18:22:29 +01:00
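The sticky-error behaviour and the reset can be reproduced with the standard library alone. This is a minimal sketch, not the FileRotator code: `flakyWriter`, `writeWithRecovery`, and `runScenario` are hypothetical names standing in for a log file on a disk that is full and later has space again.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
)

var errDiskFull = errors.New("no space left on device")

// flakyWriter fails every write while full is true.
type flakyWriter struct {
	full    bool
	written []byte
}

func (f *flakyWriter) Write(p []byte) (int, error) {
	if f.full {
		return 0, errDiskFull
	}
	f.written = append(f.written, p...)
	return len(p), nil
}

// writeWithRecovery writes p and flushes. On failure it resets bw, which
// drops buffered data and clears the saved error, then retries once.
func writeWithRecovery(bw *bufio.Writer, dst io.Writer, p []byte) error {
	if _, err := bw.Write(p); err == nil {
		if err = bw.Flush(); err == nil {
			return nil
		}
	}
	bw.Reset(dst)
	if _, err := bw.Write(p); err != nil {
		return err
	}
	return bw.Flush()
}

// runScenario fills the disk, frees it, and reports what each step returned.
func runScenario() (flushErr, stickyErr, recoverErr error, written string) {
	disk := &flakyWriter{full: true}
	bw := bufio.NewWriter(disk)

	bw.WriteString("first\n")
	flushErr = bw.Flush() // fails: disk is full

	disk.full = false
	_, stickyErr = bw.WriteString("second\n") // still fails: error is saved

	recoverErr = writeWithRecovery(bw, disk, []byte("third\n"))
	return flushErr, stickyErr, recoverErr, string(disk.written)
}

func main() {
	flushErr, stickyErr, recoverErr, written := runScenario()
	fmt.Println("flush while full:", flushErr)
	fmt.Println("write after space is freed:", stickyErr)
	fmt.Println("write with reset:", recoverErr)
	fmt.Printf("on disk: %q\n", written)
}
```

Note that `Reset` discards whatever was still buffered, so in this sketch the lines written while the disk was full are lost; only the retried data reaches the destination.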
Mahmood Ali
c74e2d0243 Merge pull request #4933 from hashicorp/f-mount-device
Mount Devices in container based drivers
2018-12-07 10:32:03 -05:00
Mahmood Ali
379e79ceff Vendor libcontainer/devices 2018-12-07 09:13:27 -05:00
Alex Dadgar
f555dc3f67 Warn if IOPS is being used 2018-12-06 16:17:09 -08:00
Alex Dadgar
0953d913ed Deprecate IOPS
IOPS has been modelled as a resource since Nomad 0.1 but has never
actually been detected and there is no plan in the short term to add
detection. This is because IOPS is too simplistic a unit to define
the performance requirements from the underlying storage system. In its
current state it adds unnecessary confusion and can be removed without
impacting any users. This PR leaves IOPS defined at the jobspec parsing
level and in the api/ resources since these are the two public uses of
the field. These should be considered deprecated and only exist to allow
users to stop using them during the Nomad 0.9.x release. In the future,
there should be no expectation that the field will exist.
2018-12-06 15:09:26 -08:00
Danielle Tomlinson
e668e55fa5 Merge pull request #4960 from hashicorp/dani/b-gc-tests
Re-enable Client GC tests
2018-12-06 23:18:36 +01:00
Mahmood Ali
aef1c9dc96 Merge pull request #4955 from hashicorp/fix-docker-tests-20181203
Fix docker driver tests
2018-12-06 16:41:33 -05:00
Danielle Tomlinson
f6e2687f5b gc: Fix maxallocs integration test 2018-12-06 21:50:50 +01:00
Mahmood Ali
b78130eaaf Use absolute path in example device plugin
deviceDir is used for specifying mount/device host paths, and those
should be absolute paths.
2018-12-06 15:46:35 -05:00
Mahmood Ali
8c5e2e39a4 driver/rkt: mount plugin devices 2018-12-06 15:46:35 -05:00
Mahmood Ali
bfa2854c8b driver/lxc: mount plugin devices
Also, LXC requires target paths to be relative.  Container paths in LXC
binds should never be absolute paths, so we strip any leading `/`,
even if a user sets one.
2018-12-06 15:46:35 -05:00
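The path normalisation described in this commit amounts to trimming leading slashes from the container-side path. A minimal sketch, assuming a hypothetical helper named `relativeTarget` (not the LXC driver's actual function) and made-up example paths:

```go
package main

import (
	"fmt"
	"strings"
)

// relativeTarget (hypothetical) turns a container path into the relative
// form LXC binds expect by removing every leading "/".
func relativeTarget(containerPath string) string {
	return strings.TrimLeft(containerPath, "/")
}

func main() {
	for _, p := range []string{"/dev/nvidia0", "dev/nvidia0", "//mnt/data"} {
		fmt.Printf("%q -> %q\n", p, relativeTarget(p))
	}
}
```

Paths that are already relative pass through unchanged, so the helper is safe to apply whether or not the user supplied a leading `/`.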
Mahmood Ali
a0a5847315 fixup: add missed docker utils test 2018-12-06 15:46:35 -05:00
Mahmood Ali
5fd8d3fe68 tests: ensure image is loaded as test setup 2018-12-06 15:36:43 -05:00
Michael Lange
12dc187779 Merge pull request #4967 from hashicorp/b-ui-stat-charts-can-escape-canvas
UI: Keep line charts in their canvases at all times
2018-12-06 10:56:37 -08:00
Danielle Tomlinson
f3c057d7e0 client/gc: Replace GC integration test with unit
The previous integration test was broken during the client refactor, and
it seems to be some sort of race with state updating.

I'm going to try and construct a replacement test as part of work on
performance, but for now, the underlying behaviour is still being
tested.
2018-12-06 12:28:23 +01:00