Commit Graph

3635 Commits

Author SHA1 Message Date
Mahmood Ali
eb5ab38ae5 Regenerate Proto files (#5421)
Noticed that the protobuf files are out of sync with ones generated by 1.2.0 protoc go plugin.

The cause for these files seem to be related to release processes, e.g. [0.9.0-beta1 preperation](ecec3d38de (diff-da4da188ee496377d456025c2eab4e87)), and [0.9.0-beta3 preperation](b849d84f2f).

This restores the changes to that of the pinned protoc version and fails build if protobuf files are out of sync.  Sample failing Travis job is that of the first commit change: https://travis-ci.org/hashicorp/nomad/jobs/506285085
2019-03-14 10:56:27 -04:00
Michael Schurter
4bacdcd403 Merge pull request #5386 from hashicorp/b-logmon-stop
Fix task/logmon leak after crash
2019-03-12 15:23:02 -07:00
Michael Schurter
e9e1d788b5 client: cleanup and document context uses
Some of the context uses in TR hooks are useless (Killed during Stop
never seems meaningful).

None of the hooks are interruptable for graceful shutdown which is
unfortunate and probably needs fixing.
2019-03-12 15:03:54 -07:00
Mahmood Ali
086677b0df run TestAllocations_Stats in CI 2019-03-08 07:57:37 -05:00
Michael Schurter
7970b224b0 client: emit event and call exited hooks during cleanup
Builds upon earlier commit that cleans up restored handles of terminal
allocs by also emitting terminated events and calling exited hooks when
appropriate.
2019-03-05 15:12:02 -08:00
Michael Schurter
b5b2718e14 test: fix NewMemDB API change 2019-03-04 13:37:20 -08:00
Michael Schurter
54177ad672 logmon: drop reattach log level as its expected
Logged once per terminal task on agent restart.
2019-03-04 13:26:01 -08:00
Michael Schurter
8d409a6e39 client: test logmon cleanup
The test is sadly quite complicated and peeks into things (logmon's
reattach config) AR doesn't normally have access to.

However, I couldn't find another way of asserting logmon got cleaned up
without resorting to smaller unit tests. Smaller unit tests risk
re-implementing dependencies in an unrealistic way, so I opted for an
ugly integration test.
2019-03-04 13:15:15 -08:00
Preetha Appan
8a24660941 s/mananger/manager 2019-03-04 12:25:54 -06:00
Michael Schurter
db9daf6631 client: ensure task is cleaned up when terminal
This commit is a significant change. TR.Run is now always executed, even
for terminal allocations. This was changed to allow TR.Run to cleanup
(run stop hooks) if a handle was recovered.

This is intended to handle the case of Nomad receiving a
DesiredStatus=Stop allocation update, persisting it, but crashing before
stopping AR/TR.

The commit also renames task runner hook data as it was very easy to
accidently set state on Requests instead of Responses using the old
field names.
2019-03-01 14:00:23 -08:00
Michael Schurter
8ea6718ceb Remove generated files for 0.9.0-beta3 2019-02-26 10:34:08 -08:00
Michael Schurter
b849d84f2f Generate files for 0.9.0-beta3 release 2019-02-26 09:44:49 -08:00
Michael Schurter
d5a14d5606 Merge pull request #5352 from hashicorp/b-leaked-logmon
logmon fixes
2019-02-26 08:35:46 -08:00
Michael Schurter
432be05682 tests: move unix-specific test to its own file
Other logmon tests should be portable.
2019-02-26 07:56:44 -08:00
Mahmood Ali
8eb1c6fa29 tests: port some fingerprint tests from 0.8 (#5359)
Port some integration tests of driver fingerprinting.

Some tests (e.g. `TestFingerprintManager_Run_DriversInBlacklist`) have
been subsituted by more isolated tests in
`client/pluginmanager/drivermanager/manager_test.go`
2019-02-26 10:54:16 -05:00
Michael Schurter
05bae8d149 client: restart task on logmon failures
This code chooses to be conservative as opposed to optimal: when failing
to reattach to logmon simply return a recoverable error instead of
immediately trying to restart logmon.

The recoverable error will cause the task's restart policy to be
applied and a new logmon will be launched upon restart.

Trying to do the optimal approach of simply starting a new logmon
requires error string comparison and should be tested against a task
actively logging to assert the behavior (are writes blocked? dropped?).
2019-02-25 15:42:45 -08:00
Michael Schurter
cf98f8f70a client: test logmon_hook 2019-02-23 15:36:48 -08:00
Preetha Appan
ad58ba3e18 More alloc runner tests ported from 0.8.7 2019-02-22 17:58:06 -06:00
Mahmood Ali
e1e5053936 emit TaskRestartSignal event on vault restart
When Vault token expires and task is restarted, emit `TaskRestartSignal`
similar to v0.8.7
2019-02-22 15:56:14 -05:00
Mahmood Ali
8b7f66499f address review comments 2019-02-22 15:56:14 -05:00
Mahmood Ali
d80774fde0 tests: port TestTaskRunner_VaultManager_Signal
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L1427
2019-02-22 15:53:04 -05:00
Mahmood Ali
b8c74ff6ca tests: port TestTaskRunner_VaultManager_Restart
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L1352
2019-02-22 15:53:04 -05:00
Mahmood Ali
ce04bb7440 tests: port TestTaskRunner_UnregisterConsul_Retries
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L620
2019-02-22 15:53:04 -05:00
Mahmood Ali
d6a5a1c5a5 tests: port TestTaskRunner_Template_NewVaultToken
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L1275
2019-02-22 15:53:04 -05:00
Mahmood Ali
90ca1ab5a3 tests: port TestTaskRunner_Template_Artifact
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L1195
2019-02-22 15:52:59 -05:00
Mahmood Ali
4c30b03879 tests: port TestAllocRunner_RetryArtifact
Port TestAllocRunner_RetryArtifact from https://github.com/hashicorp/nomad/blob/v0.8.7/client/alloc_runner_test.go#L610-L672

I changed the test name because it doesn't actually test that artifact
hooks is retried
2019-02-22 15:50:39 -05:00
Mahmood Ali
69906bade4 tests: port TestAllocRunner_MoveAllocDir test 2019-02-22 15:50:39 -05:00
Michael Schurter
55cbbded6c logmon: fix reattach configuration
There were multiple bugs here:

1. Reattach unmarshalling always returned an error because you can't
   unmarshal into a nil pointer.
2. The hook data wasn't being saved because it was put on the request
   struct, not the response struct.
3. The plugin configuration should only have reattach *or* a command
   set. Not both.
4. Setting Done=true meant the hook was never re-run on agent restart so
   reattaching was never attempted.
2019-02-21 15:32:18 -08:00
Michael Schurter
4119767be8 fingerprint: improve initial fingerpint message
The initial fingerprint message is actually fairly useful, so I bumped
it to Debug and fixed the output formatting.
2019-02-21 15:32:18 -08:00
Michael Schurter
cf66e25e57 client: restart on recoverable StartTask errors
Fixes restarting on recoverable errors from StartTask.

Ports TestTaskRunner_Run_RecoverableStartError from 0.8 which discovered
the bug.
2019-02-21 15:30:49 -08:00
Michael Schurter
414532adab test: port TestTaskRunner_RestartSignalTask_NotRunning from 0.8 2019-02-21 15:30:49 -08:00
Michael Schurter
d4a17ae71f test: port TestTaskRunner_DriverNetwork from 0.8 2019-02-21 15:30:49 -08:00
Michael Schurter
234f644b35 Merge pull request #5322 from hashicorp/b-artifact-retries
Fix regression by restarting on artifact download errors
2019-02-21 15:28:51 -08:00
Mahmood Ali
55ee03a2ad Merge pull request #5341 from hashicorp/ci-windows-docker
Run Docker tests in Windows AppVeyor CI
2019-02-21 13:17:33 -05:00
Michael Schurter
159266ccec tests: port TestAllocRunner_Destroy from 0.8
Also add destroy(ar) helper to fix a bunch of shutdown races in AR
tests.
2019-02-20 12:35:09 -08:00
Michael Schurter
1acd4ca744 client: don't redownload completed artifacts on retries
Track the download status of each artifact independently so that if only
one of many artifacts fails to download, completed artifacts aren't
downloaded again.
2019-02-20 08:45:12 -08:00
Michael Schurter
c51a54cfee client: artifact errors are retry-able
0.9.0beta2 contains a regression where artifact download errors would
not cause a task restart and instead immediately fail the task.

This restores the pre-0.9 behavior of retrying all artifact errors and
adds missing tests.
2019-02-20 07:21:27 -08:00
Michael Schurter
83979252cd tests: add new task runner test helper
Adds a new helper and removes a duplicated test.
2019-02-20 07:21:27 -08:00
Mahmood Ali
2af30fb441 tests: expect Docker on AppVeyor
Prepare to run docker on AppVeyor Windows environment
2019-02-20 07:41:47 -05:00
Michael Schurter
7b8ec414a3 client: fix setting alloc unhealthy at deadline
During the 0.9 client refactor the code to fail a deployment when the
deadline was reached was broken. This restores and tests that behavior.
2019-02-19 07:44:14 -08:00
Mahmood Ali
eb8b19ec82 test: improve readability of duration
Co-Authored-By: schmichael <michael.schurter@gmail.com>
2019-02-14 08:12:06 -08:00
Mahmood Ali
a96cc97389 test: improve failure message
Co-Authored-By: schmichael <michael.schurter@gmail.com>
2019-02-14 08:11:37 -08:00
Michael Schurter
fa9537f6e9 tests: port TestTaskRunner_Download_List from 0.8 2019-02-12 15:48:04 -08:00
Michael Schurter
cfbe7520e8 consul: fix task deregistration hook
Broke ShutdownDelay but the test was timing dependent so it just
appeared flaky. Made the test slower so that it should never incorrectly
pass.
2019-02-12 15:36:02 -08:00
Michael Schurter
f2506e4d29 tests: port TaskRunner_DeriveToken tests from 0.8 2019-02-12 15:36:02 -08:00
Michael Schurter
b41308f16a tests: port TestTaskRunner_BlockForVault from 0.8
Also fix race conditions in the mock vault client.
2019-02-12 13:46:09 -08:00
Michael Schurter
1d17fbc681 simplify hcl2 parsing helper
No need to pass in the entire eval context
2019-02-04 11:07:57 -08:00
Michael Schurter
645c8c41ea client: log when allocs have been processed
Will hopefully help us catch deadlocks/livelocks/slowdowns in the
add/remove allocs pipeline which should be fast.
2019-02-04 11:07:57 -08:00
Michael Schurter
ff63b1aefe Remove 0.9.0-beta2 generated files 2019-02-01 08:28:44 -08:00
Alex Dadgar
29a5d242e6 Generate files for 0.9.0-beta2 2019-01-30 13:31:50 -08:00