Commit Graph

217 Commits

Author SHA1 Message Date
Michael Schurter
e9e1d788b5 client: cleanup and document context uses
Some of the context uses in TR hooks are useless (Killed during Stop
never seems meaningful).

None of the hooks are interruptable for graceful shutdown which is
unfortunate and probably needs fixing.
2019-03-12 15:03:54 -07:00
Michael Schurter
7970b224b0 client: emit event and call exited hooks during cleanup
Builds upon earlier commit that cleans up restored handles of terminal
allocs by also emitting terminated events and calling exited hooks when
appropriate.
2019-03-05 15:12:02 -08:00
Michael Schurter
54177ad672 logmon: drop reattach log level as its expected
Logged once per terminal task on agent restart.
2019-03-04 13:26:01 -08:00
Michael Schurter
8d409a6e39 client: test logmon cleanup
The test is sadly quite complicated and peeks into things (logmon's
reattach config) AR doesn't normally have access to.

However, I couldn't find another way of asserting logmon got cleaned up
without resorting to smaller unit tests. Smaller unit tests risk
re-implementing dependencies in an unrealistic way, so I opted for an
ugly integration test.
2019-03-04 13:15:15 -08:00
Michael Schurter
db9daf6631 client: ensure task is cleaned up when terminal
This commit is a significant change. TR.Run is now always executed, even
for terminal allocations. This was changed to allow TR.Run to cleanup
(run stop hooks) if a handle was recovered.

This is intended to handle the case of Nomad receiving a
DesiredStatus=Stop allocation update, persisting it, but crashing before
stopping AR/TR.

The commit also renames task runner hook data as it was very easy to
accidently set state on Requests instead of Responses using the old
field names.
2019-03-01 14:00:23 -08:00
Michael Schurter
d5a14d5606 Merge pull request #5352 from hashicorp/b-leaked-logmon
logmon fixes
2019-02-26 08:35:46 -08:00
Michael Schurter
432be05682 tests: move unix-specific test to its own file
Other logmon tests should be portable.
2019-02-26 07:56:44 -08:00
Michael Schurter
05bae8d149 client: restart task on logmon failures
This code chooses to be conservative as opposed to optimal: when failing
to reattach to logmon simply return a recoverable error instead of
immediately trying to restart logmon.

The recoverable error will cause the task's restart policy to be
applied and a new logmon will be launched upon restart.

Trying to do the optimal approach of simply starting a new logmon
requires error string comparison and should be tested against a task
actively logging to assert the behavior (are writes blocked? dropped?).
2019-02-25 15:42:45 -08:00
Michael Schurter
cf98f8f70a client: test logmon_hook 2019-02-23 15:36:48 -08:00
Preetha Appan
ad58ba3e18 More alloc runner tests ported from 0.8.7 2019-02-22 17:58:06 -06:00
Mahmood Ali
e1e5053936 emit TaskRestartSignal event on vault restart
When Vault token expires and task is restarted, emit `TaskRestartSignal`
similar to v0.8.7
2019-02-22 15:56:14 -05:00
Mahmood Ali
8b7f66499f address review comments 2019-02-22 15:56:14 -05:00
Mahmood Ali
d80774fde0 tests: port TestTaskRunner_VaultManager_Signal
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L1427
2019-02-22 15:53:04 -05:00
Mahmood Ali
b8c74ff6ca tests: port TestTaskRunner_VaultManager_Restart
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L1352
2019-02-22 15:53:04 -05:00
Mahmood Ali
ce04bb7440 tests: port TestTaskRunner_UnregisterConsul_Retries
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L620
2019-02-22 15:53:04 -05:00
Mahmood Ali
d6a5a1c5a5 tests: port TestTaskRunner_Template_NewVaultToken
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L1275
2019-02-22 15:53:04 -05:00
Mahmood Ali
90ca1ab5a3 tests: port TestTaskRunner_Template_Artifact
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L1195
2019-02-22 15:52:59 -05:00
Mahmood Ali
4c30b03879 tests: port TestAllocRunner_RetryArtifact
Port TestAllocRunner_RetryArtifact from https://github.com/hashicorp/nomad/blob/v0.8.7/client/alloc_runner_test.go#L610-L672

I changed the test name because it doesn't actually test that artifact
hooks is retried
2019-02-22 15:50:39 -05:00
Mahmood Ali
69906bade4 tests: port TestAllocRunner_MoveAllocDir test 2019-02-22 15:50:39 -05:00
Michael Schurter
55cbbded6c logmon: fix reattach configuration
There were multiple bugs here:

1. Reattach unmarshalling always returned an error because you can't
   unmarshal into a nil pointer.
2. The hook data wasn't being saved because it was put on the request
   struct, not the response struct.
3. The plugin configuration should only have reattach *or* a command
   set. Not both.
4. Setting Done=true meant the hook was never re-run on agent restart so
   reattaching was never attempted.
2019-02-21 15:32:18 -08:00
Michael Schurter
cf66e25e57 client: restart on recoverable StartTask errors
Fixes restarting on recoverable errors from StartTask.

Ports TestTaskRunner_Run_RecoverableStartError from 0.8 which discovered
the bug.
2019-02-21 15:30:49 -08:00
Michael Schurter
414532adab test: port TestTaskRunner_RestartSignalTask_NotRunning from 0.8 2019-02-21 15:30:49 -08:00
Michael Schurter
d4a17ae71f test: port TestTaskRunner_DriverNetwork from 0.8 2019-02-21 15:30:49 -08:00
Michael Schurter
234f644b35 Merge pull request #5322 from hashicorp/b-artifact-retries
Fix regression by restarting on artifact download errors
2019-02-21 15:28:51 -08:00
Michael Schurter
159266ccec tests: port TestAllocRunner_Destroy from 0.8
Also add destroy(ar) helper to fix a bunch of shutdown races in AR
tests.
2019-02-20 12:35:09 -08:00
Michael Schurter
1acd4ca744 client: don't redownload completed artifacts on retries
Track the download status of each artifact independently so that if only
one of many artifacts fails to download, completed artifacts aren't
downloaded again.
2019-02-20 08:45:12 -08:00
Michael Schurter
c51a54cfee client: artifact errors are retry-able
0.9.0beta2 contains a regression where artifact download errors would
not cause a task restart and instead immediately fail the task.

This restores the pre-0.9 behavior of retrying all artifact errors and
adds missing tests.
2019-02-20 07:21:27 -08:00
Michael Schurter
83979252cd tests: add new task runner test helper
Adds a new helper and removes a duplicated test.
2019-02-20 07:21:27 -08:00
Michael Schurter
7b8ec414a3 client: fix setting alloc unhealthy at deadline
During the 0.9 client refactor the code to fail a deployment when the
deadline was reached was broken. This restores and tests that behavior.
2019-02-19 07:44:14 -08:00
Mahmood Ali
eb8b19ec82 test: improve readability of duration
Co-Authored-By: schmichael <michael.schurter@gmail.com>
2019-02-14 08:12:06 -08:00
Mahmood Ali
a96cc97389 test: improve failure message
Co-Authored-By: schmichael <michael.schurter@gmail.com>
2019-02-14 08:11:37 -08:00
Michael Schurter
fa9537f6e9 tests: port TestTaskRunner_Download_List from 0.8 2019-02-12 15:48:04 -08:00
Michael Schurter
cfbe7520e8 consul: fix task deregistration hook
Broke ShutdownDelay but the test was timing dependent so it just
appeared flaky. Made the test slower so that it should never incorrectly
pass.
2019-02-12 15:36:02 -08:00
Michael Schurter
f2506e4d29 tests: port TaskRunner_DeriveToken tests from 0.8 2019-02-12 15:36:02 -08:00
Michael Schurter
b41308f16a tests: port TestTaskRunner_BlockForVault from 0.8
Also fix race conditions in the mock vault client.
2019-02-12 13:46:09 -08:00
Michael Schurter
1d17fbc681 simplify hcl2 parsing helper
No need to pass in the entire eval context
2019-02-04 11:07:57 -08:00
Alex Dadgar
fe6d7df6c4 Fix usage of fsi variable 2019-01-29 14:07:55 -08:00
Alex Dadgar
97e3603043 Always populate task dir environment variables
Fixes an issue where if a task was restarted after restating the client,
the task dir environment variables would not be populated. This PR fixes
this for both upgrades from 0.8.X and for normal 0.9 restarts.
2019-01-29 13:17:10 -08:00
Alex Dadgar
22a13de279 Fix env templates having interpolated destinations
Fixes an issue where env templates that had interpolated destinations
would not work.

Fixes https://github.com/hashicorp/nomad/issues/5250
2019-01-28 10:28:53 -08:00
Alex Dadgar
d28bf3231e Fix double restart counting for templates
This PR fixes an issue where template restarts would count twice since
it was emitting a restarting event.
2019-01-25 15:38:13 -08:00
Nick Ethier
410bb28420 Merge pull request #5227 from hashicorp/b-client-highcpu-usage
Fix bug related to high cpu usage
2019-01-23 14:27:51 -05:00
Michael Schurter
87f8dff1ef Merge pull request #5196 from hashicorp/f-plugin-utils
Make plugins/shared external and make pluginutls/
2019-01-23 06:59:32 -08:00
Preetha
0723d2c306 Merge pull request #5225 from hashicorp/b-notaskevent-terminalallocs
Don't emit task events after alloc is in a terminal DesiredState
2019-01-23 08:54:10 -06:00
Michael Schurter
158c74887e goimports until make check is happy 2019-01-23 06:27:14 -08:00
Nick Ethier
f51b7f9942 tr: use context in as select statement 2019-01-22 20:11:39 -05:00
Michael Schurter
0d61ff0fb9 move pluginutils -> helper/pluginutils
I wanted a different color bikeshed, so I get to paint it
2019-01-22 15:50:08 -08:00
Alex Dadgar
72f8851254 Split hclspec 2019-01-22 15:43:34 -08:00
Alex Dadgar
8be7057177 move hclutils 2019-01-22 15:43:34 -08:00
Alex Dadgar
e46d67a889 Driver tests do not use hcl2/hcl, hclspec, or hclutils 2019-01-22 15:43:34 -08:00
Preetha Appan
26cb3cd12d Use DesiredState to determine whether to stop sending task events 2019-01-22 16:43:32 -06:00