Commit Graph

140 Commits

Author SHA1 Message Date
Alex Dadgar
97e3603043 Always populate task dir environment variables
Fixes an issue where if a task was restarted after restating the client,
the task dir environment variables would not be populated. This PR fixes
this for both upgrades from 0.8.X and for normal 0.9 restarts.
2019-01-29 13:17:10 -08:00
Alex Dadgar
22a13de279 Fix env templates having interpolated destinations
Fixes an issue where env templates that had interpolated destinations
would not work.

Fixes https://github.com/hashicorp/nomad/issues/5250
2019-01-28 10:28:53 -08:00
Alex Dadgar
d28bf3231e Fix double restart counting for templates
This PR fixes an issue where template restarts would count twice since
it was emitting a restarting event.
2019-01-25 15:38:13 -08:00
Nick Ethier
410bb28420 Merge pull request #5227 from hashicorp/b-client-highcpu-usage
Fix bug related to high cpu usage
2019-01-23 14:27:51 -05:00
Michael Schurter
158c74887e goimports until make check is happy 2019-01-23 06:27:14 -08:00
Nick Ethier
f51b7f9942 tr: use context in as select statement 2019-01-22 20:11:39 -05:00
Michael Schurter
0d61ff0fb9 move pluginutils -> helper/pluginutils
I wanted a different color bikeshed, so I get to paint it
2019-01-22 15:50:08 -08:00
Alex Dadgar
72f8851254 Split hclspec 2019-01-22 15:43:34 -08:00
Alex Dadgar
8be7057177 move hclutils 2019-01-22 15:43:34 -08:00
Alex Dadgar
e46d67a889 Driver tests do not use hcl2/hcl, hclspec, or hclutils 2019-01-22 15:43:34 -08:00
Michael Schurter
7c45733ba4 Merge pull request #5211 from hashicorp/test-porting-08
Port some 0.8 TaskRunner tests
2019-01-22 14:05:53 -08:00
Michael Schurter
06119e2505 test: port TestTaskRunner_CheckWatcher_Restart
Added ability to adjust the number of events the TaskRunner keeps as
there's no way to observe all events otherwise.

Task events differ slightly from 0.8 because 0.9 emits Terminated every
time a task exits instead of only when it exits on its own (not due to
restart or kill).

0.9 does not emit Killing/Killed for restarts like 0.8 which seems fine
as `Restart Signaled/Terminated/Restarting` is more descriptive.

Original v0.8 events emitted:
```
	expected := []string{
		"Received",
		"Task Setup",
		"Started",
		"Restart Signaled",
		"Killing",
		"Killed",
		"Restarting",
		"Started",
		"Restart Signaled",
		"Killing",
		"Killed",
		"Restarting",
		"Started",
		"Restart Signaled",
		"Killing",
		"Killed",
		"Not Restarting",
	}
```
2019-01-22 09:46:46 -08:00
Michael Schurter
81334cd3a1 test: port RestartTask from 0.8 2019-01-22 08:08:08 -08:00
Michael Schurter
418d360d19 test: port SignalFailure test from 0.8
Also fix signal error handling in mock_driver.
2019-01-22 08:08:08 -08:00
Preetha Appan
a1cc48f78d Rename TaskKillRequest/Response to TaskPreKillRequest/Response 2019-01-22 09:54:02 -06:00
Preetha Appan
6f3e4e72b9 Fix log comments 2019-01-22 09:45:58 -06:00
Preetha Appan
c3f044291d Rename TaskKillHook to TaskPreKillHook to more closely match usage
Also added/fixed comments
2019-01-22 09:41:56 -06:00
Michael Schurter
77c01c6283 Fix comment
Co-Authored-By: preetapan <preetha@hashicorp.com>
2019-01-22 09:41:21 -06:00
Preetha Appan
bf9a2168e7 Rename TaskKillHook to TaskPreKillHook to more closely match usage
Also added/fixed comments
2019-01-22 09:41:21 -06:00
Mahmood Ali
9f7619344e Merge pull request #5190 from hashicorp/f-memory-usage
Track Basic Memory Usage as reported by cgroups
2019-01-18 16:46:02 -05:00
Chris Baker
b43f803d36 set TaskGroupName in task_runner 2019-01-18 20:25:11 +00:00
Chris Baker
d2f3dd7530 documenting test for task runner failure to set TaskGroupName 2019-01-18 20:00:49 +00:00
Michael Schurter
4e4ecc949f Merge pull request #5203 from hashicorp/b-terminated
client: restore Terminated event on every exit
2019-01-18 08:54:15 -08:00
Preetha Appan
eb7663697b Fix one more place that should be using taskResources
taskResources handles new resource fields in a backwards compatible way
2019-01-17 15:52:51 -06:00
Michael Schurter
64e531e7bb client: restore Terminated event on every exit
v0.9.0-dev started emitting a Terminated event every time a task process
exited. While this wasn't true in previous versions, it's a useful task
event because it's the only place for job operators to view the task's
exit code.

This behavior is asserted in the e2e/taskevents tests.
2019-01-17 10:02:25 -08:00
Danielle Tomlinson
435a8bae96 Merge pull request #5193 from hashicorp/dani/logmon-reattach
logmon: Reattach to existing loggers
2019-01-16 17:34:13 +01:00
Danielle Tomlinson
828d5f5a53 logmon: Reattach to existing loggers
This commit prevents us from creating duplicate logmon hooks when
restoring allocations by persisting the logmon reattach config using
HookData.
2019-01-16 14:56:10 +01:00
Michael Schurter
152ba08207 test: porting TestTaskRunner_SimpleRun_Dispatch
Porting test from 0.8 to 0.9.
2019-01-15 15:22:13 -08:00
Michael Schurter
3a491ab2b4 Merge pull request #5187 from hashicorp/test-consul
Port a bunch of pre-0.9 Consul tests to 0.9
2019-01-15 07:41:50 -08:00
Mahmood Ali
b5c20aa50b Track Basic Memory Usage as reported by cgroups
Track current memory usage, `memory.usage_in_bytes`, in addition to
`memory.max_memory_usage_in_bytes` and friends.  This number is closer
what Docker reports.

Related to https://github.com/hashicorp/nomad/issues/5165 .
2019-01-14 18:47:52 -05:00
Michael Schurter
b58a77ecc0 test: assert service interpolation behavior
Ported from pre-0.9 tests.
2019-01-14 09:56:53 -08:00
Michael Schurter
99a5586aed test: assert shutdown delay deregs first
Restore a pre-0.9 test that asserts Consul services are deregistered
before a task's shutdown delay.
2019-01-14 09:56:53 -08:00
Michael Schurter
903779769d Update client/allocrunner/taskrunner/stats_hook.go
Co-Authored-By: nickethier <ncethier@gmail.com>
2019-01-14 12:31:27 -05:00
Nick Ethier
091cdbcb12 tr: stop stats collection on Exited hook 2019-01-14 12:30:14 -05:00
Nick Ethier
fc84e8f2bc tr: add retry /w backoff to stats_hook failure 2019-01-12 12:18:24 -05:00
Nick Ethier
9904463da2 executor: fix failing stats related test 2019-01-12 12:18:23 -05:00
Nick Ethier
fbf9a4c772 executor: implement streaming stats API
plugins/driver: update driver interface to support streaming stats

client/tr: use streaming stats api

TODO:
 * how to handle errors and closed channel during stats streaming
 * prevent tight loop if Stats(ctx) returns an error

drivers: update drivers TaskStats RPC to handle streaming results

executor: better error handling in stats rpc

docker: better control and error handling of stats rpc

driver: allow stats to return a recoverable error
2019-01-12 12:18:22 -05:00
Alex Dadgar
c7fc39d38d Merge pull request #5168 from hashicorp/b-kill-race
Improve Kill handling on task runner
2019-01-09 12:05:10 -08:00
Alex Dadgar
d916c0dd12 add more comments 2019-01-09 12:04:22 -08:00
Michael Schurter
e44d51f4d0 Spelling fix
Co-Authored-By: dadgar <alex@hashicorp.com>
2019-01-09 11:42:40 -08:00
Mahmood Ali
d1fbd735f3 Merge pull request #5157 from hashicorp/r-drivers-no-cstructs
drivers: avoid referencing client/structs package
2019-01-09 13:06:46 -05:00
Alex Dadgar
a5ba15591a Improve Kill handling on task runner
This PR improves how killing a task is handled. Before the kill function
directly orchestrated the killing and was only valid while the task was
running. The new behavior is to mark the desired state and wait for the
task runner to converge to that state.
2019-01-08 16:42:26 -08:00
Michael Schurter
1ae8261139 client: emit Killing/Killed task events
We were just emitting Killed/Terminated events before. In v0.8 we
emitted Killing/Killed, but lacked Terminated when explicitly stopping
a task. This change makes it so Terminated is always included, whether
explicitly stopping a task or it exiting on its own.

New output:

2019-01-04T14:58:51-08:00  Killed            Task successfully killed
2019-01-04T14:58:51-08:00  Terminated        Exit Code: 130, Signal: 2
2019-01-04T14:58:51-08:00  Killing           Sent interrupt
2019-01-04T14:58:51-08:00  Leader Task Dead  Leader Task in Group dead
2019-01-04T14:58:49-08:00  Started           Task started by client
2019-01-04T14:58:49-08:00  Task Setup        Building Task Directory
2019-01-04T14:58:49-08:00  Received          Task received by client

Old (v0.8.6) output:

2019-01-04T22:14:54Z  Killed            Task successfully killed
2019-01-04T22:14:54Z  Killing           Sent interrupt. Waiting 5s before force killing
2019-01-04T22:14:54Z  Leader Task Dead  Leader Task in Group dead
2019-01-04T22:14:53Z  Started           Task started by client
2019-01-04T22:14:53Z  Task Setup        Building Task Directory
2019-01-04T22:14:53Z  Received          Task received by client
2019-01-08 07:20:54 -08:00
Mahmood Ali
c0162fab35 move cstructs.DeviceNetwork to drivers pkg 2019-01-08 09:11:47 -05:00
Mahmood Ali
694e3010c2 use drivers.FSIsolation 2019-01-08 09:11:47 -05:00
Mahmood Ali
607e7f2dde remove always false parameter
Simplify allocDir.Build() function to avoid depending on client/structs,
and remove a parameter that's always set to `false`.

The motivation here is to avoid a dependency cycle between
drivers/cstructs and alloc_dir.
2019-01-08 09:11:47 -05:00
Alex Dadgar
6bb99c93d0 Review comments 2019-01-07 14:50:28 -08:00
Alex Dadgar
5424d3b540 vet 2019-01-07 14:49:41 -08:00
Alex Dadgar
19e67a0916 Test recovery 2019-01-07 14:49:41 -08:00
Alex Dadgar
144866a87b Mock driver has recovery, stats 2019-01-07 14:49:40 -08:00