nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-07 19:05:42 +03:00

Author	SHA1	Message	Date
Alex Dadgar	c7fc39d38d	Merge pull request #5168 from hashicorp/b-kill-race Improve Kill handling on task runner	2019-01-09 12:05:10 -08:00
Alex Dadgar	d916c0dd12	add more comments	2019-01-09 12:04:22 -08:00
Michael Schurter	e44d51f4d0	Spelling fix Co-Authored-By: dadgar <alex@hashicorp.com>	2019-01-09 11:42:40 -08:00
Mahmood Ali	d1fbd735f3	Merge pull request #5157 from hashicorp/r-drivers-no-cstructs drivers: avoid referencing client/structs package	2019-01-09 13:06:46 -05:00
Alex Dadgar	a5ba15591a	Improve Kill handling on task runner This PR improves how killing a task is handled. Before the kill function directly orchestrated the killing and was only valid while the task was running. The new behavior is to mark the desired state and wait for the task runner to converge to that state.	2019-01-08 16:42:26 -08:00
Michael Schurter	1ae8261139	client: emit Killing/Killed task events We were just emitting Killed/Terminated events before. In v0.8 we emitted Killing/Killed, but lacked Terminated when explicitly stopping a task. This change makes it so Terminated is always included, whether explicitly stopping a task or it exiting on its own. New output: 2019-01-04T14:58:51-08:00 Killed Task successfully killed 2019-01-04T14:58:51-08:00 Terminated Exit Code: 130, Signal: 2 2019-01-04T14:58:51-08:00 Killing Sent interrupt 2019-01-04T14:58:51-08:00 Leader Task Dead Leader Task in Group dead 2019-01-04T14:58:49-08:00 Started Task started by client 2019-01-04T14:58:49-08:00 Task Setup Building Task Directory 2019-01-04T14:58:49-08:00 Received Task received by client Old (v0.8.6) output: 2019-01-04T22:14:54Z Killed Task successfully killed 2019-01-04T22:14:54Z Killing Sent interrupt. Waiting 5s before force killing 2019-01-04T22:14:54Z Leader Task Dead Leader Task in Group dead 2019-01-04T22:14:53Z Started Task started by client 2019-01-04T22:14:53Z Task Setup Building Task Directory 2019-01-04T22:14:53Z Received Task received by client	2019-01-08 07:20:54 -08:00
Mahmood Ali	c0162fab35	move cstructs.DeviceNetwork to drivers pkg	2019-01-08 09:11:47 -05:00
Mahmood Ali	694e3010c2	use drivers.FSIsolation	2019-01-08 09:11:47 -05:00
Mahmood Ali	607e7f2dde	remove always false parameter Simplify allocDir.Build() function to avoid depending on client/structs, and remove a parameter that's always set to `false`. The motivation here is to avoid a dependency cycle between drivers/cstructs and alloc_dir.	2019-01-08 09:11:47 -05:00
Alex Dadgar	6bb99c93d0	Review comments	2019-01-07 14:50:28 -08:00
Alex Dadgar	5424d3b540	vet	2019-01-07 14:49:41 -08:00
Alex Dadgar	19e67a0916	Test recovery	2019-01-07 14:49:41 -08:00
Alex Dadgar	144866a87b	Mock driver has recovery, stats	2019-01-07 14:49:40 -08:00
Alex Dadgar	b300306c4a	comments	2019-01-07 14:49:40 -08:00
Alex Dadgar	3257eb6d86	Fix hooks	2019-01-07 14:49:40 -08:00
Alex Dadgar	437f03d877	recover	2019-01-07 14:49:40 -08:00
Mahmood Ali	17b7490891	taskrunner: emit TaskReceived event Preserve pre-0.9 behavior, where the task runner emits a `Received: Task received by client` event on task runner creation.	2019-01-04 14:32:29 -05:00
Danielle Tomlinson	54dde24bcb	taskrunner: Persist environment from hooks https://github.com/hashicorp/nomad/pull/5032 introduced a regression where the origHookState was used in place of the response from the hook.	2019-01-03 13:13:57 +01:00
Alex Dadgar	99df4c98c7	Store device envs separately and pass to drivers	2018-12-19 14:23:09 -08:00
Michael Schurter	784706a1e5	client/state: support upgrading from 0.8->0.9 Also persist and load DeploymentStatus to avoid rechecking health after client restarts.	2018-12-19 10:39:27 -08:00
Michael Schurter	3a41fd7b31	tr: fix HookState Copy() and Equal() methods They did not take into account the Env field.	2018-12-19 09:58:06 -08:00
Nick Ethier	6951ca487d	drivermanager: use allocID and task name to route task events	2018-12-18 23:01:51 -05:00
Nick Ethier	331793e283	client: batch initial fingerprinting in plugin managers drivermanager: fix pr comments/feedback	2018-12-18 22:56:19 -05:00
Nick Ethier	2f010a2f25	client/drivermanager: fixup issues from rebase and address PR comments	2018-12-18 22:55:38 -05:00
Nick Ethier	32aaedd6b7	tr: deregister task handler on cleanup	2018-12-18 22:55:38 -05:00
Nick Ethier	39ca1b00dd	client/drivermanager: add driver manager The driver manager is modeled after the device manager and is started by the client. It's responsible for handling driver lifecycle and reattachment state, as well as processing the incoming fingerprint and task events from each driver. The manager exposes a method for registering event handlers for task events that is used by the task runner to update the server when a task has been updated with an event. Since driver fingerprinting has been implemented by the driver manager, it is no longer needed in the fingerprint manager and has been removed.	2018-12-18 22:55:18 -05:00
Alex Dadgar	517bf1c35f	Fix unit tests + upgrade pathing resources	2018-12-18 15:50:44 -08:00
Alex Dadgar	d5512c39f0	Lint	2018-12-18 15:50:44 -08:00
Alex Dadgar	7a0b73341a	LinuxResources doesn't use task.Resources	2018-12-18 15:50:44 -08:00
Alex Dadgar	cd6879409c	Drivers	2018-12-18 15:50:11 -08:00
Alex Dadgar	da6925bfc1	utilities	2018-12-18 15:48:52 -08:00
Danielle Tomlinson	b92bc1178d	taskrunner: Use a random suffix for Task Config The RestartCount is not really suitable for use as a source of uniqueness within task invocations as it is not monotonic, and interacts with the restart stanza in a user's config, so it conflates restarts due to task failures with restarts due to environmental changes, such as consul template or vault secrets changing. Here we instead use a substring from a uuid, which is more random than we strictly need, but is nicer than rolling our own random string generator here.	2018-12-19 00:38:54 +01:00
Danielle Tomlinson	61a17621e3	taskrunner: Use hook errors for artifacts	2018-12-17 10:39:38 +01:00
Danielle Tomlinson	4d4201331c	taskrunner: Emit task events when a hook fails	2018-12-13 18:20:18 +01:00
Alex Dadgar	8b624340ad	Fix various bugs with task events Fixes the following: * Emitting events when the task fails to start * Don't double emit events on task shutdown (nomad stop) * Don't emit an OOM kill metric unless actually OOM'd	2018-12-05 14:27:07 -08:00
Danielle Tomlinson	03db4cf82d	client: Rename drivers/shared/env => client/taskenv	2018-11-30 12:18:39 +01:00
Danielle Tomlinson	756325bcbd	client: Merge driver/shared/structs and client/structs	2018-11-30 10:56:45 +01:00
Danielle Tomlinson	d4ef3b68d9	driver: Flatten SetEnvvars into taskdirhook	2018-11-30 10:46:13 +01:00
Danielle Tomlinson	bacd6175f5	client: Migrate DriverStats optout to drivers/shared/structs	2018-11-30 10:46:13 +01:00
Danielle Tomlinson	6756ffd052	drivers: Move client/drivers/env to drivers/shared/env As part of deprecating legacy drivers, we're moving the env package to a new drivers/shared tree, as it is used by the modern docker and rkt driver packages, and is useful for 3rd party plugins.	2018-11-30 10:46:13 +01:00
Michael Schurter	b959d9831c	add nil check around task resources in device hook Looking at NewTaskRunner I'm unsure whether TaskRunner.TaskResources (from which req.TaskResources is set) is intended to be nil at times or if the TODO in NewTaskRunner is intended to ensure it is always non-nil.	2018-11-27 17:25:33 -08:00
Michael Schurter	c7b4ee1546	assume that slices contain only non-nil items	2018-11-27 17:25:33 -08:00
Michael Schurter	a13607f2d9	client: properly support hook env vars The old approach was incomplete. Hook env vars are now: * persisted and restored between agent restarts * deterministic (LWW if 2 hooks set the same key)	2018-11-27 17:25:33 -08:00
Alex Dadgar	429c5bb885	Device hook and devices affect computed node class This PR introduces a device hook that retrieves the device mount information for an allocation. It also updates the computed node class computation to take into account devices. TODO Fix the task runner unit test. The environment variable is being lost even though it is being properly set in the prestart hook.	2018-11-27 17:25:33 -08:00
Chris Baker	e3108507c5	Merge pull request #4891 from hashicorp/b-1150-rkt-volume-names drivers/rkt: fix invalid volumes	2018-11-27 18:55:00 -05:00
Danielle Tomlinson	d361015562	Merge pull request #4909 from hashicorp/b-restart-delay taskrunner: Return the restart delay correctly	2018-11-27 23:55:54 +01:00
Michael Schurter	5d6d4bf290	Merge pull request #4883 from hashicorp/f-graceful-shutdown Support graceful shutdowns in agent	2018-11-27 15:55:15 -06:00
Michael Schurter	021c0cc4bf	client: document how AR/TR Run methods behave	2018-11-26 12:50:35 -08:00
Chris Baker	790fe0b1db	modified TaskConfig to include AllocID use this for volume names in drivers/rkt to address #1150	2018-11-26 18:54:26 +00:00
Danielle Tomlinson	d73c2f3c8d	taskrunner: Return the restart delay correctly We were incorrectly returning a 0 duration to the taskrunner when determining when a task should restart. This would cause tasks to be restarted immediately, ignoring the restart {} stanza in a user's configuration. This commit causes us to return the restart duration to the task runner so it may correctly delay further execution.	2018-11-20 21:52:23 +01:00

103 Commits