nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-06 02:15:43 +03:00

Author	SHA1	Message	Date
Lang Martin	ee8496a88e	client structs: use nstructs rather than s for nomad/structs	2020-03-23 13:58:29 -04:00
Lang Martin	c37621cc98	client structs: move CSIVolumeAttachmentMode and CSIVolumeAccessMode	2020-03-23 13:58:29 -04:00
Danielle Lancashire	cd0c2a6df0	csi: Setup gRPC Clients with a logger	2020-03-23 13:58:29 -04:00
Danielle Lancashire	3f36dae246	csimanager: Fingerprint Node Service capabilities	2020-03-23 13:58:29 -04:00
Danielle Lancashire	406984ca8d	csimanager: Fingerprint controller capabilities	2020-03-23 13:58:29 -04:00
Danielle Lancashire	a7f7114590	client_csi: Validate Access/Attachment modes	2020-03-23 13:58:28 -04:00
Danielle Lancashire	19d06d5bb2	csi: ClientCSIControllerPublish* -> ClientCSIControllerAttach*	2020-03-23 13:58:28 -04:00
Danielle Lancashire	964ede4301	csi: Model Attachment and Access modes	2020-03-23 13:58:28 -04:00
Danielle Lancashire	778a32de4a	client: Setup CSI RPC Endpoint This commit introduces a new set of endpoints to a Nomad Client: ClientCSI. ClientCSI is responsible for mediating requests from a Nomad Server to a CSI Plugin running on a Nomad Client. It should only really be used to make controller RPCs.	2020-03-23 13:58:28 -04:00
Danielle Lancashire	d296efd2c6	CSI Plugin Registration (#6555) This changeset implements the initial registration and fingerprinting of CSI Plugins as part of #5378. At a high level, it introduces the following: * A `csi_plugin` stanza as part of a Nomad task configuration, to allow a task to expose that it is a plugin. * A new task runner hook: `csi_plugin_supervisor`. This hook does two things. When the `csi_plugin` stanza is detected, it will automatically configure the plugin task to receive bidirectional mounts to the CSI intermediary directory. At runtime, it will then perform an initial heartbeat of the plugin and handle submitting it to the new `dynamicplugins.Registry` for further use by the client, and then run a lightweight heartbeat loop that will emit task events when health changes. * The `dynamicplugins.Registry` for handling plugins that run as Nomad tasks, in contrast to the existing catalog that requires `go-plugin` type plugins and to know the plugin configuration in advance. * The `csimanager` which fingerprints CSI plugins, in a similar way to `drivermanager` and `devicemanager`. It currently only fingerprints the NodeID from the plugin, and assumes that all plugins are monolithic. Missing features * We do not use the live updates of the `dynamicplugin` registry in the `csimanager` yet. * We do not deregister the plugins from the client when they shut down yet; they just become indefinitely marked as unhealthy. This is deliberate until we figure out how we should manage deploying new versions of plugins/transitioning them.	2020-03-23 13:58:28 -04:00
Drew Bailey	ae5777c4ea	Audit config, seams for enterprise audit features allow oss to parse sink duration clean up audit sink parsing ent eventer config reload fix typo SetEnabled to eventer interface client acl test rm dead code fix failing test	2020-03-23 13:47:42 -04:00
Mahmood Ali	525623c53c	health tracker: account for group service checks	2020-03-22 12:38:37 -04:00
Mahmood Ali	1454af731d	health check account for task lifecycle In service jobs, non-sidecar lifecycle tasks tweak the health logic a bit: they may terminate successfully without impacting alloc health, but fail the alloc if they fail. Sidecars should be treated just like a normal task.	2020-03-22 12:37:40 -04:00
Mahmood Ali	3132176acd	health: fail health if any task is pending Fixes a bug where an allocation is considered healthy if some of the tasks are being restarted and, as such, their checks aren't tracked by the consul agent client. Here, we fix the immediate case by ensuring that an alloc is healthy only if tasks are running and the registered checks at the time are healthy. Previously, the health tracker tracked task "health" independently from checks, which led to problems when a task restarted. Consider the following series of events: 1. all tasks start running -> `tracker.tasksHealthy` is true 2. one task has unhealthy checks and gets restarted 3. remaining checks are healthy -> `tracker.checksHealthy` is true 4. propagate health status now that `tracker.tasksHealthy` and `tracker.checksHealthy` are both true. This change ensures that we accurately use the latest status of tasks and checks regardless of their status changes. It also ensures that we only consider check health after tasks are considered healthy; otherwise we risk trusting incomplete checks. This approach accommodates task dependencies well. Service jobs can have short-lived prestart tasks that will terminate before the main process runs. These dead tasks that complete successfully will not negate health status.	2020-03-22 11:13:41 -04:00
Mahmood Ali	3719ff3059	tests: add a check for failing service checks Add tests to check for failing or missing service checks in consul update.	2020-03-22 11:13:40 -04:00
Mahmood Ali	2ad338ef38	address review feedback	2020-03-21 17:52:58 -04:00
Mahmood Ali	83b08ab158	tr: proceed to mark other tasks as dead if alloc fails	2020-03-21 17:52:58 -04:00
Mahmood Ali	4558fa6aec	fix test	2020-03-21 17:52:57 -04:00
Jasmine Dahilig	6c1474398f	change jobspec lifecycle stanza to use sidecar attribute instead of block_until status	2020-03-21 17:52:57 -04:00
Jasmine Dahilig	dcd317745d	fix restart policy for system jobs with no lifecycle	2020-03-21 17:52:56 -04:00
Jasmine Dahilig	3688a2b7a3	refactor TaskHookCoordinator tests to use mock package and add failed init and sidecar test cases	2020-03-21 17:52:56 -04:00
Jasmine Dahilig	db7e8614f3	remove debugging test code from TestAllocRunner_TaskLeader_StopRestoredTG	2020-03-21 17:52:54 -04:00
Jasmine Dahilig	60671f880d	fix bug in lifecycle restore tests after refactor	2020-03-21 17:52:54 -04:00
Jasmine Dahilig	90fa242d83	fix failing ci test: TestTaskRunner_UnregisterConsul_Retries	2020-03-21 17:52:54 -04:00
Jasmine Dahilig	da3eb69a2f	fix linting errors	2020-03-21 17:52:53 -04:00
Jasmine Dahilig	ee92c98d4e	add task hook coordinator many init tasks test case	2020-03-21 17:52:53 -04:00
Jasmine Dahilig	88d3e232a2	refactor task hook coordinator helper method and tests	2020-03-21 17:52:53 -04:00
Jasmine Dahilig	0031b6777f	clean up restore test	2020-03-21 17:52:52 -04:00
Jasmine Dahilig	aced15ea27	partial test for restore functionality	2020-03-21 17:52:52 -04:00
Jasmine Dahilig	48ce093dd5	account for client restarts in task lifecycle hooks	2020-03-21 17:52:51 -04:00
Jasmine Dahilig	c6cd7b523b	clean up restart conditions and restart tests for task lifecycle	2020-03-21 17:52:50 -04:00
Jasmine Dahilig	4be7d056ac	put lifecycle nil and empty checks in api Canonicalize	2020-03-21 17:52:50 -04:00
Jasmine Dahilig	fa19007dfb	update task hook coordinator tests	2020-03-21 17:52:46 -04:00
Jasmine Dahilig	c2ab4c9c90	add test for lifecycle coordinator	2020-03-21 17:52:42 -04:00
Jasmine Dahilig	262d204096	incorporate lifecycle into restart tracker	2020-03-21 17:52:40 -04:00
Mahmood Ali	5377b4cb58	Add a coordinator for alloc runners	2020-03-21 17:52:38 -04:00
Yoan Blanc	77cf2f0573	vendor: vault api and sdk Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-03-21 17:57:48 +01:00
Mahmood Ali	ac53c110a6	Merge pull request #7236 from hashicorp/b-remove-rkt Remove rkt as a built-in driver	2020-03-17 09:07:35 -04:00
Mahmood Ali	4e4c3873cb	Update gopsutil code Latest gopsutil includes two backward incompatible changes: First, it removed the unused Stolen field in `cae8efcffa (diff-d9747e2da342bdb995f6389533ad1a3d)`. Second, it updated the Windows cpu stats calculation to be in line with other platforms, where it returns absolute stats rather than percentages. See https://github.com/shirou/gopsutil/pull/611.	2020-03-15 09:37:05 +01:00
Yoan Blanc	f80cbe86a1	gopsutils: v2.20.2 Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-03-15 09:36:59 +01:00
Michael Schurter	64c40af018	Merge pull request #7170 from fredrikhgrelland/consul_template_upgrade Update consul-template to v0.24.1 and remove deprecated vault grace	2020-03-10 14:15:47 -07:00
Mahmood Ali	3284a34b42	Merge pull request #7255 from hashicorp/vendor-update-grpc-20200302 update grpc	2020-03-04 09:32:16 -05:00
Mahmood Ali	d2ddef5ba3	update grpc Upgrade grpc to v1.27.1 and protobuf plugins to v1.3.4.	2020-03-03 08:39:54 -05:00
Mahmood Ali	e812954bd9	Simplify Bootstrap logic in tests This change updates tests to honor `BootstrapExpect` exclusively when forming test clusters and removes test-only knobs, e.g. `config.DevDisableBootstrap`. Background: Test cluster creation is fragile. Test servers don't follow the BootstrapExpected route like production clusters. Instead they start as single-node clusters and then get rejoined, which risks causing split brain or other test flakiness. The test framework exposes a few knobs (e.g. `config.DevDisableBootstrap` and `config.Bootstrap`) that control whether a server should bootstrap the cluster. These flags are confusing and it's unclear when to use them: their usage in multi-node clusters isn't properly documented. Furthermore, they have some bad side effects as they don't control the Raft library: If `config.DevDisableBootstrap` is true, the test server may not immediately attempt to bootstrap a cluster, but after an election timeout (~50ms), Raft may force a leadership election and win it (with only one vote) and cause a split brain. The knobs are also confusing as Bootstrap is an overloaded term. In BootstrapExpect, we refer to bootstrapping the cluster only after N servers are connected. But in tests and the knobs above, it refers to whether the server is a single-node cluster and shouldn't wait for any other server. Changes: This commit makes two changes: First, it relies on `BootstrapExpected` instead of `Bootstrap` and/or `DevMode` flags. This change is relatively trivial. Second, it introduces a `Bootstrapped` flag to track if the cluster is bootstrapped. This allows us to keep `BootstrapExpected` immutable. Previously, the flag was a config value but it got set to 0 after cluster bootstrap completed.	2020-03-02 13:47:43 -05:00
Mahmood Ali	e265e4c7b2	Remove rkt as a built-in driver Rkt has been archived and is no longer an active project: * https://github.com/rkt/rkt * https://github.com/rkt/rkt/issues/4024 The rkt driver will continue to live as an external plugin.	2020-02-26 22:16:41 -05:00
Fredrik Hoem Grelland	26cca14f27	Update consul-template to v0.24.1 and remove deprecated vault_grace (#7170)	2020-02-23 16:24:53 +01:00
Nick Ethier	3cd4d11efa	Merge pull request #7163 from hashicorp/b-driver-plugin-recovery drivermanager: attempt dispense on reattachment failure	2020-02-21 10:33:20 -05:00
Mahmood Ali	a3b0b25acb	update rest of consul packages	2020-02-16 16:25:04 -06:00
Nick Ethier	64b7c91538	drivermanager: attempt dispense on reattachment failure	2020-02-15 00:50:06 -05:00
Seth Hoenig	1ced8ba47d	Merge pull request #7106 from hashicorp/f-ctag-override client: enable configuring enable_tag_override for services	2020-02-13 12:34:48 -06:00
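
The health rule described in commits 3132176acd and 1454af731d above can be illustrated with a minimal Go sketch. This is not the code from those commits: `TaskState`, `tasksHealthy`, and `allocHealthy` are hypothetical, simplified names chosen for illustration, and the real tracker works on Nomad's own structs and Consul check results. The sketch only shows the ordering the messages describe: any pending or restarting task fails health, a successfully completed non-sidecar lifecycle task does not, and checks are consulted only once the tasks are considered healthy.

```go
package main

import "fmt"

// TaskState is a hypothetical, simplified stand-in for a task's state.
type TaskState struct {
	State     string // "pending", "running", or "dead"
	Failed    bool   // true if the task exited unsuccessfully
	Lifecycle bool   // true for a non-sidecar lifecycle task (e.g. prestart)
}

// tasksHealthy reports whether every task is either running or is a
// non-sidecar lifecycle task that terminated successfully.
func tasksHealthy(tasks map[string]TaskState) bool {
	for _, t := range tasks {
		switch t.State {
		case "running":
			continue
		case "dead":
			if t.Lifecycle && !t.Failed {
				continue // completed prestart work does not negate health
			}
			return false
		default: // "pending" covers tasks that are starting or restarting
			return false
		}
	}
	return true
}

// allocHealthy evaluates checks only after the tasks are healthy, so a
// restarting task whose checks are not registered cannot be overlooked.
func allocHealthy(tasks map[string]TaskState, checks map[string]bool) bool {
	if !tasksHealthy(tasks) {
		return false
	}
	for _, passing := range checks {
		if !passing {
			return false
		}
	}
	return true
}

func main() {
	tasks := map[string]TaskState{
		"init": {State: "dead", Lifecycle: true},
		"web":  {State: "running"},
		"api":  {State: "pending"}, // being restarted
	}
	// Only the checks of the running task are registered, and they pass.
	checks := map[string]bool{"web-http": true}
	fmt.Println(allocHealthy(tasks, checks)) // false: "api" is pending
}
```

Evaluating both conditions from the latest task and check state on every call, rather than caching two independent booleans, is what avoids the stale-flag sequence listed in the 3132176acd message.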

4254 Commits