nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-06 02:15:43 +03:00

Author	SHA1	Message	Date
Seth Hoenig	a97254fa20	consul: plubming for specifying consul namespace in job/group This PR adds the common OSS changes for adding support for Consul Namespaces, which is going to be a Nomad Enterprise feature. There is no new functionality provided by this changeset and hopefully no new bugs.	2021-04-05 10:03:19 -06:00
Chris Baker	80066ac798	Merge branch 'main' into f-node-drain-api	2021-04-01 15:22:57 -05:00
Mahmood Ali	80faab0f79	oversubscription: Add MemoryMaxMB to internal structs Start tracking a new MemoryMaxMB field that represents the maximum memory a task may use in the client. This allows tasks to specify a memory reservation (to be used by scheduler when placing the task) but use excess memory used on the client if the client has any. This commit adds the server tracking for the value, and ensures that allocations AllocatedResource fields include the value.	2021-03-30 16:55:58 -04:00
Nick Ethier	0a25e153c7	Merge pull request #10203 from hashicorp/f-cpu-cores Reserved Cores [1/4]: Structs and scheduler implementation	2021-03-29 14:05:54 -04:00
Chris Baker	a52f32dedc	restored Node.Sanitize() for RPC endpoints multiple other updates from code review	2021-03-26 17:03:15 +00:00
Chris Baker	93d5187e7b	removed deprecated fields from Drain structs and API node drain: use msgtype on txn so that events are emitted wip: encoding extension to add Node.Drain field back to API responses new approach for hiding Node.SecretID in the API, using `json` tag documented this approach in the contributing guide refactored the JSON handlers with extensions modified event stream encoding to use the go-msgpack encoders with the extensions	2021-03-21 15:30:11 +00:00
Nick Ethier	39a8a172eb	scheduler: detect job change in cores resource	2021-03-19 22:25:50 -04:00
Tim Gross	7c7569674c	CSI: unique volume per allocation Add a `PerAlloc` field to volume requests that directs the scheduler to test feasibility for volumes with a source ID that includes the allocation index suffix (ex. `[0]`), rather than the exact source ID. Read the `PerAlloc` field when making the volume claim at the client to determine if the allocation index suffix (ex. `[0]`) should be added to the volume source ID.	2021-03-18 15:35:11 -04:00
Seth Hoenig	05c3ffde9d	consul/connect: correctly detect when connect tasks not updated This PR fixes a bug where tasks with Connect services could be triggered to destructively update (i.e. placed in a new alloc) when no update should be necessary. Fixes #10077	2021-02-23 15:12:49 -06:00
Nick Ethier	a957a5cd20	Merge pull request #9937 from hashicorp/b-9728 scheduler: add tests and fix for detected host_network and to port field changes	2021-02-02 13:54:41 -05:00
Nick Ethier	ee7a5369b1	scheduler: add tests and fix for detected host_network and to port field changes	2021-02-01 15:56:43 -05:00
Drew Bailey	182aa4a459	Persist shared allocated ports for inplace update (#9830 ) * Persist shared allocated ports for inplace update Ports were not copied over when performing inplace updates in the generic scheduler * changelog * drop spew	2021-01-15 12:45:12 -05:00
Drew Bailey	9c3ce6b6dc	persist shared ports during inplace updates (#9736 ) AllocatedSharedResources were not being copied over to the new allocation struct the scheduler makes during inplace updates. This caused downstream issues after the plan was applied, namely the shared ports were dropped causing issues with service registration/deregistration. test that shared ports are preserved change log, also carry over shared network copy networks	2021-01-08 09:00:41 -05:00
Seth Hoenig	37cd0eced3	consul/connect: trigger update as necessary on connect changes This PR fixes a long standing bug where submitting jobs with changes to connect services would not trigger updates as expected. Previously, service blocks were not considered as sources of destructive updates since they could be synced with consul non-destructively. With Connect, task group services that have changes to their connect block or to the service port should be destructive, since the network plumbing of the alloc is going to need updating. Fixes #8596 #7991 Non-destructive half in #7192	2020-10-05 14:53:00 -05:00
Mahmood Ali	cb038b1a8c	Have Plan.AppendAlloc accept the job	2020-08-25 17:22:09 -04:00
Mahmood Ali	5720266c91	Respect alloc job version for lost/failed allocs This change fixes a bug where lost/failed allocations are replaced by allocations with the latest versions, even if the version hasn't been promoted yet. Now, when generating a plan for lost/failed allocations, the scheduler first checks if the current deployment is in Canary stage, and if so, it ensures that any lost/failed allocations is replaced one with the latest promoted version instead.	2020-08-19 09:52:48 -04:00
Nick Ethier	e9ff8a8daa	Task DNS Options (#7661 ) Co-Authored-By: Tim Gross <tgross@hashicorp.com> Co-Authored-By: Seth Hoenig <shoenig@hashicorp.com>	2020-06-18 11:01:31 -07:00
Lang Martin	9ccec0afbb	scheduler/reconcile: set FollowupEvalID on lost stop_after_client_disconnect (#8105 ) (#8138 ) * scheduler/reconcile: set FollowupEvalID on lost stop_after_client_disconnect * scheduler/reconcile: thread follupEvalIDs through to results.stop * scheduler/reconcile: comment typo * nomad/_test: correct arguments for plan.AppendStoppedAlloc * scheduler/reconcile: avoid nil, cleanup handleDelayed(Lost\|Reschedules)	2020-06-09 17:13:53 -04:00
Lang Martin	cd6d34425f	server: stop after client disconnect (#7939 ) * jobspec, api: add stop_after_client_disconnect * nomad/state/state_store: error message typo * structs: alloc methods to support stop_after_client_disconnect 1. a global AllocStates to track status changes with timestamps. We need this to track the time at which the alloc became lost originally. 2. ShouldClientStop() and WaitClientStop() to actually do the math * scheduler/reconcile_util: delayByStopAfterClientDisconnect * scheduler/reconcile: use delayByStopAfterClientDisconnect * scheduler/util: updateNonTerminalAllocsToLost comments This was setup to only update allocs to lost if the DesiredStatus had already been set by the scheduler. It seems like the intention was to update the status from any non-terminal state, and not all lost allocs have been marked stop or evict by now * scheduler/testing: AssertEvalStatus just use require * scheduler/generic_sched: don't create a blocked eval if delayed * scheduler/generic_sched_test: several scheduling cases	2020-05-13 16:39:04 -04:00
Mahmood Ali	04000c9ba9	Ensure that alloc updates preserve device offers When an alloc is updated in-place, ensure that the allocated device are preserved and carried over to new alloc.	2020-04-21 08:57:15 -04:00
Mahmood Ali	08e5a9087f	Merge pull request #7414 from hashicorp/b-network-mode-change Detect network mode change	2020-03-24 09:46:40 -04:00
Mahmood Ali	7897104b72	update scheduler to account for hooks	2020-03-21 17:52:45 -04:00
Mahmood Ali	82cb13aac9	Detect network mode change Mark job as updated if network mode changed.	2020-03-21 16:51:10 -04:00
Drew Bailey	a880d75b16	comment for filtering reason	2020-02-03 09:02:09 -05:00
Drew Bailey	cd00d6ded5	make diffSystemAllocsForNode aware of eligibility diffSystemAllocs -> diffSystemAllocsForNode, this function is only used for diffing system allocations, but lacked awareness of eligible nodes and the node ID that the allocation was going to be placed. This change now ignores a change if its existing allocation is on an ineligible node. For a new allocation, it also checks tainted and ineligible nodes in the same function instead of nil-ing out the diff after computation in diffSystemAllocs	2020-02-03 09:02:08 -05:00
Drew Bailey	580baea231	ignore computed diffs if node is ineligible test flakey, add temp sleeps for debugging fix computed class	2020-02-03 09:02:08 -05:00
Drew Bailey	81a24098f0	Update Evicted allocations to lost when lost If an alloc is being preempted and marked as evict, but the underlying node is lost before the migration takes place, the allocation currently stays as desired evict, status running forever, or until the node comes back online. This commit updates updateNonTerminalAllocsToLost to check for a destired status of Evict as well as Stop when updating allocations on tainted nodes. switch to table test for lost node cases	2020-01-07 13:34:18 -05:00
Drew Bailey	89964c989a	Removes checking constraints for inplace update	2019-11-19 13:34:41 -05:00
Drew Bailey	c87c6415eb	DOCS: Spread stanza does not exist on task Fixes documentation inaccuracy for spread stanza placement. Spreads can only exist on the top level job struct or within a group. comment about nil assumption	2019-11-19 08:26:36 -05:00
Drew Bailey	1607a203a8	Check for changes to affinity and constraints Adds checks for affinity and constraint changes when determining if we should update inplace. refactor to check all levels at once check for spread changes when checking inplace update	2019-11-19 08:26:34 -05:00
Chris Baker	bcd1243471	the scheduler checks whether task changes require a restart, this needed to be updated to consider devices	2019-11-07 17:51:15 +00:00
Michael Schurter	08a17854ce	core: fix panic when AllocatedResources is nil Fix for #6540	2019-10-28 14:38:21 -07:00
Preetha Appan	654c72a7b4	update comment	2019-09-05 18:43:30 -05:00
Preetha Appan	87e998d043	Fix inplace updates bug with group level networks During inplace updates, we should be using network information from the previous allocation being updated.	2019-09-05 18:37:24 -05:00
Nick Ethier	4cb99a1112	scheduler: fix disk constraints	2019-07-31 01:04:08 -04:00
Nick Ethier	e910fdbb32	fix failing tests	2019-07-31 01:04:07 -04:00
Nick Ethier	e15005bdcb	networking: Add new bridge networking mode implementation	2019-07-31 01:04:06 -04:00
Nick Ethier	c742f8b580	ar: cleanup lint errors	2019-07-31 01:03:18 -04:00
Nick Ethier	e20fa7ccc1	Add network lifecycle management Adds a new Prerun and Postrun hooks to manage set up of network namespaces on linux. Work still needs to be done to make the code platform agnostic and support Docker style network initalization.	2019-07-31 01:03:17 -04:00
Arshneet Singh	4eedab18a7	Add code for plan normalization	2019-04-23 09:18:01 -07:00
Preetha Appan	1323b4d5cc	Fix bug where scoring metadata would be overridden during an inplace upgrade.	2019-03-12 23:36:46 -05:00
Alex Dadgar	0953d913ed	Deprecate IOPS IOPS have been modelled as a resource since Nomad 0.1 but has never actually been detected and there is no plan in the short term to add detection. This is because IOPS is a bit simplistic of a unit to define the performance requirements from the underlying storage system. In its current state it adds unnecessary confusion and can be removed without impacting any users. This PR leaves IOPS defined at the jobspec parsing level and in the api/ resources since these are the two public uses of the field. These should be considered deprecated and only exist to allow users to stop using them during the Nomad 0.9.x release. In the future, there should be no expectation that the field will exist.	2018-12-06 15:09:26 -08:00
Alex Dadgar	5e67b37aad	use int64	2018-10-16 15:34:32 -07:00
Preetha Appan	3ca71ae935	Change CPU/Disk/MemoryMB to int everywhere in new resource structs	2018-10-16 16:21:42 -05:00
Alex Dadgar	0f2f4797cb	fixing tests	2018-10-04 14:26:19 -07:00
Alex Dadgar	49c2d4f775	Scheduler uses allocated resources	2018-10-02 17:08:25 -07:00
Alex Dadgar	260b566c91	server	2018-09-15 16:23:13 -07:00
Alex Dadgar	05d183bbab	Fix lost handling of not actually down nodes	2018-03-30 14:17:41 -07:00
Alex Dadgar	3099ef05e2	Unmark drain when nodes hit their deadline and only batch/system left and add all job type integration test	2018-03-28 17:25:58 -07:00
Alex Dadgar	95c3c637ba	Correct status desc on draining system allocs	2018-03-26 17:54:46 -07:00

1 2 3

138 Commits