nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-08 19:35:41 +03:00

Author	SHA1	Message	Date
Tim Gross	c49359ad58	scheduler: prevent panic in spread iterator during alloc stop. The spread iterator can panic when processing an evaluation, resulting in an unrecoverable state in the cluster. Whenever a panicked server restarts and quorum is restored, the next server to dequeue the evaluation will panic. To trigger this state: (1) the job must have `max_parallel = 0` and a `canary >= 1`; (2) the job must not have a `spread` block; (3) the job must have a previous version; (4) the previous version must have a `spread` block and at least one failed allocation. In this scenario, the desired changes include `(place 1+) (stop 1+), (ignore n) (canary 1)`. Before the scheduler can place the canary allocation, it tries to find out which allocations can be stopped. This passes back through the stack so that we can determine previous-node penalties, etc. We call `SetJob` on the stack with the previous version of the job, which will include assessing the `spread` block (even though the results are unused). The task group spread info state from that pass through the spread iterator is not reset when we call `SetJob` again. When the new job version iterates over the `groupPropertySets`, it will get an empty `spreadAttributeMap`, resulting in an unexpected nil pointer dereference. This changeset resets the spread iterator's internal state when setting the job, adds logging with a bypass around the bug in case we hit similar cases, and adds a test that panics the scheduler without the patch.	2022-02-09 19:53:06 -05:00
Preetha Appan	be897cadc3	More error->debug for logging in the bin packing iterator	2019-12-12 15:50:16 -06:00
Preetha Appan	566dd71486	Fix comment and assert score in test case	2019-05-15 12:35:57 -05:00
Nick Ethier	5709bf7b54	fix missing brace	2019-05-15 13:02:04 -04:00
Nick Ethier	ea843a507a	scheduler: add check to prohibit returning inf during spread boost calculation	2019-05-15 13:00:24 -04:00
Alex Dadgar	bc42873e07	Change types of weights on spread/affinity	2019-01-30 12:20:38 -08:00
Alex Dadgar	260b566c91	server	2018-09-15 16:23:13 -07:00
Preetha Appan	f6cbfbfef6	Track top k nodes by norm score rather than top k nodes per scorer	2018-09-04 16:10:11 -05:00
Preetha Appan	72570e0698	fix linting error	2018-09-04 16:10:11 -05:00
Preetha Appan	31b2102055	Fix scoring logic for uneven spread to incorporate current alloc count. Also addressed other small code review comments	2018-09-04 16:10:11 -05:00
Preetha Appan	1ac696da56	more cleanup	2018-09-04 16:10:11 -05:00
Preetha Appan	fc48be3656	added some unit tests for -1 spread score	2018-09-04 16:10:11 -05:00
Preetha Appan	f881c4f266	comment and formatting cleanup	2018-09-04 16:10:11 -05:00
Preetha Appan	2dfdd4874f	fix scoring algorithm when min count == current count	2018-09-04 16:10:11 -05:00
Preetha Appan	35bda8c975	Remove hardcoded boosts for even spread. Instead, calculate them based on delta between current and minimum value	2018-09-04 16:10:11 -05:00
Preetha Appan	7a5791f39e	Implement support for even spread across datacenters, with unit test	2018-09-04 16:10:11 -05:00
Preetha Appan	56de0d0a11	Support implicit spread target to account for remaining desired counts	2018-09-04 16:10:11 -05:00
Preetha Appan	5f1d40e4c3	fix comments	2018-09-04 16:10:11 -05:00
Preetha Appan	bf84a5985a	Include spreads configured at job level when precomputing weights/desired counts.	2018-09-04 16:10:11 -05:00
Preetha Appan	fd697272a7	Implement spread iterator that scores according to percentage of desired count in each target. Added this as a new step in the stack and some unit tests	2018-09-04 16:10:11 -05:00

20 Commits
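
The failure mode described in commit c49359ad58 (state left over from a previous `SetJob` call leading to a nil pointer dereference) can be illustrated with a small Go sketch. This is not the Nomad scheduler code: the `job`, `spreadDetails`, and `spreadIterator` types, the `reset` flag, and the `scoreAfterUpgrade` helper are all hypothetical simplifications, and only the names `SetJob`, `groupPropertySets`, and `spreadAttributeMap` are borrowed from the commit message. The `recover` in `Score` exists only so the demo can report the panic instead of crashing.

```go
package main

import "fmt"

// spreadDetails is a hypothetical stand-in for per-attribute spread state.
type spreadDetails struct {
	weight int
}

// job is a hypothetical, heavily simplified job: only its spread attributes.
type job struct {
	spreadAttrs []string
}

// spreadIterator keeps state derived from the job it was last given.
type spreadIterator struct {
	groupPropertySets  []string                  // attributes to score
	spreadAttributeMap map[string]*spreadDetails // details for the current job
}

// SetJob points the iterator at a job. With reset == false it mimics the
// bug: groupPropertySets left by a previous job survives the call.
func (it *spreadIterator) SetJob(j *job, reset bool) {
	if reset {
		it.groupPropertySets = nil
	}
	it.spreadAttributeMap = make(map[string]*spreadDetails)
	for _, attr := range j.spreadAttrs {
		it.groupPropertySets = append(it.groupPropertySets, attr)
		it.spreadAttributeMap[attr] = &spreadDetails{weight: 100}
	}
}

// Score sums the weights of all tracked attributes. The recover is only
// here so that the demo can report the panic as an error.
func (it *spreadIterator) Score() (total int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("scheduler panic: %v", r)
		}
	}()
	for _, attr := range it.groupPropertySets {
		details := it.spreadAttributeMap[attr] // nil for a stale attribute
		total += details.weight                // dereferences nil on stale state
	}
	return total, nil
}

// scoreAfterUpgrade sets a previous job version that has a spread, then a
// new version without one, and scores the new version.
func scoreAfterUpgrade(reset bool) (int, error) {
	it := &spreadIterator{}
	it.SetJob(&job{spreadAttrs: []string{"${node.datacenter}"}}, reset)
	it.SetJob(&job{}, reset)
	return it.Score()
}

func main() {
	_, err := scoreAfterUpgrade(false)
	fmt.Println("without reset:", err)

	total, err := scoreAfterUpgrade(true)
	fmt.Println("with reset:", total, err)
}
```

In this sketch, resetting the derived state at the top of `SetJob` is what keeps the second job version from seeing attributes that belong to the first. The commit message also mentions a logged bypass as an additional guard, which the sketch does not model.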