50 Commits

Author SHA1 Message Date
Mahmood Ali
5720266c91 Respect alloc job version for lost/failed allocs
This change fixes a bug where lost/failed allocations are replaced by
allocations with the latest versions, even if the version hasn't been
promoted yet.

Now, when generating a plan for lost/failed allocations, the scheduler
first checks if the current deployment is in the Canary stage, and if so, it
ensures that any lost/failed allocation is replaced by one with the latest
promoted version instead.
2020-08-19 09:52:48 -04:00
Nick Ethier
60c301758c scheduler: do network feasibility checking for system jobs (#8256) 2020-06-24 16:01:00 -04:00
Nick Ethier
ad8ced3873 multi-interface network support 2020-06-19 09:42:10 -04:00
Nick Ethier
33ce12cda9 CNI Implementation (#7518) 2020-06-18 11:05:29 -07:00
Mahmood Ali
9f11857ad1 Open source Preemption code
Nomad 0.12 OSS is to include preemption feature.

This commit moves the private code for managing preemption to OSS
repository.
2020-05-27 15:02:01 -04:00
Mahmood Ali
5078e0cfed tests and some clean up 2020-05-01 13:13:30 -04:00
Charlie Voiselle
1af6a2adf1 Wiring algorithm to scheduler calls 2020-05-01 13:13:29 -04:00
Michael Schurter
a61a775b62 core: fix comment on system stack
This makes me do a double take every time I run into it, so what if we
just changed it?
2020-04-09 15:19:11 -07:00
Lang Martin
ce9dbe619f csi: the scheduler allows a job with a volume write claim to be updated (#7438)
* nomad/structs/csi: split CanWrite into health, in use

* scheduler/scheduler: expose AllocByID in the state interface

* nomad/state/state_store_test

* scheduler/stack: SetJobID on the matcher

* scheduler/feasible: when a volume writer is in use, check if it's us

* scheduler/feasible: remove SetJob

* nomad/state/state_store: denormalize allocs before Claim

* nomad/structs/csi: return errors on claim, with context

* nomad/csi_endpoint_test: new alloc doesn't look like an update

* nomad/state/state_store_test: change test reference to CanWrite
2020-03-23 21:21:04 -04:00
Lang Martin
9c9a0c5eb5 csi: volume ids are only unique per namespace (#7358)
* nomad/state/schema: use the namespace compound index

* scheduler/scheduler: CSIVolumeByID interface signature namespace

* scheduler/stack: SetJob on CSIVolumeChecker to capture namespace

* scheduler/feasible: pass the captured namespace to CSIVolumeByID

* nomad/state/state_store: use namespace in csi_volume index

* nomad/fsm: pass namespace to CSIVolumeDeregister & Claim

* nomad/core_sched: pass the namespace in volumeClaimReap

* nomad/node_endpoint_test: namespaces in Claim testing

* nomad/csi_endpoint: pass RequestNamespace to state.*

* nomad/csi_endpoint_test: appropriately failed test

* command/alloc_status_test: appropriately failed test

* node_endpoint_test: avoid notTheNamespace for the job

* scheduler/feasible_test: call SetJob to capture the namespace

* nomad/csi_endpoint: ACL check the req namespace, query by namespace

* nomad/state/state_store: remove deregister namespace check

* nomad/state/state_store: remove unused CSIVolumes

* scheduler/feasible: CSIVolumeChecker SetJob -> SetNamespace

* nomad/csi_endpoint: ACL check

* nomad/state/state_store_test: remove call to state.CSIVolumes

* nomad/core_sched_test: job namespace match so claim gc works
2020-03-23 13:59:25 -04:00
Lang Martin
f370e25843 CSI: Scheduler knows about CSI constraints and availability (#6995)
* structs: piggyback csi volumes on host volumes for job specs

* state_store: CSIVolumeByID always includes plugins, matches usecase

* scheduler/feasible: csi volume checker

* scheduler/stack: add csi volumes

* contributing: update rpc checklist

* scheduler: add volumes to State interface

* scheduler/feasible: introduce new checker collection tgAvailable

* scheduler/stack: taskGroupCSIVolumes checker is transient

* state_store CSIVolumeDenormalizePlugins comment clarity

* structs: remove TODO comment in TaskGroup Validate

* scheduler/feasible: CSIVolumeChecker hasPlugins improve comment

* scheduler/feasible_test: set t.Parallel

* Update nomad/state/state_store.go

Co-Authored-By: Danielle <dani@hashicorp.com>

* Update scheduler/feasible.go

Co-Authored-By: Danielle <dani@hashicorp.com>

* structs: lift ControllerRequired to each volume

* state_store: store plug.ControllerRequired, use it for volume health

* feasible: csi match fast path remove stale host volume copied logic

* scheduler/feasible: improve comments

Co-authored-by: Danielle <dani@builds.terrible.systems>
2020-03-23 13:58:29 -04:00
Danielle Lancashire
709abbc675 scheduler: Add a feasibility checker for Host Vols 2019-08-12 15:39:08 +02:00
Preetha Appan
4743561396 Refactor scheduler package to enable preemption for batch/service jobs 2019-04-10 20:24:01 -05:00
Preetha Appan
6966e3c3e8 Make preemption config a struct to allow for enabling based on scheduler type 2018-10-30 11:06:32 -05:00
Preetha Appan
2143fa2ab7 Use scheduler config from state store to enable/disable preemption 2018-10-30 11:06:32 -05:00
Alex Dadgar
670c7e57dc add to stack 2018-10-13 12:27:49 -07:00
Alex Dadgar
49c2d4f775 Scheduler uses allocated resources 2018-10-02 17:08:25 -07:00
Preetha Appan
fd697272a7 Implement spread iterator that scores according to percentage of desired count in each target.
Added this as a new step in the stack and some unit tests
2018-09-04 16:10:11 -05:00
Preetha Appan
b5042067e7 Remove unnecessary reset 2018-09-04 16:10:11 -05:00
Preetha Appan
00924555a8 Implement affinity support in generic scheduler 2018-09-04 16:10:11 -05:00
Preetha Appan
a2cdb5d6c0 Add more clarification in comment 2018-01-31 09:58:05 -06:00
Preetha Appan
8d1395ea16 Better score threshold 2018-01-31 09:58:05 -06:00
Preetha Appan
3429dfa716 Limit iterator uses a score threshold and a maxSkip value to be able to skip lower scoring nodes 2018-01-31 09:58:05 -06:00
Preetha Appan
4cbef07d37 Prevent side effect modification of select options when preferred nodes are set 2018-01-31 09:56:53 -06:00
Preetha Appan
c6c0741bd8 Add helper methods, use require and other code review feedback 2018-01-31 09:56:53 -06:00
Preetha Appan
5ecb7895bb Reschedule previous allocs and track their reschedule attempts 2018-01-31 09:56:53 -06:00
Alex Dadgar
f6fbb36054 sync 2017-10-13 14:36:02 -07:00
Alex Dadgar
653a1c37f6 Split distinct property and host iterator and add iterator to system stack 2017-03-08 19:00:10 -08:00
Alex Dadgar
e2ee3f4904 Double the anti-affinity for placing same task group on node 2017-03-06 11:52:53 -08:00
Diptanu Choudhury
a6e0077f72 Implemented SetPrefferingNodes in stack 2016-08-30 16:17:50 -07:00
Diptanu Choudhury
7da66e169c Making the scheduler use LocalDisk instead of Resources.DiskMB 2016-08-25 12:27:42 -05:00
Alex Dadgar
d487295960 Fix computed class when the job has multiple task groups 2016-02-03 21:22:18 -08:00
Alex Dadgar
450252f8ae Respond to comments 2016-01-26 16:43:42 -08:00
Alex Dadgar
0ad3575897 FeasibilityWrapper uses computed node class eligibility to call feasibility checks minimally 2016-01-26 15:16:43 -08:00
Alex Dadgar
2ab5790b6f Rename Dynamic -> ProposedAllocConstraintIterator 2015-10-26 14:12:54 -07:00
Alex Dadgar
9572878a92 Add dynamic constraint to generic_scheduler 2015-10-22 15:09:03 -07:00
Alex Dadgar
2405101328 Remove base nodes from stack constructors 2015-10-16 17:05:23 -07:00
Alex Dadgar
7feb5f1978 Refactor task group constraint logic in generic/system stack 2015-10-16 14:00:51 -07:00
Alex Dadgar
b24f48a4ed System scheduler and system stack 2015-10-14 18:39:44 -07:00
Armon Dadgar
5fb980bc53 scheduler: do not skip job anti-affinity 2015-09-22 22:20:07 -07:00
Armon Dadgar
ca67742fbb scheduler: thread through the TaskResources 2015-09-13 15:20:50 -07:00
Armon Dadgar
924bf123a1 scheduler: binpacker makes network offers 2015-09-13 14:31:32 -07:00
Armon Dadgar
40b84e3023 scheduler: recompute scan limit on SetNodes 2015-09-11 12:03:41 -07:00
Armon Dadgar
efdf717991 scheduler: allow updating the base nodes 2015-09-07 11:30:13 -07:00
Armon Dadgar
f2327acbe1 scheduler: adding job anti-affinity to the generic stack 2015-08-16 10:37:11 -07:00
Armon Dadgar
6a39f5b5da scheduler: adding minor specialization for batch 2015-08-13 22:35:48 -07:00
Armon Dadgar
93fa71609c scheduler: basic metrics integration 2015-08-13 21:46:33 -07:00
Armon Dadgar
4ef65787a8 scheduler: simplify stack implementation 2015-08-13 18:44:27 -07:00
Armon Dadgar
6b1ad69da4 scheduler: thread size through 2015-08-13 18:36:13 -07:00
Armon Dadgar
bb39f03ac5 scheduler: refactor stack out 2015-08-13 17:48:26 -07:00