nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-01 16:05:42 +03:00

Author	SHA1	Message	Date
Piotr Kazmierczak	199d12865f	scheduler: isolate `feasibility` (#26031) This change isolates all the code that deals with node selection in the scheduler into its own package called feasible. Co-authored-by: Tim Gross <tgross@hashicorp.com>	2025-06-11 20:11:04 +02:00
Piotr Kazmierczak	58c6387323	stateful deployments: task group host volume claims API (#25114) This PR introduces API endpoints /v1/volumes/claims/ and /v1/volumes/claim/:id for listing and deleting task group host volume claims, respectively.	2025-02-25 15:51:59 +01:00
Piotr Kazmierczak	611452e1af	stateful deployments: use `TaskGroupVolumeClaim` table to associate volume requests with volume IDs (#24993) We introduce an alternative solution to the one presented in #24960 which is based on the state store and not previous-next allocation tracking in the reconciler. This new solution reduces cognitive complexity of the scheduler code at the cost of slightly more boilerplate code, but also opens up new possibilities in the future, e.g., allowing users to explicitly "un-stick" volumes with workloads still running. The new logic (originally shown as a diagram): in the scheduler, SetVolumes() sets ns, job, and tg, and hasVolumes() consults the state and returns true if there's a match or if there is no previous claim. In the state store, upsertAllocsImpl() checks if the alloc requests sticky vols and consults state, creating a claim if there is none, and DeleteJobTxn() removes the claim from the state. A TaskGroupVolumeClaim (namespace, jobID, tg name, vol ID) uniquely identifies a vol.	2025-02-07 17:41:01 +01:00
Piotr Kazmierczak	8cbb74786c	stateful deployments: find feasible node for sticky host volumes (#24558) This changeset implements node feasibility checks for sticky host volumes.	2024-12-19 09:25:55 -05:00
Tim Gross	3143019d85	dynamic host volumes: capabilities check during scheduling (#24617) Static host volumes have a simple readonly toggle, but dynamic host volumes have a more complex set of capabilities similar to CSI volumes. Update the feasibility checker to account for these capabilities and volume readiness. Also fixes a minor bug in the state store where a soft-delete (not yet implemented) could cause a volume to be marked ready again. This is needed to support testing the readiness checking in the scheduler. Ref: https://github.com/hashicorp/nomad/pull/24479	2024-12-19 09:25:54 -05:00
hashicorp-copywrite[bot]	a9d61ea3fd	Update copyright file headers to BUSL-1.1	2023-08-10 17:27:29 -05:00
Luiz Aoqui	f4c7182873	node pools: apply node pool scheduler configuration (#17598)	2023-06-21 20:31:50 -04:00
Tim Gross	9a6078a2ae	node pools: implement support in scheduler (#17443) Implement scheduler support for node pools: * When a scheduler is invoked, we get a set of the ready nodes in the DCs that are allowed for that job. Extend the filter to include the node pool. * Ensure that changes to a job's node pool are picked up as destructive allocation updates. * Add `NodesInPool` as a metric to all reporting done by the scheduler. * Add the node-in-pool filter to the `Node.Register` RPC so that we don't generate spurious evals for nodes in the wrong pool.	2023-06-07 10:39:03 -04:00
hashicorp-copywrite[bot]	f005448366	[COMPLIANCE] Add Copyright and License Headers	2023-04-10 15:36:59 +00:00
Tim Gross	1b434184e4	make version checks specific to region (1.4.x) (#14912) * One-time tokens are not replicated between regions, so we don't want to enforce the version check across all of serf, just members in the same region. * Scheduler: Disconnected clients handling is specific to a single region, so we don't want to enforce the version check across all of serf, just members in the same region. * Variables: enforce version check in Apply RPC * Cleans up a bunch of legacy checks. This changeset is specific to 1.4.x and the changes for previous versions of Nomad will be manually backported in a separate PR.	2022-10-17 16:23:51 -04:00
Derek Strickland	6329f44148	disconnected clients: ensure servers meet minimum required version (#12202) * planner: expose ServerMeetsMinimumVersion via Planner interface * filterByTainted: add flag indicating disconnect support * allocReconciler: accept and pass disconnect support flag * tests: update dependent tests	2022-04-05 17:12:23 -04:00
Tim Gross	b0b7a49439	scheduler: seed random shuffle nodes with eval ID (#12008) Processing an evaluation is nearly a pure function over the state snapshot, but we randomly shuffle the nodes. This means that developers can't take a given state snapshot and pass an evaluation through it and be guaranteed the same plan results. But the evaluation ID is already random, so if we use this as the seed for shuffling the nodes we can greatly reduce the sources of non-determinism. Unfortunately golang map iteration uses a global source of randomness and not a goroutine-local one, but arguably if the scheduler behavior is impacted by this, that's a bug in the iteration.	2022-02-08 12:16:33 -05:00
Luiz Aoqui	8a427a470a	scheduler: detect and log unexpected scheduling collisions (#11793)	2022-01-14 20:09:14 -05:00
Seth Hoenig	61ee443ee6	core: implement system batch scheduler This PR implements a new "System Batch" scheduler type. Jobs can make use of this new scheduler by setting their type to 'sysbatch'. Like the name implies, sysbatch can be thought of as a hybrid between system and batch jobs - it is for running short lived jobs intended to run on every compatible node in the cluster. As with batch jobs, sysbatch jobs can also be periodic and/or parameterized dispatch jobs. A sysbatch job is considered complete when it has been run on all compatible nodes until reaching a terminal state (success or failed on retries). Feasibility and preemption are governed the same as with system jobs. In this PR, the update stanza is not yet supported. The update stanza is still limited in functionality for the underlying system scheduler, and is not useful yet for sysbatch jobs. Further work in #4740 will improve support for the update stanza and deployments. Closes #2527	2021-08-03 10:30:47 -04:00
Tim Gross	a1eaad9cf7	CSI: remove prefix matching from CSIVolumeByID and fix CLI prefix matching (#10158) Callers of `CSIVolumeByID` are generally assuming they should receive a single volume. This potentially results in feasibility checking being performed against the wrong volume if a volume's ID is a prefix substring of another volume's ID (for example: "test" and "testing"). Removing the incorrect prefix matching from `CSIVolumeByID` breaks prefix matching in the command line client. Add the required elements for prefix matching to the commands and API.	2021-03-18 14:32:40 -04:00
Mahmood Ali	5720266c91	Respect alloc job version for lost/failed allocs This change fixes a bug where lost/failed allocations are replaced by allocations with the latest versions, even if the version hasn't been promoted yet. Now, when generating a plan for lost/failed allocations, the scheduler first checks if the current deployment is in Canary stage, and if so, it ensures that any lost/failed allocation is replaced by one with the latest promoted version instead.	2020-08-19 09:52:48 -04:00
Lang Martin	5b010fab10	csi: use node MaxVolumes during scheduling (#7565) * nomad/state/state_store: CSIVolumesByNodeID ignores namespace * scheduler/scheduler: add CSIVolumesByNodeID to the state interface * scheduler/feasible: check node MaxVolumes * nomad/csi_endpoint: no namespace in CSIVolumesByNodeID anymore * nomad/state/state_store: avoid DenormalizeAllocationSlice * nomad/state/iterator: clean up SliceIterator Next * scheduler/feasible_test: block with MaxVolumes * nomad/state/state_store_test: fix args to CSIVolumesByNodeID	2020-03-31 17:16:47 -04:00
Lang Martin	ce9dbe619f	csi: the scheduler allows a job with a volume write claim to be updated (#7438) * nomad/structs/csi: split CanWrite into health, in use * scheduler/scheduler: expose AllocByID in the state interface * nomad/state/state_store_test * scheduler/stack: SetJobID on the matcher * scheduler/feasible: when a volume writer is in use, check if it's us * scheduler/feasible: remove SetJob * nomad/state/state_store: denormalize allocs before Claim * nomad/structs/csi: return errors on claim, with context * nomad/csi_endpoint_test: new alloc doesn't look like an update * nomad/state/state_store_test: change test reference to CanWrite	2020-03-23 21:21:04 -04:00
Lang Martin	9c9a0c5eb5	csi: volume ids are only unique per namespace (#7358) * nomad/state/schema: use the namespace compound index * scheduler/scheduler: CSIVolumeByID interface signature namespace * scheduler/stack: SetJob on CSIVolumeChecker to capture namespace * scheduler/feasible: pass the captured namespace to CSIVolumeByID * nomad/state/state_store: use namespace in csi_volume index * nomad/fsm: pass namespace to CSIVolumeDeregister & Claim * nomad/core_sched: pass the namespace in volumeClaimReap * nomad/node_endpoint_test: namespaces in Claim testing * nomad/csi_endpoint: pass RequestNamespace to state.* * nomad/csi_endpoint_test: appropriately failed test * command/alloc_status_test: appropriately failed test * node_endpoint_test: avoid notTheNamespace for the job * scheduler/feasible_test: call SetJob to capture the namespace * nomad/csi_endpoint: ACL check the req namespace, query by namespace * nomad/state/state_store: remove deregister namespace check * nomad/state/state_store: remove unused CSIVolumes * scheduler/feasible: CSIVolumeChecker SetJob -> SetNamespace * nomad/csi_endpoint: ACL check * nomad/state/state_store_test: remove call to state.CSIVolumes * nomad/core_sched_test: job namespace match so claim gc works	2020-03-23 13:59:25 -04:00
Lang Martin	f370e25843	CSI: Scheduler knows about CSI constraints and availability (#6995) * structs: piggyback csi volumes on host volumes for job specs * state_store: CSIVolumeByID always includes plugins, matches usecase * scheduler/feasible: csi volume checker * scheduler/stack: add csi volumes * contributing: update rpc checklist * scheduler: add volumes to State interface * scheduler/feasible: introduce new checker collection tgAvailable * scheduler/stack: taskGroupCSIVolumes checker is transient * state_store CSIVolumeDenormalizePlugins comment clarity * structs: remove TODO comment in TaskGroup Validate * scheduler/feasible: CSIVolumeChecker hasPlugins improve comment * scheduler/feasible_test: set t.Parallel * Update nomad/state/state_store.go Co-Authored-By: Danielle <dani@hashicorp.com> * Update scheduler/feasible.go Co-Authored-By: Danielle <dani@hashicorp.com> * structs: lift ControllerRequired to each volume * state_store: store plug.ControllerRequired, use it for volume health * feasible: csi match fast path remove stale host volume copied logic * scheduler/feasible: improve comments Co-authored-by: Danielle <dani@builds.terrible.systems>	2020-03-23 13:58:29 -04:00
Alex Dadgar	95297c608c	goimports	2019-01-22 15:44:31 -08:00
Preetha Appan	2143fa2ab7	Use scheduler config from state store to enable/disable preemption	2018-10-30 11:06:32 -05:00
Alex Dadgar	260b566c91	server	2018-09-15 16:23:13 -07:00
Josh Soref	a3a4bdb9ae	spelling: commits	2018-03-11 17:47:45 +00:00
Alex Dadgar	f6fbb36054	sync	2017-10-13 14:36:02 -07:00
Alex Dadgar	ac1539d5d9	Sync namespace changes	2017-09-07 17:04:21 -07:00
Alex Dadgar	bb6524151a	cancel deployments	2017-07-07 12:01:17 -07:00
Alex Dadgar	5b75b29af4	Nomad builds	2017-02-07 20:31:23 -08:00
Diptanu Choudhury	5eb4e8adb3	Making the status command return the allocs of currently registered job	2016-11-24 16:31:30 +01:00
Alex Dadgar	962f4d4fe6	Add scheduler version enforcement	2016-10-26 14:52:48 -07:00
Alex Dadgar	08b2291313	Reuse the same evaluation and reblock it until there is no more work to do	2016-05-24 20:10:56 -07:00
Armon Dadgar	bc9423e61b	scheduler: Use AllocsByNodeTerminal to avoid filtering	2016-02-20 11:29:15 -08:00
Alex Dadgar	4718a05d07	Revert "dont hard code scheduler type name" This reverts commit `fb0e0dfc2b`.	2015-10-23 16:32:45 -07:00
Alex Dadgar	fb0e0dfc2b	dont hard code scheduler type name	2015-10-23 16:31:45 -07:00
Alex Dadgar	b24f48a4ed	System scheduler and system stack	2015-10-14 18:39:44 -07:00
Armon Dadgar	1b2cb12312	scheduler: Adding CreateEval to Planner	2015-09-07 14:26:29 -07:00
Armon Dadgar	b566efd781	nomad: unifying the state store API	2015-09-06 20:56:38 -07:00
Armon Dadgar	50fd49cdbc	nomad: expose UpdateEval as a planner	2015-08-15 14:25:00 -07:00
Armon Dadgar	5f235181c8	nomad: remove NodesByDatacenterStatus	2015-08-15 13:11:42 -07:00
Armon Dadgar	9217fb4347	scheduler: ServiceScheduler is now GenericScheduler with service and batch modes	2015-08-13 22:28:37 -07:00
Armon Dadgar	ec4712d157	nomad: adding NodesByDatacenterStatus	2015-08-13 13:17:03 -07:00
Armon Dadgar	96228795fe	scheduler: working on bin pack	2015-08-13 11:54:59 -07:00
Armon Dadgar	85938507aa	scheduler: working on job updates	2015-08-11 16:41:48 -07:00
Armon Dadgar	9c17cceaa3	scheduler: check node status before evicting	2015-08-11 14:04:04 -07:00
Armon Dadgar	94454cbb3b	scheduler: first pass at job deregister	2015-08-06 17:46:14 -07:00
Armon Dadgar	95c4081311	scheduler: adding service scheduler definition	2015-08-06 17:25:14 -07:00
Armon Dadgar	64aeb26128	nomad: integrating worker and scheduler	2015-07-28 16:15:32 -07:00
Armon Dadgar	447a7f8b47	scheduler: initial interface	2015-07-28 16:03:15 -07:00

48 Commits