Commit Graph

616 Commits

Danielle Lancashire
ab5ba7aa9b config: Hoist volume.config.source into volume
Currently, using a Volume in a job uses the following configuration:

```
volume "alias-name" {
  type = "volume-type"
  read_only = true

  config {
    source = "host_volume_name"
  }
}
```

This commit migrates to the following:

```
volume "alias-name" {
  type = "volume-type"
  source = "host_volume_name"
  read_only = true
}
```

The original design was chosen because we were uncertain about the future of
storage plugins and wanted to allow maximum flexibility.

However, this causes a few issues, namely:
- We frequently need to parse this configuration during submission,
scheduling, and mounting
- It complicates the configuration from an end user's perspective
- It complicates validation

As we understand the problem space of CSI a little better, it has become
clear that `source` doesn't need to live in `config`, as it will be
used in the majority of cases:

- Host Volumes: Always need a source
- Preallocated CSI Volumes: Always need a source from a volume or claim name
- Dynamic Persistent CSI Volumes*: Always need a source to attach the volumes
                                   to, for managing upgrades and avoiding
                                   dangling volumes.
- Dynamic Ephemeral CSI Volumes*: Less thought out, but `source` will probably
                                  point to the plugin name, and a `config` block
                                  will allow you to pass metadata to the plugin,
                                  or it will point to a pre-configured ephemeral
                                  config.
*If implemented

The new design simplifies this by merging the source into the volume
stanza to solve the above issues with usability, performance, and error
handling.
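
For reference, a simplified sketch of the flattened request shape this
results in (illustrative only; the real struct lives in nomad/structs and
may differ):

```
// VolumeRequest sketches the flattened shape: source is a top-level
// field rather than a nested config entry.
type VolumeRequest struct {
	Name     string // the "alias-name" label
	Type     string // e.g. "host"
	Source   string // e.g. "host_volume_name", now top-level
	ReadOnly bool
}
```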
2019-09-13 04:37:59 +02:00
Preetha Appan
654c72a7b4 update comment 2019-09-05 18:43:30 -05:00
Preetha Appan
87e998d043 Fix inplace updates bug with group level networks
During inplace updates, we should be using network information
from the previous allocation being updated.
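
Roughly (hypothetical, simplified types; not Nomad's real structs), the fix
amounts to carrying the old assignment forward:

```
type networkResource struct {
	IP    string
	MBits int
}

type allocation struct {
	ID       string
	Networks []networkResource
}

// inplaceUpdate keeps the network assignment of the allocation being
// replaced instead of asking the network index for a fresh one.
func inplaceUpdate(prev allocation, updated *allocation) {
	updated.Networks = prev.Networks
}
```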
2019-09-05 18:37:24 -05:00
Jasmine Dahilig
c346a47b5b add default update stanza and max_parallel=0 disables deployments (#6191) 2019-09-02 10:30:09 -07:00
Mahmood Ali
8a0647c9cf schedulers: check all drivers on node
When checking driver feasibility for an alloc with multiple drivers, we
must check that all drivers are detected and healthy.

Nomad 0.8 and 0.9 have a bug where we may check only a single driver,
and which driver gets checked depends on map traversal order, which is
unspecified by the Go spec.
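
A minimal sketch of the corrected check (hypothetical types): feasibility
must hold for every driver the task group uses, not just whichever one map
iteration happens to visit first:

```
type driverInfo struct {
	Detected bool
	Healthy  bool
}

// driversFeasible returns true only if every required driver is both
// detected and healthy on the node.
func driversFeasible(node map[string]driverInfo, required []string) bool {
	for _, d := range required {
		info, ok := node[d]
		if !ok || !info.Detected || !info.Healthy {
			return false
		}
	}
	return true
}
```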
2019-08-29 09:03:31 -04:00
Mahmood Ali
542d17e745 scheduler: tests for multiple drivers in TG 2019-08-29 09:03:31 -04:00
Danielle Lancashire
41292055de scheduler: Implicit constraint on readonly hostvol
When a client declares a volume as read-only, we should only schedule it
for read-only requests. With this change, if a host exposes a read-only
volume, we validate that the group-level requests for that volume are all
read-only for that host.
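
Sketched out (hypothetical types, not the actual checker in
scheduler/feasible.go):

```
type hostVolume struct {
	ReadOnly bool
}

type volumeRequest struct {
	Source   string
	ReadOnly bool
}

// volumesFeasible rejects a node when any request targets a missing
// volume, or asks for read-write access to a read-only host volume.
func volumesFeasible(host map[string]hostVolume, reqs []volumeRequest) bool {
	for _, req := range reqs {
		vol, ok := host[req.Source]
		if !ok {
			return false
		}
		if vol.ReadOnly && !req.ReadOnly {
			return false
		}
	}
	return true
}
```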
2019-08-21 20:57:05 +02:00
Danielle Lancashire
af5d42c058 structs: Unify Volume and VolumeRequest 2019-08-12 15:39:08 +02:00
Danielle
0f5cf5fa91 Update scheduler/feasible.go
Co-Authored-By: Mahmood Ali <mahmood@hashicorp.com>
2019-08-12 15:39:08 +02:00
Danielle Lancashire
709abbc675 scheduler: Add a feasability checker for Host Vols 2019-08-12 15:39:08 +02:00
Preetha Appan
5a1dd79179 Code review feedback 2019-07-31 01:04:08 -04:00
Preetha Appan
b561816343 Scheduler changes to support network at task group level
Also includes unit tests for the binpacker and preemption.
The tests verify that network resources specified at the
task group level are properly accounted for.
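
As a rough illustration of the accounting (hypothetical types, not the
scheduler's actual code), group-level networks are summed together with
task-level ones when bin-packing:

```
type networkResource struct {
	MBits int
}

// totalMBits sums bandwidth across any number of network lists, so
// group-level networks count against a node alongside task-level ones.
func totalMBits(lists ...[]networkResource) int {
	sum := 0
	for _, nets := range lists {
		for _, n := range nets {
			sum += n.MBits
		}
	}
	return sum
}
```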
2019-07-31 01:04:08 -04:00
Nick Ethier
4cb99a1112 scheduler: fix disk constraints 2019-07-31 01:04:08 -04:00
Nick Ethier
e910fdbb32 fix failing tests 2019-07-31 01:04:07 -04:00
Nick Ethier
e15005bdcb networking: Add new bridge networking mode implementation 2019-07-31 01:04:06 -04:00
Nick Ethier
c742f8b580 ar: cleanup lint errors 2019-07-31 01:03:18 -04:00
Nick Ethier
e20fa7ccc1 Add network lifecycle management
Adds new Prerun and Postrun hooks to manage setup of network namespaces
on Linux. Work still needs to be done to make the code platform agnostic
and to support Docker-style network initialization.
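
The hook shape is roughly as follows (an illustrative interface; Nomad's
actual hook interfaces may differ):

```
// A network hook would implement both interfaces: Prerun creates the
// network namespace before any task starts, and Postrun tears it down
// after all tasks have exited.
type prerunHook interface {
	Name() string
	Prerun() error
}

type postrunHook interface {
	Name() string
	Postrun() error
}
```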
2019-07-31 01:03:17 -04:00
Lang Martin
2d8bfb8d11 system_sched submits failed evals as blocked 2019-07-18 10:32:12 -04:00
Preetha Appan
bead05f05f Fix more tests 2019-06-26 16:30:53 -05:00
Preetha Appan
913427428a Remove compat code associated with many previous versions of nomad
This removes compat code for namespaces (0.7), Drain (0.8), and other
features from older Nomad releases.
2019-06-25 19:05:25 -05:00
Mahmood Ali
2899991ccd Merge pull request #5790 from hashicorp/b-reschedule-desired-state
Mark rescheduled allocs as stopped.
2019-06-13 17:28:59 -04:00
Mahmood Ali
25b44b18db Test behavior no reschedule for service/batch jobs 2019-06-13 16:41:19 -04:00
Mahmood Ali
34a66835db Don't stop rescheduleLater allocations
When an alloc is due to be rescheduled later (rescheduleLater), it goes
through the reconciler twice: once to be ignored with a follow-up eval,
and once again when processing the follow-up eval, where it appears as
rescheduleNow.

Here, we ignore it in the first run and mark it as stopped in the second
iteration, rather than stopping it twice.
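
In pseudocode form (illustrative only, not the reconciler's real API):

```
// decide sketches the reconciler's handling of a failed alloc: on the
// first pass a rescheduleLater alloc is ignored (a follow-up eval is
// created); the follow-up eval sees it as rescheduleNow and stops it.
func decide(rescheduleNow, rescheduleLater bool) string {
	switch {
	case rescheduleNow:
		return "stop" // stopped exactly once, by the follow-up eval
	case rescheduleLater:
		return "ignore" // follow-up eval created; don't stop yet
	default:
		return "ignore"
	}
}
```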
2019-06-13 09:44:41 -04:00
Mahmood Ali
d342a24ba0 Only preempt for network when there is a network
When examining preemption for networks, only consider allocs that have
networks.
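
Roughly (hypothetical types):

```
type allocation struct {
	ID       string
	Networks []string
}

// networkPreemptionCandidates filters out allocs with no networks;
// preempting them can never free network resources.
func networkPreemptionCandidates(allocs []allocation) []allocation {
	var out []allocation
	for _, a := range allocs {
		if len(a.Networks) > 0 {
			out = append(out, a)
		}
	}
	return out
}
```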

Fixes https://github.com/hashicorp/nomad/issues/5793
2019-06-07 18:55:55 -04:00
Mahmood Ali
2808674fac test: add tests for network devices and preemption 2019-06-07 18:55:02 -04:00
Mahmood Ali
c62c246ad9 Stop allocs to be rescheduled
Currently, when an alloc fails and is rescheduled, the alloc's desired
state remains "run" and the Nomad client may not free its resources.

Here, we ensure that an alloc is marked as stopped when it's
rescheduled.
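
In essence (field names mirror the CLI output below; the real update
happens inside the scheduler and may differ):

```
type allocation struct {
	DesiredStatus      string
	DesiredDescription string
}

// markRescheduled records that the alloc was replaced, so the client
// knows to stop it and free its resources.
func markRescheduled(a *allocation) {
	a.DesiredStatus = "stop"
	a.DesiredDescription = "alloc was rescheduled because it failed"
}
```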

Notice the Desired Status and Description before and after this change:

Before:
```
mars-2:nomad notnoop$ nomad alloc status 02aba49e
ID                   = 02aba49e
Eval ID              = bb9ed1d2
Name                 = example-reschedule.nodes[0]
Node ID              = 5853d547
Node Name            = mars-2.local
Job ID               = example-reschedule
Job Version          = 0
Client Status        = failed
Client Description   = Failed tasks
Desired Status       = run
Desired Description  = <none>
Created              = 10s ago
Modified             = 5s ago
Replacement Alloc ID = d6bf872b

Task "payload" is "dead"
Task Resources
CPU        Memory          Disk     Addresses
0/100 MHz  24 MiB/300 MiB  300 MiB

Task Events:
Started At     = 2019-06-06T21:12:45Z
Finished At    = 2019-06-06T21:12:50Z
Total Restarts = 0
Last Restart   = N/A

Recent Events:
Time                       Type            Description
2019-06-06T17:12:50-04:00  Not Restarting  Policy allows no restarts
2019-06-06T17:12:50-04:00  Terminated      Exit Code: 1
2019-06-06T17:12:45-04:00  Started         Task started by client
2019-06-06T17:12:45-04:00  Task Setup      Building Task Directory
2019-06-06T17:12:45-04:00  Received        Task received by client

```

After:

```
ID                   = 5001ccd1
Eval ID              = 53507a02
Name                 = example-reschedule.nodes[0]
Node ID              = a3b04364
Node Name            = mars-2.local
Job ID               = example-reschedule
Job Version          = 0
Client Status        = failed
Client Description   = Failed tasks
Desired Status       = stop
Desired Description  = alloc was rescheduled because it failed
Created              = 13s ago
Modified             = 3s ago
Replacement Alloc ID = 7ba7ac20

Task "payload" is "dead"
Task Resources
CPU         Memory          Disk     Addresses
21/100 MHz  24 MiB/300 MiB  300 MiB

Task Events:
Started At     = 2019-06-06T21:22:50Z
Finished At    = 2019-06-06T21:22:55Z
Total Restarts = 0
Last Restart   = N/A

Recent Events:
Time                       Type            Description
2019-06-06T17:22:55-04:00  Not Restarting  Policy allows no restarts
2019-06-06T17:22:55-04:00  Terminated      Exit Code: 1
2019-06-06T17:22:50-04:00  Started         Task started by client
2019-06-06T17:22:50-04:00  Task Setup      Building Task Directory
2019-06-06T17:22:50-04:00  Received        Task received by client
```
2019-06-06 17:27:12 -04:00
Mahmood Ali
5574b2f3d0 tests: Migrated allocs aren't lost
Fix `TestServiceSched_NodeDown` so it checks that migrated allocs are
actually marked to be stopped.

The boolean logic in the test made it skip checking client status as
long as the desired status was stop.

Here, we mark some jobs for migration while leaving others running, and
we check that the lost flag is only set for non-migrated allocs.
2019-06-06 16:05:07 -04:00
Lang Martin
21dccdf8dd describe a pending deployment with auto_promote accurately 2019-05-22 12:32:08 -04:00
Lang Martin
2165f8be94 sched reconcile copy AutoPromote to DeploymentState 2019-05-22 12:32:08 -04:00
Preetha Appan
566dd71486 Fix comment and assert score in test case 2019-05-15 12:35:57 -05:00
Nick Ethier
5709bf7b54 fix missing brace 2019-05-15 13:02:04 -04:00
Nick Ethier
ea843a507a scheduler: add check to prohibit returning inf during spread boost calculation 2019-05-15 13:00:24 -04:00
Lang Martin
b1228536e8 system_sched & test cleanup comments 2019-05-01 12:25:26 -04:00
Lang Martin
d1308420b2 system_sched_test extend the test to check ineligible nodes 2019-05-01 12:25:26 -04:00
Lang Martin
8cc8fc6200 system_sched when a node is filtered, don't mark failure 2019-05-01 12:25:26 -04:00
Lang Martin
178092591b system_sched_test create partially constrained job 2019-05-01 12:25:26 -04:00
Arshneet Singh
ee268a58db Add comments to functions, and use require instead of assert 2019-04-23 09:57:21 -07:00
Arshneet Singh
97686e371f Remove allowPlanOptimization from schedulers 2019-04-23 09:18:02 -07:00
Arshneet Singh
02b832c3ff Compat tags 2019-04-23 09:18:01 -07:00
Arshneet Singh
f75c6b4bdb Add tests for plan normalization 2019-04-23 09:18:01 -07:00
Arshneet Singh
4eedab18a7 Add code for plan normalization 2019-04-23 09:18:01 -07:00
Danielle
9a4fe5e98f Merge pull request #5512 from hashicorp/dani/f-alloc-stop
alloc-lifecycle: nomad alloc stop
2019-04-23 13:05:08 +02:00
Danielle Lancashire
bb142af5d6 allocs: Add nomad alloc stop
This adds a `nomad alloc stop` command that can be used to stop and
force-migrate an allocation to a different node.

This is built on top of the AllocUpdateDesiredTransitionRequest and
explicitly limits the scope of access to that transition to expose it
under the alloc-lifecycle ACL.

The API returns the follow-up eval that can be used as part of
monitoring in the CLI or parsed and used by an external tool.
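
For example (the alloc ID is illustrative):

```
$ nomad alloc stop 02aba49e
```

The returned follow-up eval can then be monitored in the CLI or consumed
by an external tool.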
2019-04-23 12:50:23 +02:00
Preetha Appan
a134c16c22 remove stray new line 2019-04-12 10:32:48 -05:00
Preetha Appan
4743561396 Refactor scheduler package to enable preemption for batch/service jobs 2019-04-10 20:24:01 -05:00
James Rasell
ee92bf86d8 Add NodeName to the alloc/job status outputs.
Currently, when operators need to log onto a machine where an alloc is
running, they have to perform an alloc/job status call and then a second
call to discover the node name from the node list.

This updates both the job status and alloc status output to include the
node name, making this workflow easier for operators.

Closes #2359
Closes #1180
2019-04-10 10:34:10 -05:00
Preetha Appan
1323b4d5cc Fix bug where scoring metadata would be overridden during an inplace upgrade. 2019-03-12 23:36:46 -05:00
Alex Dadgar
bc42873e07 Change types of weights on spread/affinity 2019-01-30 12:20:38 -08:00
Nick Ethier
80a04052b6 scheduler: fix NPE when deployment is nil, but placement is a canary 2019-01-28 20:22:59 -06:00
Alex Dadgar
8264f50c52 convert driver to device for device constraint/attributes 2019-01-23 10:58:45 -08:00