nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-06 02:15:43 +03:00

Author	SHA1	Message	Date
Andy Assareh	20bbdba041	Mesh Gateway doc enhancements (#11354) * Mesh Gateway doc enhancements 1. I believe this line should be corrected to add mesh as one of the choices 2. I found that we are not setting this meta, and it is a required element for wan federation. I believe it would be helpful and potentially time saving to note that right here.	2021-12-20 17:10:44 -05:00
Luiz Aoqui	55018bdfe6	docs: add v1.2.0 upgrade guide about Nomad UI ACL change for job details page (#11689)	2021-12-16 14:32:20 -05:00
Luiz Aoqui	15db86a6af	docs: add more references and examples to the `template` block (#11691)	2021-12-16 14:14:01 -05:00
Tim Gross	03ea7d1c17	cli: unhide advanced operator raft debugging commands (#11682) The `nomad operator raft` and `nomad operator snapshot state` subcommands for inspecting on-disk raft state were hidden and undocumented. Expose and document these so that advanced operators have support for these tools.	2021-12-16 10:32:11 -05:00
Tim Gross	97621ec3c5	`nomad eval list` command (#11675) Use the new filtering and pagination capabilities of the `Eval.List` RPC to provide filtering and pagination at the command line. Also includes note that `nomad eval status -json` is deprecated and will be replaced with a single evaluation view in a future version of Nomad.	2021-12-15 11:58:38 -05:00
Noel Quiles	608cdfc71d	website: Copy updates (#11677)	2021-12-14 16:35:21 -05:00
Tim Gross	35c22bcb6c	provide `-no-shutdown-delay` flag for job/alloc stop (#11596) Some operators use very long group/task `shutdown_delay` settings to safely drain network connections to their workloads after service deregistration. But during incident response, they may want to cause that drain to be skipped so they can quickly shed load. Provide a `-no-shutdown-delay` flag on the `nomad alloc stop` and `nomad job stop` commands that bypasses the delay. This sets a new desired transition state on the affected allocations that the allocation/task runner will identify during pre-kill on the client. Note (as documented here) that using this flag will almost always result in failed inbound network connections for workloads as the tasks will exit before clients receive updated service discovery information and won't be gracefully drained.	2021-12-13 14:54:53 -05:00
Kevin Wang	ddca508b0d	feat(website): extract `/plugins` `/tools` docs (#11584) Co-authored-by: Luiz Aoqui <luiz@hashicorp.com> Co-authored-by: Mike Nomitch <mnomitch@hashicorp.com>	2021-12-09 14:25:18 -05:00
Lukas W	bb1fc6ec5f	CLI: Return non-zero exit code when deployment fails in `nomad run` (#11550) * Exit non-zero from run command if deployment fails * Fix typo in deployment monitor introduced in `0edda11`	2021-12-09 09:09:28 -05:00
Tim Gross	95fa1b30f4	docs: improve docs for troubleshooting and monitoring scheduler (#11623) This changeset adds more specific recommendations as to what metrics to monitor, and what resources should be examined during incident response. It also renames the "Telemetry" section to "Monitoring Nomad" to surface the material better and distinguish it from the "Metric Reference". Co-authored-by: Charlie Voiselle <464492+angrycub@users.noreply.github.com>	2021-12-07 15:52:13 -05:00
James Rasell	d2132b96b4	docs: add license expiry metric to metrics website doc.	2021-12-07 10:31:51 +00:00
Shantanu Gadgil	d174a38206	mention `sysbatch` in addition to `batch` (#11587)	2021-12-06 19:12:03 -05:00
Tim Gross	2c3db7ee1d	scheduler: config option to reject job registration (#11610) During incident response, operators may find that automated processes elsewhere in the organization can be generating new workloads on Nomad clusters that are unable to handle the workload. This changeset adds a field to the `SchedulerConfiguration` API that causes all job registration calls to be rejected unless the request has a management ACL token.	2021-12-06 15:20:34 -05:00
Tim Gross	216d4f8644	ui: change Consul/Vault base URL field name (#11589) Give ourselves some room for extension in the UI configuration block by naming the field `ui_url`, which will let us have an `api_url`. Fix the template path to ensure we're getting the right value from the API.	2021-11-30 13:20:29 -05:00
Tim Gross	851ed6322f	docs: `mount_flags` takes a slice of strings (#11583) The `mount_flags` option takes a slice of strings, not a comma-separated string like the flags passed to `mount(8)`.	2021-11-29 10:07:34 -05:00
Luiz Aoqui	6ad6ad67fe	docs: document new Prometheus configuration for the Autoscaler APM plugin (#11562)	2021-11-24 17:37:35 -05:00
Luiz Aoqui	f9871fbae1	docs: add CLI and config docs for the Autoscaler policy source config (#11559)	2021-11-24 16:17:37 -05:00
Luiz Aoqui	9a0fee0a17	docs: add upgrade guide notes for Nomad 1.2.2 (#11567)	2021-11-24 14:24:20 -05:00
Tim Gross	1160817731	config: UI configuration block with Vault/Consul links (#11555) Add `ui` block to agent configuration to enable/disable the web UI and provide the web UI with links to Vault/Consul.	2021-11-24 11:20:02 -05:00
James Rasell	416b14ecef	Merge pull request #11535 from hashicorp/docs-vault-token docs: clarify vault.token only required on servers	2021-11-23 09:26:06 +01:00
James Rasell	80dcae7216	core: allow setting and propagation of eval priority on job de/registration (#11532) This change modifies the Nomad job register and deregister RPCs to accept an updated option set which includes eval priority. This param is optional and overrides the use of the job priority to set the eval priority. In order to ensure all evaluations as a result of the request use the same eval priority, the priority is shared to the allocReconciler and deploymentWatcher. This creates a new distinction between eval priority and job priority. The Nomad agent HTTP API has been modified to allow setting the eval priority on job update and delete. To keep consistency with the current v1 API, job update accepts this as a payload param; job delete accepts this as a query param. Any user supplied value is validated within the agent HTTP handler removing the need to pass invalid requests to the server. The register and deregister opts functions now allow for setting the eval priority on requests. The change includes a small change to the DeregisterOpts function which handles nil opts. This brings the function in line with the RegisterOpts.	2021-11-23 09:23:31 +01:00
Luiz Aoqui	9dd93990c5	Merge tag 'v1.2.1' into merge-release-1.2.1-branch Version 1.2.1	2021-11-22 10:47:04 -05:00
Tim Gross	40de248b94	qemu: add `args_allowlist` to sandbox VM command line inputs The QEMU driver allows arbitrary command line options, but many of these options give access to host resources that operators may not want to expose such as devices. Add an optional allowlist to the plugin configuration so that operators can limit the resources for QEMU.	2021-11-19 11:11:52 -05:00
Michael Schurter	5204aa7e46	docs: clarify vault.token only required on servers While it is clarified toward the bottom of this page, I've seen people go to great lengths to configure tokens for clients anyway, so I think it's worth noting on the parameter's docs as well.	2021-11-18 16:34:59 -08:00
Luiz Aoqui	18ce6caac7	docs: add note about the Nomad APM autoscaling plugin and scaling cluster to zero (#11494)	2021-11-16 11:58:26 -05:00
Luiz Aoqui	7cbdcd11cc	docs: remove mutual-exclusion between node class and datacenter in scaling policies (#11499)	2021-11-16 11:58:14 -05:00
kfenech1	6bbcb180f2	docs: `nomad.client.unallocated.memory` is in Megabytes not bytes (#11468)	2021-11-08 11:05:11 -05:00
Florian Apolloner	b52f42db9a	Added a `-hcl2-strict` flag to allow for lenient hcl variable parsing. (#11284) Co-authored-by: James Rasell <jrasell@hashicorp.com>	2021-11-04 16:33:09 +01:00
James Rasell	8662dd8335	Merge pull request #11333 from hashicorp/assareh-patch-1 exactly one of ingress, terminating, or mesh must be configured	2021-11-04 11:13:04 +01:00
Michael Schurter	ba7694855d	Merge pull request #11334 from hashicorp/f-chroot-skip-allocdir client: never embed alloc_dir in chroot	2021-11-03 16:48:09 -07:00
Luiz Aoqui	0e3cd604d2	docs: update podman driver documentation (#11300)	2021-11-03 11:07:44 -04:00
James Rasell	394cf3ce46	Merge pull request #11425 from hashicorp/b-add-timeout-consul-docs docs: document Consul timeout config parameter.	2021-11-02 15:28:34 +01:00
James Rasell	6daf5db3a9	docs: document Consul timeout config parameter.	2021-11-02 08:28:45 +01:00
James Rasell	35a6d76d4d	docs: update acl bootstrap command to show json and template opts.	2021-10-29 09:01:58 +02:00
Dave May	f46b97b2df	debug: update default node-id and docs (#11398) * debug: default node-id to all * debug: align cli help and website documentation	2021-10-27 13:43:56 -04:00
Mike Nomitch	3025ae6087	Replaces accidental use of Vault with Nomad (#11355)	2021-10-27 08:35:31 -07:00
Luiz Aoqui	1fbe88fbd6	docs: add note and example of storing `nomad job plan` index to disk (#11377)	2021-10-26 20:25:22 -04:00
Charlie Voiselle	dce23e829f	DOCS: Update Consul Connect to Consul service mesh (#11362) * Update Consul Connect to Consul service mesh * Apply suggestions from code review	2021-10-26 15:10:21 -04:00
Luiz Aoqui	8c799b3980	add dispatch idempotency token support in the CLI (#10930)	2021-10-22 12:39:05 -04:00
Luiz Aoqui	82a3ae7b40	cli: allow setting namespace and region in the `nomad ui` command (#11364)	2021-10-21 16:24:39 -04:00
Michael Schurter	37f053ff89	client: never embed alloc_dir in chroot

Fixes #2522

Skip embedding client.alloc_dir when building chroot. If a user configures a Nomad client agent so that the chroot_env will embed the client.alloc_dir, Nomad will happily infinitely recurse while building the chroot until something horrible happens. The best case scenario is the filesystem's path length limit is hit. The worst case scenario is disk space is exhausted.

A bad agent configuration will look something like this:

```hcl
data_dir = "/tmp/nomad-badagent"

client {
  enabled = true

  chroot_env {
    # Note that the source matches the data_dir
    "/tmp/nomad-badagent" = "/ohno"
    # ...
  }
}
```

Note that `/ohno/client` (the state_dir) will still be created but not `/ohno/alloc` (the alloc_dir).

While I cannot think of a good reason why someone would want to embed Nomad's client (and possibly server) directories in chroots, there should be no cause for harm. chroots are only built when Nomad runs as root, and Nomad disables running exec jobs as root by default. Therefore even if client state is copied into chroots, it will be inaccessible to tasks.

Skipping the `data_dir` and `{client,server}.state_dir` is possible, but this PR attempts to implement the minimum viable solution to reduce risk of unintended side effects or bugs.

When running tests as root in a vm without the fix, the following error occurs:

```
=== RUN   TestAllocDir_SkipAllocDir
    alloc_dir_test.go:520:
        Error Trace: alloc_dir_test.go:520
        Error:       Received unexpected error:
                     Couldn't create destination file /tmp/TestAllocDir_SkipAllocDir1457747331/001/nomad/test/testtask/nomad/test/testtask/.../nomad/test/testtask/secrets/.nomad-mount: open /tmp/TestAllocDir_SkipAllocDir1457747331/001/nomad/test/.../testtask/secrets/.nomad-mount: file name too long
        Test:        TestAllocDir_SkipAllocDir
--- FAIL: TestAllocDir_SkipAllocDir (22.76s)
```

Also removed unused Copy methods on AllocDir and TaskDir structs.

Thanks to @eveld for not letting me forget about this!	2021-10-18 09:22:01 -07:00
Andy Assareh	305cf571d4	exactly one of ingress, terminating, or mesh must be configured I believe mesh should be included in this statement; it was omitted.	2021-10-15 14:15:02 -07:00
Shishir Mahajan	479442e682	Add support for --init to docker driver. Signed-off-by: Shishir Mahajan <smahajan@roblox.com>	2021-10-15 12:53:25 -07:00
Luiz Aoqui	681eeca515	docs: update Nvidia device plugin as external (#11313)	2021-10-14 12:22:31 -04:00
Michael Schurter	6a0dede9b6	Merge pull request #11167 from a-zagaevskiy/master Support configurable dynamic port range	2021-10-13 16:47:38 -07:00
Jorge Marey	833247600b	Add os-nova nomad autoscaler repo link (#11277)	2021-10-12 17:04:58 -04:00
Dave May	f545ac1bc4	cli: Add nomad job allocs command (#11242)	2021-10-12 16:30:36 -04:00
Matt Mukerjee	0881b94201	Add FailoverHeartbeatTTL to config (#11127) FailoverHeartbeatTTL is the amount of time to wait after a server leader failure before considering reallocating client tasks. This TTL should be fairly long as the new server leader needs to rebuild the entire heartbeat map for the cluster. In deployments with a small number of machines, the default TTL (5m) may be unnecessarily long. Let's allow operators to configure this value in their config files.	2021-10-06 18:48:12 -04:00
Amit Shuster	215bf04bc6	Lightrun Integration - External task driver (#11203)	2021-10-06 15:34:34 -04:00
Yan	c21493a560	add `-show-url` option for `ui` command (#11213)	2021-10-05 20:08:42 -04:00
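
Several commits above add agent configuration options: the `ui` block with the `ui_url` field (1160817731, 216d4f8644), the QEMU `args_allowlist` (40de248b94), and the configurable FailoverHeartbeatTTL (0881b94201). The fragment below is a rough sketch of how these might sit together in one agent configuration file. It is not taken from the commits themselves. The nested `consul` and `vault` block names, the `failover_heartbeat_ttl` key spelling, the example URLs, and the allowlisted QEMU arguments are all assumptions for illustration and should be checked against the documentation for the Nomad release in use.

```hcl
# Illustrative agent configuration only; all values are placeholders.

server {
  enabled = true

  # Assumed key name for FailoverHeartbeatTTL. The commit message gives
  # 5m as the default, which may be longer than a small cluster needs.
  failover_heartbeat_ttl = "1m"
}

ui {
  # Enable or disable the web UI.
  enabled = true

  # Links shown in the web UI; the hostnames are hypothetical.
  consul {
    ui_url = "https://consul.example.com:8500/ui"
  }

  vault {
    ui_url = "https://vault.example.com:8200/ui"
  }
}

plugin "qemu" {
  config {
    # Only the QEMU command line options listed here may be passed by jobs.
    # The entries are examples, not a recommended set.
    args_allowlist = ["-drive", "-usbdevice"]
  }
}
```

The allowlist is optional according to the commit message, so leaving it out would presumably keep the earlier behaviour of accepting arbitrary options.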

292 Commits