nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-02 16:35:44 +03:00

Author	SHA1	Message	Date
Charlie Voiselle	8ba714e211	Return SchedulerConfig instead of SchedulerConfigResponse struct (#10799 )	2021-10-13 21:23:13 -04:00
Michael Schurter	6a0dede9b6	Merge pull request #11167 from a-zagaevskiy/master Support configurable dynamic port range	2021-10-13 16:47:38 -07:00
Michael Schurter	fc89835daf	client: improve errors & tests for dynamic ports	2021-10-13 16:25:25 -07:00
Luiz Aoqui	713094ffb7	wrap `log` messages with `hclog` (#11291 )	2021-10-12 14:38:44 -04:00
Aleksandr Zagaevskiy	0620bb04a5	fixup! Support configurable dynamic port range	2021-10-11 14:13:59 +03:00
Matt Mukerjee	0881b94201	Add FailoverHeartbeatTTL to config (#11127 ) FailoverHeartbeatTTL is the amount of time to wait after a server leader failure before considering reallocating client tasks. This TTL should be fairly long as the new server leader needs to rebuild the entire heartbeat map for the cluster. In deployments with a small number of machines, the default TTL (5m) may be unnecessary long. Let's allow operators to configure this value in their config files.	2021-10-06 18:48:12 -04:00
Luiz Aoqui	6cd62b3ef0	fix panic when Connect mesh gateway doesn't have a proxy block (#11257 ) Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2021-10-04 15:52:07 -04:00
Mahmood Ali	6c414cd5f9	gofmt all the files mostly to handle build directives in 1.17.	2021-10-01 10:14:28 -04:00
Michael Schurter	2968c01295	client: output reserved ports with min/max ports Also add a little more min/max port testing and add the consts back that had been removed: but unexported and as defaults.	2021-09-30 17:05:46 -07:00
Michael Schurter	33c91fd734	client: add NOMAD_LICENSE to default env deny list By default we should not expose the NOMAD_LICENSE environment variable to tasks. Also refactor where the DefaultEnvDenyList lives so we don't have to maintain 2 copies of it. Since client/config is the most obvious location, keep a reference there to its unfortunate home buried deep in command/agent/host. Since the agent uses this list as well for the /agent/host endpoint the list must be accessible from both command/agent and client.	2021-09-21 13:51:17 -07:00
James Rasell	e34fa583f9	allow configuration of Docker hostnames in bridge mode (#11173 ) Add a new hostname string parameter to the network block which allows operators to specify the hostname of the network namespace. Changing this causes a destructive update to the allocation and it is omitted if empty from API responses. This parameter also supports interpolation. In order to have a hostname passed as a configuration param when creating an allocation network, the CreateNetwork func of the DriverNetworkManager interface needs to be updated. In order to minimize the disruption of future changes, rather than add another string func arg, the function now accepts a request struct along with the allocID param. The struct has the hostname as a field. The in-tree implementations of DriverNetworkManager.CreateNetwork have been modified to account for the function signature change. In updating for the change, the enhancement of adding hostnames to network namespaces has also been added to the Docker driver, whilst the default Linux manager does not current implement it.	2021-09-16 08:13:09 +02:00
Aleksandr Zagaevskiy	e3b6f62198	Support configurable dynamic port range	2021-09-10 11:52:47 +03:00
James Rasell	3bffe443ac	chore: fix incorrect docstring formatting.	2021-08-30 11:08:12 +02:00
Luiz Aoqui	d74ab11d25	Don't timestamp active log file (#11070 ) * don't timestamp active log file * website: update log_file default value * changelog: add entry for #11070 * website: add upgrade instructions for log_file in v1.14 and v1.2.0	2021-08-23 11:27:34 -04:00
Mahmood Ali	fdb8684004	Merge pull request #9160 from hashicorp/f-sysbatch core: implement system batch scheduler	2021-08-16 09:30:24 -04:00
Michael Schurter	734f1d7eb6	Merge pull request #10848 from ggriffiths/listsnapshot_secrets CSI Listsnapshot secrets support	2021-08-10 15:59:33 -07:00
Seth Hoenig	61ee443ee6	core: implement system batch scheduler This PR implements a new "System Batch" scheduler type. Jobs can make use of this new scheduler by setting their type to 'sysbatch'. Like the name implies, sysbatch can be thought of as a hybrid between system and batch jobs - it is for running short lived jobs intended to run on every compatible node in the cluster. As with batch jobs, sysbatch jobs can also be periodic and/or parameterized dispatch jobs. A sysbatch job is considered complete when it has been run on all compatible nodes until reaching a terminal state (success or failed on retries). Feasibility and preemption are governed the same as with system jobs. In this PR, the update stanza is not yet supported. The update stanza is sill limited in functionality for the underlying system scheduler, and is not useful yet for sysbatch jobs. Further work in #4740 will improve support for the update stanza and deployments. Closes #2527	2021-08-03 10:30:47 -04:00
Mahmood Ali	52c37e16aa	Only initialize task.VolumeMounts when not-nil (#10990 ) 1.1.3 had a bug where task.VolumeMounts will be an empty slice instead of nil. Eventually, it gets canonicalized and is set to `nil`, but it seems to confuse dry-run planning. The regression was introduced in https://github.com/hashicorp/nomad/pull/10855/files#diff-56b3c82fcbc857f8fb93a903f1610f6e6859b3610a4eddf92bad9ea27fdc85ecL1028-R1037 . Curiously, it's the only place where `len(apiTask.VolumeMounts)` check was dropped. I assume it was dropped accidentally. Fixes #10981	2021-08-02 13:08:10 -04:00
Nomad Release bot	8c0c814099	Generate files for 1.1.3 release	2021-07-29 03:43:03 +00:00
Grant Griffiths	cba476eae6	CSI ListSnapshots secrets implementation Signed-off-by: Grant Griffiths <ggriffiths@purestorage.com>	2021-07-28 11:30:29 -07:00
Mahmood Ali	428174ab55	Merge pull request #10875 from hashicorp/b-namespace-flag-override cli: `-namespace` should override job namespace	2021-07-14 17:28:36 -04:00
Seth Hoenig	8ad8e1095a	consul/connect: remove sidecar proxy before removing parent service This PR will have Nomad de-register a sidecar proxy service before attempting to de-register the parent service. Otherwise, Consul will emit a warning and an error. Fixes #10845	2021-07-08 13:30:19 -05:00
Seth Hoenig	72f431f4c4	Merge pull request #10872 from hashicorp/b-cc-regex-checkids consul/connect: Avoid assumption of parent service when filtering connect proxies	2021-07-08 13:29:40 -05:00
Seth Hoenig	82f424585a	consul/connect: improve regex from CR suggestions	2021-07-08 13:05:05 -05:00
Tim Gross	38af39e1b2	cli: `-namespace` should override job namespace When a jobspec doesn't include a namespace, we provide it with the default namespace, but this ends up overriding the explicit `-namespace` flag. This changeset uses the same logic as region parsing to create an order of precedence: the query string parameter (the `-namespace` flag) overrides the API request body which overrides the jobspec.	2021-07-08 13:17:27 -04:00
Seth Hoenig	86283ad2dc	consul/connect: Avoid assumption of parent service when filtering connect proxies This PR uses regex-based matching for sidecar proxy services and checks when syncing with Consul. Previously we would check if the parent of the sidecar was still being tracked in Nomad. This is a false invariant - one which we must not depend when we make #10845 work. Fixes #10843	2021-07-08 09:43:41 -05:00
Mahmood Ali	0f0bcdb6ed	Merge pull request #10806 from hashicorp/munda/idempotent-job-dispatch Enforce idempotency of dispatched jobs using token on dispatch request	2021-07-08 10:23:31 -04:00
Tim Gross	b397edff4e	cni: respect default `cni_config_dir` and `cni_path` (#10870 ) The default agent configuration values were not set, which meant they were not being set in the client configuration and this results in fingerprints failing unless the values were set explicitly.	2021-07-08 09:56:57 -04:00
Alex Munda	2bd2f586f4	Set/parse idempotency_token query param	2021-07-07 16:26:55 -05:00
Seth Hoenig	421a6a8a7e	consul: avoid extra sync operations when no action required This PR makes it so the Consul sync logic will ignore operations that do not specify an action to take (i.e. [de-]register [services\|checks]). Ideally such noops would be discarded at the callsites (i.e. users of [Create\|Update\|Remove]Workload], but we can also be defensive at the commit point. Also adds 2 trace logging statements which are helpful for diagnosing sync operations with Consul - when they happen and why. Fixes #10797	2021-07-07 11:24:56 -05:00
Tim Gross	550fca9f83	csi: account for nil volume_mount in API-to-structs conversion (#10855 ) Fix a nil pointer in the API struct to `nomad/structs` conversion when a `volume_mount` block is empty.	2021-07-07 08:06:39 -04:00
Seth Hoenig	57fdb81433	consul: set task name only for group service checks This PR fixes a bug introduced in a refactoring https://github.com/hashicorp/nomad/pull/10764/files#diff-56b3c82fcbc857f8fb93a903f1610f6e6859b3610a4eddf92bad9ea27fdc85ec where task level service checks would inherent the task name field, when they shouldn't. Fixes #10781	2021-06-18 12:16:27 -05:00
Seth Hoenig	7ba60b4e33	consul/connect: in-place update service definition when connect upstreams are modified This PR fixes a bug where modifying the upstreams of a Connect sidecar proxy would not result Consul applying the changes, unless an additional change to the job would trigger a task replacement (thus replacing the service definition). The fix is to check if upstreams have been modified between Nomad's view of the sidecar service definition, and the service definition for the sidecar that is actually registered in Consul. Fixes #8754	2021-06-16 16:48:26 -05:00
Seth Hoenig	b4a631c1c5	consul: make failures_before_critical and success_before_passing work with group services This PR fixes some job submission plumbing to make sure the Consul Check parameters - failure_before_critical - success_before_passing work with group-level services. They already work with task-level services.	2021-06-15 11:20:40 -05:00
James Rasell	a7d055a584	Merge pull request #10744 from hashicorp/b-remove-duplicate-imports chore: remove duplicate import statements	2021-06-11 16:42:34 +02:00
James Rasell	a4156c3e94	tests: remove duplicate import statements.	2021-06-11 09:39:22 +02:00
Nomad Release bot	21465a592d	Generate files for 1.1.1 release	2021-06-10 08:04:25 -04:00
Mahmood Ali	122a4cb844	tests: use standard library testing.TB Glint pulled in an updated version of mitchellh/go-testing-interface which broke some existing tests because the update added a Parallel() method to testing.T. This switches to the standard library testing.TB which doesn't have a Parallel() method.	2021-06-09 16:18:45 -07:00
Isabel Suchanek	0edda116ad	cli: add monitor flag to deployment status Adding '-verbose' will print out the allocation information for the deployment. This also changes the job run command so that it now blocks until deployment is complete and adds timestamps to the output so that it's more in line with the output of node drain. This uses glint to print in place in running in a tty. Because glint doesn't yet support cmd/powershell, Windows workflows use a different library to print in place, which results in slightly different formatting: 1) different margins, and 2) no spinner indicating deployment in progress.	2021-06-09 16:18:45 -07:00
Seth Hoenig	9883b95ae4	consul: move consul acl tests into ent files (cherry-pick ent back to oss) This PR moves a lot of Consul ACL token validation tests into ent files, so that we can verify correct behavior difference between OSS and ENT Nomad versions.	2021-06-09 08:38:42 -05:00
Seth Hoenig	402b19c3b0	Merge pull request #10720 from hashicorp/f-cns-acl-check consul: correctly check consul acl token namespace when using consul oss	2021-06-08 15:43:42 -05:00
Seth Hoenig	09c9a17a7f	consul: correctly check consul acl token namespace when using consul oss This PR fixes the Nomad Object Namespace <-> Consul ACL Token relationship check when using Consul OSS (or Consul ENT without namespace support). Nomad v1.1.0 introduced a regression where Nomad would fail the validation when submitting Connect jobs and allow_unauthenticated set to true, with Consul OSS - because it would do the namespace check against the Consul ACL token assuming the "default" namespace, which does not work because Consul OSS does not have namespaces. Instead of making the bad assumption, expand the namespace check to handle each special case explicitly. Fixes #10718	2021-06-08 13:55:57 -05:00
Seth Hoenig	4b3ed53511	consul: pr cleanup namespace probe function signatures	2021-06-07 15:41:01 -05:00
Seth Hoenig	0bc8a33084	consul: probe consul namespace feature before using namespace api This PR changes Nomad's wrapper around the Consul NamespaceAPI so that it will detect if the Consul Namespaces feature is enabled before making a request to the Namespaces API. Namespaces are not enabled in Consul OSS, and require a suitable license to be used with Consul ENT. Previously Nomad would check for a 404 status code when makeing a request to the Namespaces API to "detect" if Consul OSS was being used. This does not work for Consul ENT with Namespaces disabled, which returns a 500. Now we avoid requesting the namespace API altogether if Consul is detected to be the OSS sku, or if the Namespaces feature is not licensed. Since Consul can be upgraded from OSS to ENT, or a new license applied, we cache the value for 1 minute, refreshing on demand if expired. Fixes https://github.com/hashicorp/nomad-enterprise/issues/575 Note that the ticket originally describes using attributes from https://github.com/hashicorp/nomad/issues/10688. This turns out not to be possible due to a chicken-egg situation between bootstrapping the agent and setting up the consul client. Also fun: the Consul fingerprinter creates its own Consul client, because there is no [currently] no way to pass the agent's client through the fingerprint factory.	2021-06-07 12:19:25 -05:00
Jasmine Dahilig	bdf2555b38	deployment query rate limit (#10706 )	2021-06-04 12:38:46 -07:00
Seth Hoenig	312161c5fc	consul/connect: add support for connect mesh gateways This PR implements first-class support for Nomad running Consul Connect Mesh Gateways. Mesh gateways enable services in the Connect mesh to make cross-DC connections via gateways, where each datacenter may not have full node interconnectivity. Consul docs with more information: https://www.consul.io/docs/connect/gateways/mesh-gateway The following group level service block can be used to establish a Connect mesh gateway. service { connect { gateway { mesh { // no configuration } } } } Services can make use of a mesh gateway by configuring so in their upstream blocks, e.g. service { connect { sidecar_service { proxy { upstreams { destination_name = "<service>" local_bind_port = <port> datacenter = "<datacenter>" mesh_gateway { mode = "<mode>" } } } } } } Typical use of a mesh gateway is to create a bridge between datacenters. A mesh gateway should then be configured with a service port that is mapped from a host_network configured on a WAN interface in Nomad agent config, e.g. client { host_network "public" { interface = "eth1" } } Create a port mapping in the group.network block for use by the mesh gateway service from the public host_network, e.g. network { mode = "bridge" port "mesh_wan" { host_network = "public" } } Use this port label for the service.port of the mesh gateway, e.g. service { name = "mesh-gateway" port = "mesh_wan" connect { gateway { mesh {} } } } Currently Envoy is the only supported gateway implementation in Consul. By default Nomad client will run the latest official Envoy docker image supported by the local Consul agent. The Envoy task can be customized by setting `meta.connect.gateway_image` in agent config or by setting the `connect.sidecar_task` block. Gateways require Consul 1.8.0+, enforced by the Nomad scheduler. Closes #9446	2021-06-04 08:24:49 -05:00
Mahmood Ali	ab4b42f4f4	exec: http: close websocket connection gracefully In this loop, we ought to close the websocket connection gracefully when the StreamErrWrapper reaches EOF. Previously, it's possible that that we drop the last few events or skip sending the websocket closure. If `handler(handlerPipe)` returns and `cancel` is called, before the loop here completes processing streaming events, the loop exits prematurely without propagating the last few events. Instead here, the loop continues until we hit `httpPipe` EOF (through `decoder.Decode`), to ensure we process the events to completion.	2021-05-24 13:37:23 -04:00
Tim Gross	4f57e06957	agent: surface websocket errors in logs The websocket interface used for `alloc exec` has to silently drop client send errors because otherwise those errors would interleave with the streamed output. But we may be able to surface errors that cause terminated websockets a little better in the HTTP server logs.	2021-05-24 09:46:45 -04:00
Nomad Release bot	c16a285054	Generate files for 1.1.0-rc1 release	2021-05-12 22:43:48 +00:00
Chris Baker	140e7b3aaa	Node Drain Metadata (#10250 )	2021-05-07 13:58:40 -04:00

1 2 3 4 5 ...

1863 Commits