Commit Graph

293 Commits

Author SHA1 Message Date
James Rasell
e34fa583f9 allow configuration of Docker hostnames in bridge mode (#11173)
Add a new hostname string parameter to the network block which
allows operators to specify the hostname of the network namespace.
Changing this causes a destructive update to the allocation and it
is omitted if empty from API responses. This parameter also supports
interpolation.

In order to have a hostname passed as a configuration param when
creating an allocation network, the CreateNetwork func of the
DriverNetworkManager interface needs to be updated. In order to
minimize the disruption of future changes, rather than add another
string func arg, the function now accepts a request struct along with
the allocID param. The struct has the hostname as a field.

The in-tree implementations of DriverNetworkManager.CreateNetwork
have been modified to account for the function signature change.
In updating for the change, the enhancement of adding hostnames to
network namespaces has also been added to the Docker driver, whilst
the default Linux manager does not current implement it.
2021-09-16 08:13:09 +02:00
Mahmood Ali
a4e9c11ef9 hcl: add new cores resources field 2021-07-26 11:45:10 -04:00
Mahmood Ali
09903f4454 support gateway mesh 2021-07-22 17:10:59 -04:00
Mahmood Ali
f5069abf02 add a test for envoy_dns_discovery_type 2021-07-22 16:06:48 -04:00
Mahmood Ali
f352490687 parse service check body 2021-07-22 16:06:48 -04:00
Mahmood Ali
dd6ee3f27b services OnUpdate 2021-07-22 16:06:48 -04:00
Mahmood Ali
25b480f7b9 hclv1: parse service upstreams 2021-07-22 16:06:48 -04:00
Mahmood Ali
8ab37d691f test parsing more service fields 2021-07-22 16:06:26 -04:00
Holt Wilkins
898c6c59a6 Enable parsing of terminating gateways 2021-06-30 05:34:16 +00:00
Mahmood Ali
d8e40600f6 Support disabling TCP checks for connect sidecar services 2021-05-07 12:10:26 -04:00
Mahmood Ali
18e1a599ed oversubscription: add memory_max to hclv1
Allow specifying the `memory_max` field in HCL under the resources block.

Though HCLv1 is deprecated, I've updated them to ease our testing.
2021-03-30 16:55:58 -04:00
Seth Hoenig
3621f38667 Merge pull request #10075 from mr-karan/fix/parse_service
fix: parse_service: return err instead of panic
2021-02-24 10:14:55 -06:00
Karan Sharma
1d2ba0ec5a fix: parse_service: return err instead of panic
Fixes https://github.com/hashicorp/nomad/issues/10072
2021-02-23 22:15:11 +05:30
Drew Bailey
24c0e3ccf5 address pr comments 2021-02-08 13:43:05 -05:00
Drew Bailey
74e7bbb7d2 e2e test for on_update service checks
check_restart not compatible with on_update=ignore

reword caveat
2021-02-08 08:32:40 -05:00
Drew Bailey
5c30148334 OnUpdate configuration for services and checks
Allow for readiness type checks by configuring nomad to ignore warnings
or errors reported by a service check. This allows the deployment to
progress and while Consul handles introducing the sercive into a
resource pool once the check passes.
2021-02-08 08:32:40 -05:00
Kris Hicks
85ed8ddd4f Add gosimple linter (#9590) 2020-12-09 11:05:18 -08:00
Kris Hicks
071f4c7596 Add gocritic to golangci-lint config (#9556) 2020-12-08 12:47:04 -08:00
Chris Baker
9e2eadc7e2 added new policy capabilities for recommendations API
state store: call-out to generic update of job recommendations from job update method
recommendations API work, and http endpoint errors for OSS
support for scaling polices in task block of job spec
add query filters for ScalingPolicy list endpoint
command: nomad scaling policy list: added -job and -type
2020-10-28 14:32:16 +00:00
Mahmood Ali
1f949100db hclv2 tests: test complex config configuration 2020-10-26 16:20:41 -04:00
Mahmood Ali
2a9232902c hclv1: tweak HCLv1 tests
This ensures that gatway ReadOnly key is tested.  Also, update the hclv1
test-fixtures to be hclv1 compliant.
2020-10-21 14:05:46 -04:00
Mahmood Ali
922b671719 api: parse service gateway name
Adding gateway name eases HCLv2 parsing. This field is only used for parsing the
job and is ignored for any other pruposes
2020-10-21 14:05:46 -04:00
Mahmood Ali
9fe74c90f7 Isolate the jobspec package from the rest of Nomad (#8815)
This eases adoption of the jobspec package by other projects (e.g. terraform nomad provider, Lavant). Either by consuming directy as a library (hopefully without having go mod import rest of nomad) or by copying the package without modification.

Ideally, this package will be published as an independent module. We aren't ready for that considering we'll be switching to HCLv2 "soon", but eitherway, this seems like a reasonable intermediate step if we choose to.
2020-09-03 06:34:04 -05:00
Seth Hoenig
36a743f19d consul/connect: remove envoy dns option from gateway proxy config 2020-08-24 09:11:55 -05:00
Seth Hoenig
9ffdeed904 consul/connect: add initial support for ingress gateways
This PR adds initial support for running Consul Connect Ingress Gateways (CIGs) in Nomad. These gateways are declared as part of a task group level service definition within the connect stanza.

```hcl
service {
  connect {
    gateway {
      proxy {
        // envoy proxy configuration
      }
      ingress {
        // ingress-gateway configuration entry
      }
    }
  }
}
```

A gateway can be run in `bridge` or `host` networking mode, with the caveat that host networking necessitates manually specifying the Envoy admin listener (which cannot be disabled) via the service port value.

Currently Envoy is the only supported gateway implementation in Consul, and Nomad only supports running Envoy as a gateway using the docker driver.

Aims to address #8294 and tangentially #8647
2020-08-21 16:21:54 -05:00
Seth Hoenig
6d7476cd3d consul: grrrrr hclfmt test resource file 2020-08-10 14:08:09 -05:00
Seth Hoenig
e664f9b69a consul: able to set pass/fail thresholds on consul service checks
This change adds the ability to set the fields `success_before_passing` and
`failures_before_critical` on Consul service check definitions. This is a
feature added to Consul v1.7.0 and later.
  https://www.consul.io/docs/agent/checks#success-failures-before-passing-critical

Nomad doesn't do much besides pass the fields through to Consul.

Fixes #6913
2020-08-10 14:08:09 -05:00
Drew Bailey
19810365f6 oss compoments for multi-vault namespaces
adds in oss components to support enterprise multi-vault namespace feature

upgrade specific doc on vault multi-namespaces

vault docs

update test to reflect new error
2020-07-24 10:14:59 -04:00
Chris Baker
b5f595a054 better testing of scaling parsing, fixed some broken tests by api
changes
2020-07-04 19:32:37 +00:00
Chris Baker
7f8176a188 changes to make sure that Max is present and valid, to improve error messages
* made api.Scaling.Max a pointer, so we can detect (and complain) when it is neglected
* added checks to HCL parsing that it is present
* when Scaling.Max is absent/invalid, don't return extraneous error messages during validation
* tweak to multiregion handling to ensure that the count is valid on the interpolated regional jobs

resolves #8355
2020-07-04 19:05:50 +00:00
Seth Hoenig
ef52328eb4 connect/native: doc and comment tweaks from PR 2020-06-24 10:13:22 -05:00
Seth Hoenig
520d35e085 consul/connect: split connect native flag and task in service 2020-06-23 10:22:22 -05:00
Seth Hoenig
7e8d5c2392 consul/connect: add support for running connect native tasks
This PR adds the capability of running Connect Native Tasks on Nomad,
particularly when TLS and ACLs are enabled on Consul.

The `connect` stanza now includes a `native` parameter, which can be
set to the name of task that backs the Connect Native Consul service.

There is a new Client configuration parameter for the `consul` stanza
called `share_ssl`. Like `allow_unauthenticated` the default value is
true, but recommended to be disabled in production environments. When
enabled, the Nomad Client's Consul TLS information is shared with
Connect Native tasks through the normal Consul environment variables.
This does NOT include auth or token information.

If Consul ACLs are enabled, Service Identity Tokens are automatically
and injected into the Connect Native task through the CONSUL_HTTP_TOKEN
environment variable.

Any of the automatically set environment variables can be overridden by
the Connect Native task using the `env` stanza.

Fixes #6083
2020-06-22 14:07:44 -05:00
Nick Ethier
ad8ced3873 multi-interface network support 2020-06-19 09:42:10 -04:00
Nick Ethier
e9ff8a8daa Task DNS Options (#7661)
Co-Authored-By: Tim Gross <tgross@hashicorp.com>
Co-Authored-By: Seth Hoenig <shoenig@hashicorp.com>
2020-06-18 11:01:31 -07:00
Tim Gross
45c2e875f8 multiregion: change AutoRevert to OnFailure 2020-06-17 11:05:45 -04:00
Drew Bailey
ce8f230cab Multiregion deploy status and job status CLI 2020-06-17 11:03:34 -04:00
Tim Gross
f64f5a645c Multiregion structs
Initial struct definitions, jobspec parsing, validation, and conversion
between Nomad structs and API structs for multi-region deployments.
2020-06-17 11:00:14 -04:00
Seth Hoenig
051e387d0c build: use hashicorp hclfmt
We have been using fatih/hclfmt which is long abandoned. Instead, switch
to HashiCorp's own hclfmt implementation. There are some trivial changes in
behavior around whitespace.
2020-05-24 18:31:57 -05:00
Lang Martin
cd6d34425f server: stop after client disconnect (#7939)
* jobspec, api: add stop_after_client_disconnect

* nomad/state/state_store: error message typo

* structs: alloc methods to support stop_after_client_disconnect

1. a global AllocStates to track status changes with timestamps. We
   need this to track the time at which the alloc became lost
   originally.

2. ShouldClientStop() and WaitClientStop() to actually do the math

* scheduler/reconcile_util: delayByStopAfterClientDisconnect

* scheduler/reconcile: use delayByStopAfterClientDisconnect

* scheduler/util: updateNonTerminalAllocsToLost comments

This was setup to only update allocs to lost if the DesiredStatus had
already been set by the scheduler. It seems like the intention was to
update the status from any non-terminal state, and not all lost allocs
have been marked stop or evict by now

* scheduler/testing: AssertEvalStatus just use require

* scheduler/generic_sched: don't create a blocked eval if delayed

* scheduler/generic_sched_test: several scheduling cases
2020-05-13 16:39:04 -04:00
Tim Gross
8af65c5707 ci: add a linting check for HCL files (#7791)
Running `make dev` runs `hclfmt`, but this isn't checked as part of
CI. That makes it possible to merge un-formatted HCL and Nomad
jobspecs that later will make for dirty git staging areas when
developers pull master.

This changeset adds HCL linting to the `make check` target.
2020-04-23 14:32:44 -04:00
Chris Baker
392ec3e697 return parsing error if scaling policy includes more than one policy block
also, check that parsing a minimal scaling block doesn't throw any errors
2020-04-23 12:37:45 +00:00
Seth Hoenig
a6cb4a04e5 Merge pull request #7690 from hashicorp/b-inspect-proxy-output
two fixes for inspect on connect proxy
2020-04-20 10:17:54 -06:00
Anthony Scalisi
e1287846ae fix spelling errors (#6985) 2020-04-20 09:28:19 -04:00
Seth Hoenig
dc11226763 jobspec: correctly parse proxy fields from jobspec
Before, the proxy stanza did not parse non-object fields
`local_service_port` and `local_service_address` from the
connect `proxy` stanza. This change fixes that.
2020-04-13 15:59:45 -06:00
Seth Hoenig
0eb2844900 connect: extract common task keys 2020-04-10 09:49:19 -06:00
Seth Hoenig
729931aced connect: enable configuring sidecar_task.name
Before, the submitted jobspec for sidecar_task would pass
through 2 key validation steps - once for the subset specific
to connect sidecar task definitions, and once again for the set
of normal task definition where the task would actually get
unmarshalled.

The valid keys for the normal task definition did not include
"name", which is supposed to be configurable for the sidecar
task. To fix this, just eliminate the double validation step,
and instead pass-in the correct set of keys to validate against
to the one generic task parser.

Fixes #7680
2020-04-09 21:01:16 -06:00
Tim Gross
245a4c067a hclfmt test fixtures (#7584) 2020-04-01 10:48:28 -04:00
Seth Hoenig
e63f13a0da connect: enable automatic expose paths for individual group service checks
Part of #6120

Building on the support for enabling connect proxy paths in #7323, this change
adds the ability to configure the 'service.check.expose' flag on group-level
service check definitions for services that are connect-enabled. This is a slight
deviation from the "magic" that Consul provides. With Consul, the 'expose' flag
exists on the connect.proxy stanza, which will then auto-generate expose paths
for every HTTP and gRPC service check associated with that connect-enabled
service.

A first attempt at providing similar magic for Nomad's Consul Connect integration
followed that pattern exactly, as seen in #7396. However, on reviewing the PR
we realized having the `expose` flag on the proxy stanza inseperably ties together
the automatic path generation with every HTTP/gRPC defined on the service. This
makes sense in Consul's context, because a service definition is reasonably
associated with a single "task". With Nomad's group level service definitions
however, there is a reasonable expectation that a service definition is more
abstractly representative of multiple services within the task group. In this
case, one would want to define checks of that service which concretely make HTTP
or gRPC requests to different underlying tasks. Such a model is not possible
with the course `proxy.expose` flag.

Instead, we now have the flag made available within the check definitions themselves.
By making the expose feature resolute to each check, it is possible to have
some HTTP/gRPC checks which make use of the envoy exposed paths, as well as
some HTTP/gRPC checks which make use of some orthongonal port-mapping to do
checks on some other task (or even some other bound port of the same task)
within the task group.

Given this example,

group "server-group" {
  network {
    mode = "bridge"
    port "forchecks" {
      to = -1
    }
  }

  service {
    name = "myserver"
    port = 2000

    connect {
      sidecar_service {
      }
    }

    check {
      name     = "mycheck-myserver"
      type     = "http"
      port     = "forchecks"
      interval = "3s"
      timeout  = "2s"
      method   = "GET"
      path     = "/classic/responder/health"
      expose   = true
    }
  }
}

Nomad will automatically inject (via job endpoint mutator) the
extrapolated expose path configuration, i.e.

expose {
  path {
    path            = "/classic/responder/health"
    protocol        = "http"
    local_path_port = 2000
    listener_port   = "forchecks"
  }
}

Documentation is coming in #7440 (needs updating, doing next)

Modifications to the `countdash` examples in https://github.com/hashicorp/demo-consul-101/pull/6
which will make the examples in the documentation actually runnable.

Will add some e2e tests based on the above when it becomes available.
2020-03-31 17:15:50 -06:00
Seth Hoenig
ee3b43e6c0 jobspec: parse multi expose.path instead of explicit slice 2020-03-31 17:15:27 -06:00