Commit Graph

67 Commits

Author SHA1 Message Date
Mahmood Ali
c0162fab35 move cstructs.DeviceNetwork to drivers pkg 2019-01-08 09:11:47 -05:00
Preetha Appan
3cf22d2903 Pass service metadata "external-source" for consul UI integration 2018-11-16 11:28:56 -06:00
Alex Dadgar
32f9da9e07 small fixes 2018-09-15 16:42:38 -07:00
Alex Dadgar
40d095fd1a agent + consul 2018-09-13 10:43:40 -07:00
Preetha Appan
a7668cd4ec Fix tests and move isClient to constructor 2018-06-01 15:59:53 -05:00
Preetha Appan
e27caadca6 Fix unnecessary deregistration in consul sync
This commit fixes an issue where if a nomad client and server shared the same consul instance, the server would deregister any services and checks registered by clients for running tasks.
2018-06-01 14:48:25 -05:00
Alex Dadgar
796e623dde Use Tags when CanaryTags isn't specified
This PR fixes a bug where we weren't defaulting to `tags` when
`canary_tags` was empty and adds documentation.
2018-05-23 13:07:47 -07:00
Michael Schurter
17c6eb8629 consul: support canary tags for services
Also refactor Consul ServiceClient to take a struct instead of a massive
set of arguments. Meant updating a lot of code but it should be far
easier to extend in the future as you will only need to update a single
struct instead of every single call site.

Adds an e2e test for canary tags.
2018-05-07 14:55:01 -05:00
Michael Schurter
905bef8f2d consul: make grpc checks more like http checks 2018-05-04 11:08:11 -07:00
Michael Schurter
93356e7d70 consul: initial grpc implementation
Needs to be more like http.
2018-05-04 11:08:11 -07:00
Michael Schurter
972e861410 consul: periodically reconcile services/checks
Periodically sync services and checks from Nomad to Consul. This is
mostly useful when testing with the Consul dev agent which does not
persist state across restarts. However, this is a reasonable safety
measure to prevent skew between Consul's state and Nomad's
services+checks.

Also modernized the test suite a bit.
2018-04-19 15:45:42 -07:00
Michael Schurter
9f50ab334c Replace Consul TLSSkipVerify handling
Instead of checking Consul's version on startup to see if it supports
TLSSkipVerify, assume that it does and only log in the job service
handler if we discover Consul does not support TLSSkipVerify.

The old code would break TLSSkipVerify support if Nomad started before
Consul (such as on system boot) as TLSSkipVerify would default to false
if Consul wasn't running. Since TLSSkipVerify has been supported since
Consul 0.7.2, it's safe to relax our handling.
2018-03-14 17:43:06 -07:00
Josh Soref
aa4bcb2cc9 spelling: supports 2018-03-11 19:00:11 +00:00
Josh Soref
41084f6d70 spelling: registrations 2018-03-11 18:40:53 +00:00
Josh Soref
84277e6d6a spelling: otherwise 2018-03-11 18:34:27 +00:00
Josh Soref
15305a29b3 spelling: asynchronously 2018-03-11 17:41:50 +00:00
Michael Schurter
7ae2291572 Improve invalid port error message for services
Related to #3681

If a user specifies an invalid port *label* when using
address_mode=driver they'll get an error message about the label being
an invalid number which is very confusing.

I also added a bunch of testing around Service.AddressMode validation
since I was concerned by the linked issue that there were cases I was
missing. Unfortunately when address_mode=driver is used there's only so
much validation that can be done as structs/structs.go validation never
peeks into the driver config which would be needed to verify the port
labels/map.
2018-01-18 15:35:24 -08:00
Michael Schurter
cde796162c Always advertise driver IP when in driver mode
Fixes #3681

When in drive address mode Nomad should always advertise the driver's IP
in Consul even when no network exists. This matches the 0.6 behavior.

When in host address mode Nomad advertises the alloc's network's IP if
one exists. Otherwise it lets Consul determine the IP.

I also added some much needed logging around Docker's network discovery.
2018-01-18 15:35:24 -08:00
Michael Schurter
a4449c84d7 Services should not require a port
Fixes #3673
2017-12-19 15:50:23 -08:00
Michael Schurter
d0a1ca7ee8 Use the Service.Hash() method in agent service ids
The allocID and taskName parameters are useless for agents, but it's
still nice to reuse the same hash method for agent and task services.
This brings in the lowercase mode for the agent hash as well.
2017-12-11 16:50:15 -08:00
Michael Schurter
6f124eba7d Be more defensive in port checks 2017-12-08 12:27:57 -08:00
Michael Schurter
afd5bca566 Move service hash logic to Service.Hash method 2017-12-08 12:03:43 -08:00
Michael Schurter
65bfbe54c8 Hash fields used in task service IDs
Fixes #3620

Previously we concatenated tags into task service IDs. This could break
deregistration of tag names that contained double //s like some Fabio
tags.

This change breaks service ID backward compatibility so on upgrade all
users services and checks will be removed and re-added with new IDs.

This change has the side effect of including all service fields in the
ID's hash, so we no longer have to track PortLabel and AddressMode
changes independently.
2017-12-08 12:03:43 -08:00
Michael Schurter
fa5faa5a09 Prevent using port 0 with address_mode=driver 2017-12-08 12:03:43 -08:00
Michael Schurter
b999358892 Validate port label for host address mode
Also skip getting an address for script checks which don't use them.

Fixed a weird invalid reserved port in a TaskRunner test helper as well
as a problem with our mock Alloc/Job. Hopefully the latter doesn't cause
other tests to fail, but we were referencing an invalid PortLabel and
just not catching it before.
2017-12-08 12:03:43 -08:00
Michael Schurter
25569282b9 Allow custom ports for services and checks
Fixes #3380

Adds address_mode to checks (but no auto) and allows services and checks
to set literal port numbers when using address_mode=driver.

This allows SDNs, overlays, etc to advertise internal and host addresses
as well as do checks against either.
2017-12-08 12:03:00 -08:00
Jens Herrmann
0fe9b07ac2 Fix typos in metric names. #3610 2017-12-01 15:24:14 +01:00
Michael Schurter
a52cb34ab8 Don't set Interval on TTL health checks 2017-10-16 17:35:47 -07:00
Alex Dadgar
a9e3a41407 Enable more linters 2017-09-26 15:26:33 -07:00
Michael Schurter
5cd1d57218 Watched -> TriggersRestart
Watched was a silly name
2017-09-14 16:48:39 -07:00
Michael Schurter
1608e59415 Add check watcher for restarting unhealthy tasks 2017-09-14 16:46:54 -07:00
Michael Schurter
947516405a Add Header and Method support for HTTP checks 2017-08-17 16:44:21 -07:00
Alex Dadgar
f49c026438 Merge pull request #2984 from hashicorp/b-tags
Fix alloc health with checks using interpolation
2017-08-10 13:07:25 -07:00
Alex Dadgar
4d323e14df Address comments 2017-08-10 13:07:08 -07:00
Alex Dadgar
3b300925a2 Fix alloc health with checks using interpolation
Fixes an issue in which the allocation health watcher was checking for
allocations health based on un-interpolated services and checks. Change
the interface for retrieving check information from Consul to retrieving
all registered services and checks by allocation. In the future this
will allow us to output nicer messages.

Fixes https://github.com/hashicorp/nomad/issues/2969
2017-08-07 16:27:08 -07:00
Luke Farnell
7a56971508 fixed all spelling mistakes for goreport 2017-08-07 17:13:05 -04:00
Michael Schurter
c075349a33 Use int32 for atomic ops to avoid alignment issues
From https://golang.org/pkg/sync/atomic/#pkg-note-BUG :

On both ARM and x86-32, it is the caller's responsibility to arrange for
64-bit alignment of 64-bit words accessed atomically. The first word in
a global variable or in an allocated struct or slice can be relied upon
to be 64-bit aligned.
2017-08-04 10:14:16 -07:00
Michael Schurter
e04267da4f Fix comment 2017-07-25 12:13:05 -07:00
Michael Schurter
4666d85a29 Use seen more conservatively 2017-07-24 16:48:40 -07:00
Michael Schurter
8e5d95df08 Always increment failures...
...as it's used in calculating the backoff
2017-07-24 15:37:53 -07:00
Michael Schurter
82ea86fb6f Track whether Consul has ever been seen
Need a way to squelch Consul operation errors on shutdown. If it's never
been seen don't log errors about deregs failing.
2017-07-24 12:12:02 -07:00
Michael Schurter
b01a00a7f3 Synchronously deregister agent on shutdown
Fixes #2891

Previously the agent services and checks were being asynchrously
deregistered on shutdown, so it was a race between the sync goroutine
deregistering them and Nomad shutting down.

This switches to synchronously deregister agent serivces and checks
which doesn't really have a downside since the sync goroutines retry
behavior doesn't help on shutdown anyway.
2017-07-24 11:40:37 -07:00
Michael Schurter
a83d6f996e Never remove unknown agent services
Fixes #2827

This is a tradeoff. The pro is that you can run separate client and
server agents on the same node and advertise both. The con is that if a
Nomad agent crashes and isn't restarted on that node in the same mode
its entry will not be cleaned up.

That con scenario seems far less likely to occur than the scenario on
the pro side, and even if we do leak an agent entry the checks will be
failing so nothing should attempt to use it.
2017-07-18 13:23:01 -07:00
Alex Dadgar
38e50eedb1 check id method name changed 2017-07-07 12:15:09 -07:00
Alex Dadgar
f72bbaa370 Client watches for allocation health using task state and Consul checks
This PR adds watching of allocation health at the client. The client can
watch for health based on the tasks running on time and also based on
the consul checks passing.
2017-07-07 12:10:04 -07:00
Michael Schurter
a464ca3287 Remove debug logging 2017-06-21 17:19:08 -07:00
Michael Schurter
4117c8b3a2 Fix Service.AddressMode changes during task updates 2017-06-21 17:19:08 -07:00
Michael Schurter
8a7df57227 Test driver network advertisement and checks 2017-06-21 17:19:08 -07:00
Michael Schurter
3fddb05fc8 Implement DriverNetwork and Service.AddressMode
Ideally DriverNetwork would be fully populated in Driver.Prestart, but
Docker doesn't assign the container's IP until you start the container.

However, it's important to setup the port env vars before calling
Driver.Start, so Prestart should populate that.
2017-06-21 17:19:08 -07:00
Michael Schurter
92535496b4 Properly interpolate services on updated tasks
Previously was interpolating the original task's services again.

Fixes #2180

Also fixes a slight memory leak in the new consul agent. Script check
handles weren't being deleted after cancellation.
2017-04-26 11:22:01 -07:00