Commit Graph

51 Commits

Author SHA1 Message Date
Michael Schurter
7ae2291572 Improve invalid port error message for services
Related to #3681

If a user specifies an invalid port *label* when using
address_mode=driver they'll get an error message about the label being
an invalid number which is very confusing.

I also added a bunch of testing around Service.AddressMode validation
since I was concerned by the linked issue that there were cases I was
missing. Unfortunately when address_mode=driver is used there's only so
much validation that can be done as structs/structs.go validation never
peeks into the driver config which would be needed to verify the port
labels/map.
2018-01-18 15:35:24 -08:00
Michael Schurter
cde796162c Always advertise driver IP when in driver mode
Fixes #3681

When in drive address mode Nomad should always advertise the driver's IP
in Consul even when no network exists. This matches the 0.6 behavior.

When in host address mode Nomad advertises the alloc's network's IP if
one exists. Otherwise it lets Consul determine the IP.

I also added some much needed logging around Docker's network discovery.
2018-01-18 15:35:24 -08:00
Michael Schurter
a4449c84d7 Services should not require a port
Fixes #3673
2017-12-19 15:50:23 -08:00
Michael Schurter
d0a1ca7ee8 Use the Service.Hash() method in agent service ids
The allocID and taskName parameters are useless for agents, but it's
still nice to reuse the same hash method for agent and task services.
This brings in the lowercase mode for the agent hash as well.
2017-12-11 16:50:15 -08:00
Michael Schurter
6f124eba7d Be more defensive in port checks 2017-12-08 12:27:57 -08:00
Michael Schurter
afd5bca566 Move service hash logic to Service.Hash method 2017-12-08 12:03:43 -08:00
Michael Schurter
65bfbe54c8 Hash fields used in task service IDs
Fixes #3620

Previously we concatenated tags into task service IDs. This could break
deregistration of tag names that contained double //s like some Fabio
tags.

This change breaks service ID backward compatibility so on upgrade all
users services and checks will be removed and re-added with new IDs.

This change has the side effect of including all service fields in the
ID's hash, so we no longer have to track PortLabel and AddressMode
changes independently.
2017-12-08 12:03:43 -08:00
Michael Schurter
fa5faa5a09 Prevent using port 0 with address_mode=driver 2017-12-08 12:03:43 -08:00
Michael Schurter
b999358892 Validate port label for host address mode
Also skip getting an address for script checks which don't use them.

Fixed a weird invalid reserved port in a TaskRunner test helper as well
as a problem with our mock Alloc/Job. Hopefully the latter doesn't cause
other tests to fail, but we were referencing an invalid PortLabel and
just not catching it before.
2017-12-08 12:03:43 -08:00
Michael Schurter
25569282b9 Allow custom ports for services and checks
Fixes #3380

Adds address_mode to checks (but no auto) and allows services and checks
to set literal port numbers when using address_mode=driver.

This allows SDNs, overlays, etc to advertise internal and host addresses
as well as do checks against either.
2017-12-08 12:03:00 -08:00
Jens Herrmann
0fe9b07ac2 Fix typos in metric names. #3610 2017-12-01 15:24:14 +01:00
Michael Schurter
a52cb34ab8 Don't set Interval on TTL health checks 2017-10-16 17:35:47 -07:00
Alex Dadgar
a9e3a41407 Enable more linters 2017-09-26 15:26:33 -07:00
Michael Schurter
5cd1d57218 Watched -> TriggersRestart
Watched was a silly name
2017-09-14 16:48:39 -07:00
Michael Schurter
1608e59415 Add check watcher for restarting unhealthy tasks 2017-09-14 16:46:54 -07:00
Michael Schurter
947516405a Add Header and Method support for HTTP checks 2017-08-17 16:44:21 -07:00
Alex Dadgar
f49c026438 Merge pull request #2984 from hashicorp/b-tags
Fix alloc health with checks using interpolation
2017-08-10 13:07:25 -07:00
Alex Dadgar
4d323e14df Address comments 2017-08-10 13:07:08 -07:00
Alex Dadgar
3b300925a2 Fix alloc health with checks using interpolation
Fixes an issue in which the allocation health watcher was checking for
allocations health based on un-interpolated services and checks. Change
the interface for retrieving check information from Consul to retrieving
all registered services and checks by allocation. In the future this
will allow us to output nicer messages.

Fixes https://github.com/hashicorp/nomad/issues/2969
2017-08-07 16:27:08 -07:00
Luke Farnell
7a56971508 fixed all spelling mistakes for goreport 2017-08-07 17:13:05 -04:00
Michael Schurter
c075349a33 Use int32 for atomic ops to avoid alignment issues
From https://golang.org/pkg/sync/atomic/#pkg-note-BUG :

On both ARM and x86-32, it is the caller's responsibility to arrange for
64-bit alignment of 64-bit words accessed atomically. The first word in
a global variable or in an allocated struct or slice can be relied upon
to be 64-bit aligned.
2017-08-04 10:14:16 -07:00
Michael Schurter
e04267da4f Fix comment 2017-07-25 12:13:05 -07:00
Michael Schurter
4666d85a29 Use seen more conservatively 2017-07-24 16:48:40 -07:00
Michael Schurter
8e5d95df08 Always increment failures...
...as it's used in calculating the backoff
2017-07-24 15:37:53 -07:00
Michael Schurter
82ea86fb6f Track whether Consul has ever been seen
Need a way to squelch Consul operation errors on shutdown. If it's never
been seen don't log errors about deregs failing.
2017-07-24 12:12:02 -07:00
Michael Schurter
b01a00a7f3 Synchronously deregister agent on shutdown
Fixes #2891

Previously the agent services and checks were being asynchrously
deregistered on shutdown, so it was a race between the sync goroutine
deregistering them and Nomad shutting down.

This switches to synchronously deregister agent serivces and checks
which doesn't really have a downside since the sync goroutines retry
behavior doesn't help on shutdown anyway.
2017-07-24 11:40:37 -07:00
Michael Schurter
a83d6f996e Never remove unknown agent services
Fixes #2827

This is a tradeoff. The pro is that you can run separate client and
server agents on the same node and advertise both. The con is that if a
Nomad agent crashes and isn't restarted on that node in the same mode
its entry will not be cleaned up.

That con scenario seems far less likely to occur than the scenario on
the pro side, and even if we do leak an agent entry the checks will be
failing so nothing should attempt to use it.
2017-07-18 13:23:01 -07:00
Alex Dadgar
38e50eedb1 check id method name changed 2017-07-07 12:15:09 -07:00
Alex Dadgar
f72bbaa370 Client watches for allocation health using task state and Consul checks
This PR adds watching of allocation health at the client. The client can
watch for health based on the tasks running on time and also based on
the consul checks passing.
2017-07-07 12:10:04 -07:00
Michael Schurter
a464ca3287 Remove debug logging 2017-06-21 17:19:08 -07:00
Michael Schurter
4117c8b3a2 Fix Service.AddressMode changes during task updates 2017-06-21 17:19:08 -07:00
Michael Schurter
8a7df57227 Test driver network advertisement and checks 2017-06-21 17:19:08 -07:00
Michael Schurter
3fddb05fc8 Implement DriverNetwork and Service.AddressMode
Ideally DriverNetwork would be fully populated in Driver.Prestart, but
Docker doesn't assign the container's IP until you start the container.

However, it's important to setup the port env vars before calling
Driver.Start, so Prestart should populate that.
2017-06-21 17:19:08 -07:00
Michael Schurter
92535496b4 Properly interpolate services on updated tasks
Previously was interpolating the original task's services again.

Fixes #2180

Also fixes a slight memory leak in the new consul agent. Script check
handles weren't being deleted after cancellation.
2017-04-26 11:22:01 -07:00
Michael Schurter
4cf34edb29 Skip checks with TLSSkipVerify if it's unsupported
Fixes #2218
2017-04-19 12:45:34 -07:00
Michael Schurter
346838381b Only register HTTPS agent check when Consul>=0.7.2
Support for TLSSkipVerify in other checks coming soon!
2017-04-19 12:42:48 -07:00
Michael Schurter
f0bec63174 Explain weird timer logic 2017-04-19 12:42:48 -07:00
Michael Schurter
de3d78365e Metricsify new Consul client 2017-04-19 12:42:48 -07:00
Michael Schurter
927b265854 Rework to account for ports not being in IDs
Previous implementation assumed all struct fields were included in
service and check IDs. Service IDs never include port labels and check
IDs *optionally* include port labels, so lots of things had to change.

Added a really big test to exercise this.
2017-04-19 12:42:47 -07:00
Michael Schurter
7de3adaf87 Remove commits return value
...and still protect against leaking agent entries in Consul on
shutdown.
2017-04-19 12:42:47 -07:00
Michael Schurter
5d75efc397 Explain PortLabel handling in RegisterAgent 2017-04-19 12:42:47 -07:00
Michael Schurter
48269c36b0 Plumb alloc id + task name into script check logs 2017-04-19 12:42:47 -07:00
Michael Schurter
80617a9c1a Stop being lazy and just type out struct{}{} 2017-04-19 12:42:47 -07:00
Michael Schurter
a3fc76b63a Fix comment to reflect reality 2017-04-19 12:42:47 -07:00
Michael Schurter
63a8307255 Move ScriptExecutor to driver 2017-04-19 12:42:47 -07:00
Michael Schurter
b4e40efa15 Fix shutdown when consul is down 2017-04-19 12:42:47 -07:00
Michael Schurter
80882f3e0e Switch ServiceClient to synchronizing state
Previously it applied a stream of operations. Reconciling state is less
complex and error prone at the cost of slightly higher CPU/memory usage.
2017-04-19 12:42:47 -07:00
Michael Schurter
b6937912d8 Add UpdateTask method instead of Remove/Add 2017-04-19 12:42:47 -07:00
Michael Schurter
7930d35925 Remove some lies 2017-04-19 12:42:47 -07:00
Michael Schurter
e13b149628 Remove unused syncInterval 2017-04-19 12:42:47 -07:00