106 Commits

Author SHA1 Message Date
Daniel Bennett
04db81951f test: fix go 1.24 test complaints (#25346)
e.g. Error: nomad/leader_test.go:382:12: non-constant format string in call to (*testing.common).Fatalf
2025-03-11 11:01:39 -05:00
Michael Smithhisler
5c4d0e923d consul: Remove legacy token based authentication workflow (#25217) 2025-03-05 15:38:11 -05:00
Tim Gross
df67e74615 Consul: add preflight checks for Envoy bootstrap (#23381)
Nomad creates Consul ACL tokens and service registrations to support Consul
service mesh workloads, before bootstrapping the Envoy proxy. Nomad always talks
to the local Consul agent and never directly to the Consul servers. But the
local Consul agent talks to the Consul servers in stale consistency mode to
reduce load on the servers. This can result in the Nomad client making the Envoy
bootstrap request with a tokens or services that have not yet replicated to the
follower that the local client is connected to. This request gets a 404 on the
ACL token and that negative entry gets cached, preventing any retries from
succeeding.

To workaround this, we'll use a method described by our friends over on
`consul-k8s` where after creating the objects in Consul we try to read them from
the local agent in stale consistency mode (which prevents a failed read from
being cached). This cannot completely eliminate this source of error because
it's possible that Consul cluster replication is unhealthy at the time we need
it, but this should make Envoy bootstrap significantly more robust.

This changset adds preflight checks for the objects we create in Consul:
* We add a preflight check for ACL tokens after we login via via Workload
  Identity and in the function we use to derive tokens in the legacy
  workflow. We do this check early because we also want to use this token for
  registering group services in the allocrunner hooks.
* We add a preflight check for services right before we bootstrap Envoy in the
  taskrunner hook, so that we have time for our service client to batch updates
  to the local Consul agent in addition to the local agent sync.

We've added the timeouts to be configurable via node metadata rather than the
usual static configuration because for most cases, users should not need to
touch or even know these values are configurable; the configuration is mostly
available for testing.


Fixes: https://github.com/hashicorp/nomad/issues/9307
Fixes: https://github.com/hashicorp/nomad/issues/10451
Fixes: https://github.com/hashicorp/nomad/issues/20516

Ref: https://github.com/hashicorp/consul-k8s/pull/887
Ref: https://hashicorp.atlassian.net/browse/NET-10051
Ref: https://hashicorp.atlassian.net/browse/NET-9273
Follow-up: https://hashicorp.atlassian.net/browse/NET-10138
2024-06-27 10:15:37 -04:00
Tim Gross
140747240f consul: include admin partition in JWT login requests (#22226)
When logging into a JWT auth method, we need to explicitly supply the Consul
admin partition if the local Consul agent is in a partition. We can't derive
this from agent configuration because the Consul agent's configuration is
canonical, so instead we get the partition from the fingerprint (if
available). This changeset updates the Consul client constructor so that we
close over the partition from the fingerprint.

Ref: https://hashicorp.atlassian.net/browse/NET-9451
2024-05-29 16:31:09 -04:00
Luiz Aoqui
1a2d41d30b consul: refactor allocrunner consul hook (#19229)
Refactor the JWT token derivation logic to only take a single request
since it was only ever called with a map of length one.

The original implementation received multiple requets to match the
legacy flow, but but legacy flow requests were batched from the Nomad
client to the server, which doesn't happen for JWT. Each JWT request
goes directly from the Nomad client to the Consul agent, so there is no
batching involved.
2023-11-30 10:55:03 -05:00
Piotr Kazmierczak
6a98e45c53 client: add metadata to tokens requested by Consul client (#19196)
This way tokens created by Nomad workloads are easier to keep track of.
2023-11-28 16:09:31 +01:00
Piotr Kazmierczak
711da2e653 client: change Consul client interface (#19140)
DeriveSITokenWithJWT is a misleading method name, because it's used to derive
Consul ACL tokens for other purposes too.
2023-11-21 16:01:26 +01:00
Tim Gross
c7c3b3ae33 revoke Consul tokens obtained via WI when alloc stops (#19034)
Add a `Postrun` and `Destroy` hook to the allocrunner's `consul_hook` to ensure
that Consul tokens we've created via WI get revoked via the logout API when
we're done with them. Also add the logout to the `Prerun` hook if we've hit an
error.
2023-11-09 10:08:09 -05:00
Luiz Aoqui
6d4b62200b log: add Consul and Vault cluster name to output (#18817)
Ensure Consul and Vault loggers have the cluster name as an attribute to
help differentiate log source.
2023-10-20 14:03:56 -04:00
Piotr Kazmierczak
0410b8acea client: remove unnecessary debugging from consul client mock (#18807) 2023-10-19 16:23:42 +02:00
Piotr Kazmierczak
16d71582f6 client: consul_hook tests (#18780)
ref https://github.com/hashicorp/team-nomad/issues/404
2023-10-18 20:02:35 +02:00
Tim Gross
ac56855f07 consul: add multi-cluster support to client constructors (#18624)
When agents start, they create a shared Consul client that is then wrapped as
various interfaces for testability, and used in constructing the Nomad client
and server. The interfaces that support workload services (rather than the Nomad
agent itself) need to support multiple Consul clusters for Nomad
Enterprise. Update these interfaces to be factory functions that return the
Consul client for a given cluster name. Update the `ServiceClient` to split
workload updates between clusters by creating a wrapper around all the clients
that delegates to the cluster-specific `ServiceClient`.

Ref: https://github.com/hashicorp/team-nomad/issues/404
2023-10-17 13:46:49 -04:00
Piotr Kazmierczak
5dab41881b client: new consul_hook (#18557)
This PR introduces a new allocrunner-level consul_hook which iterates over
services and tasks, if their provider is consul, fetches consul tokens for all of
them, stores them in AllocHookResources and in task secret dirs.

Ref: hashicorp/team-nomad#404

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2023-09-29 17:41:48 +02:00
Piotr Kazmierczak
2fffb96604 client: new Consul client (#18370)
This PR introduces a new Consul client that returns SI tokens based on requests
that contain JWTs.
2023-09-05 20:55:36 +02:00
hashicorp-copywrite[bot]
2d35e32ec9 Update copyright file headers to BUSL-1.1 2023-08-10 17:27:15 -05:00
hashicorp-copywrite[bot]
f005448366 [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
James Rasell
d49cf2388a Merge branch 'main' into f-1.3-boogie-nights 2022-03-23 09:41:25 +01:00
Seth Hoenig
b242957990 ci: swap ci parallelization for unconstrained gomaxprocs 2022-03-15 12:58:52 -05:00
James Rasell
6e8f32a290 client: refactor common service registration objects from Consul.
This commit performs refactoring to pull out common service
registration objects into a new `client/serviceregistration`
package. This new package will form the base point for all
client specific service registration functionality.

The Consul specific implementation is not moved as it also
includes non-service registration implementations; this reduces
the blast radius of the changes as well.
2022-03-15 09:38:30 +01:00
Seth Hoenig
a97254fa20 consul: plubming for specifying consul namespace in job/group
This PR adds the common OSS changes for adding support for Consul Namespaces,
which is going to be a Nomad Enterprise feature. There is no new functionality
provided by this changeset and hopefully no new bugs.
2021-04-05 10:03:19 -06:00
Seth Hoenig
bdeb73cd2c consul/connect: dynamically select envoy sidecar at runtime
As newer versions of Consul are released, the minimum version of Envoy
it supports as a sidecar proxy also gets bumped. Starting with the upcoming
Consul v1.9.X series, Envoy v1.11.X will no longer be supported. Current
versions of Nomad hardcode a version of Envoy v1.11.2 to be used as the
default implementation of Connect sidecar proxy.

This PR introduces a change such that each Nomad Client will query its
local Consul for a list of Envoy proxies that it supports (https://github.com/hashicorp/consul/pull/8545)
and then launch the Connect sidecar proxy task using the latest supported version
of Envoy. If the `SupportedProxies` API component is not available from
Consul, Nomad will fallback to the old version of Envoy supported by old
versions of Consul.

Setting the meta configuration option `meta.connect.sidecar_image` or
setting the `connect.sidecar_task` stanza will take precedence as is
the current behavior for sidecar proxies.

Setting the meta configuration option `meta.connect.gateway_image`
will take precedence as is the current behavior for connect gateways.

`meta.connect.sidecar_image` and `meta.connect.gateway_image` may make
use of the special `${NOMAD_envoy_version}` variable interpolation, which
resolves to the newest version of Envoy supported by the Consul agent.

Addresses #8585 #7665
2020-10-13 09:14:12 -05:00
Seth Hoenig
0957c24646 docs: remove erroneous characters from comment 2020-03-30 13:26:48 -06:00
Seth Hoenig
7a7701a4eb consul: annotate Consul interfaces with ACLs 2020-03-30 10:17:28 -06:00
Seth Hoenig
d24d470775 comments: cleanup some leftover debug comments and such 2020-01-31 19:04:35 -06:00
Seth Hoenig
674ccaa122 nomad: proxy requests for Service Identity tokens between Clients and Consul
Nomad jobs may be configured with a TaskGroup which contains a Service
definition that is Consul Connect enabled. These service definitions end
up establishing a Consul Connect Proxy Task (e.g. envoy, by default). In
the case where Consul ACLs are enabled, a Service Identity token is required
for these tasks to run & connect, etc. This changeset enables the Nomad Server
to recieve RPC requests for the derivation of SI tokens on behalf of instances
of Consul Connect using Tasks. Those tokens are then relayed back to the
requesting Client, which then injects the tokens in the secrets directory of
the Task.
2020-01-31 19:03:53 -06:00
Seth Hoenig
f8666bb1f9 client: enable nomad client to request and set SI tokens for tasks
When a job is configured with Consul Connect aware tasks (i.e. sidecar),
the Nomad Client should be able to request from Consul (through Nomad Server)
Service Identity tokens specific to those tasks.
2020-01-31 19:03:38 -06:00
Drew Bailey
3b033b2ef5 allow only positive shutdown delay
more explicit test case, remove select statement
2019-12-16 11:38:30 -05:00
Nick Ethier
387b016ac4 client: improve group service stanza interpolation and check_re… (#6586)
* client: improve group service stanza interpolation and check_restart support

Interpolation can now be done on group service stanzas. Note that some task runtime specific information
that was previously available when the service was registered poststart of a task is no longer available.

The check_restart stanza for checks defined on group services will now properly restart the allocation upon
check failures if configured.
2019-11-18 13:04:01 -05:00
Tim Gross
40368d2c63 support script checks for task group services (#6197)
In Nomad prior to Consul Connect, all Consul checks work the same
except for Script checks. Because the Task being checked is running in
its own container namespaces, the check is executed by Nomad in the
Task's context. If the Script check passes, Nomad uses the TTL check
feature of Consul to update the check status. This means in order to
run a Script check, we need to know what Task to execute it in.

To support Consul Connect, we need Group Services, and these need to
be registered in Consul along with their checks. We could push the
Service down into the Task, but this doesn't work if someone wants to
associate a service with a task's ports, but do script checks in
another task in the allocation.

Because Nomad is handling the Script check and not Consul anyways,
this moves the script check handling into the task runner so that the
task runner can own the script check's configuration and
lifecycle. This will allow us to pass the group service check
configuration down into a task without associating the service itself
with the task.

When tasks are checked for script checks, we walk back through their
task group to see if there are script checks associated with the
task. If so, we'll spin off script check tasklets for them. The
group-level service and any restart behaviors it needs are entirely
encapsulated within the group service hook.
2019-09-03 15:09:04 -04:00
Michael Schurter
eeacb87f3b connect: register group services with Consul
Fixes #6042

Add new task group service hook for registering group services like
Connect-enabled services.

Does not yet support checks.
2019-08-20 12:25:10 -07:00
Michael Schurter
df0a6dc34e test: add some extra logging 2019-01-14 09:56:53 -08:00
Michael Schurter
796f0ca063 fix build errors post merges 2018-10-16 16:53:31 -07:00
Michael Schurter
d0842e7b00 test: cleanup mock consul service client
Updated to hclog.

It exposed fields that required an unexported lock to access. Created a
getter methodn instead. Only old allocrunner currently used this
feature.
2018-10-16 16:53:31 -07:00
Alex Dadgar
40d095fd1a agent + consul 2018-09-13 10:43:40 -07:00
Alex Dadgar
a62e412b88 Refactor - wip 2018-06-12 10:23:45 -07:00
Sean Chittenden
57c2c819e8 Move package client/consul/sync to command/agent/consul.
This has been done to allow the Server and Client to reuse the same
Syncer because the Agent may be running Client, Server, or both
simultaneously and we only want one Syncer object alive in the agent.
2016-06-10 15:54:39 -04:00
Sean Chittenden
e858928d68 Rename Syncer.SetServiceIdentifier to SetServiceRegPrefix()
This attribute isn't actually an identifier because it can represent
a collection of services.  Rename `serviceIdentifier` to
`serviceRegPrefix which more accurately conveys the intention of this
Syncer attribute.

While here, also rename `SetServiceIdentifier()` to `SetServiceRegPrefix()`
and `GenerateServiceIdentifier()` to `GenerateServicePrefix()`.
2016-06-10 15:54:39 -04:00
Sean Chittenden
74e691cab1 Change the API signature of Syncer.SyncServices().
SyncServices() immediately attempts to sync whatever information
the process has with Consul.  Previously this method would take an
argument of the exclusive list of services that should exist,
however this is not condusive to having a Nomad Client and Nomad
Server share the same consul.Syncer.
2016-06-10 15:54:39 -04:00
Sean Chittenden
cf8beb7ba9 Change the signature of the PeriodicCallback to return an error
I *KNEW* I should have done this when I wrote it, but didn't want to
go back and audit the handlers to include the appropriate return
handling, but now that the code is taking shape, make this change.
2016-06-10 15:54:39 -04:00
Sean Chittenden
a2081159b4 Rename structs.Services to structs.ConsulServices 2016-06-10 15:54:39 -04:00
Sean Chittenden
107fc1bb81 Rename createCheck() to createDelegatedCheck() for clarity 2016-06-10 15:54:39 -04:00
Sean Chittenden
1352f7f0e6 Change client/consul.NewSyncer() to accept a shutdown channel
In addition to the API changing, consul.Syncer can now be signaled
to shutdown via the Shutdown() method, which will call the Run()'ing
sync task to exit gracefully.
2016-06-10 15:54:39 -04:00
Sean Chittenden
86b5d318f8 Move const block to the top of the file.
Requested by: @dadgar
2016-06-10 15:50:11 -04:00
Sean Chittenden
07799b636a Nuke the last of the explicit types in favor of using language idioms 2016-06-10 15:50:11 -04:00
Sean Chittenden
6264a8eff6 Unused code wasn't as unused as I thought. Restore. 2016-06-10 15:50:11 -04:00
Sean Chittenden
b2357598ba Register the serf service with the Nomad server service.
This will be unused in this PR.
2016-06-10 15:50:11 -04:00
Sean Chittenden
bf4f0310b4 Remove unused function. 2016-06-10 15:50:11 -04:00
Sean Chittenden
3d22c22bf5 Remove types.ShutdownChannel and replace with chan struct{} 2016-06-10 15:50:11 -04:00
Sean Chittenden
4d47eedd58 Teach Client to reuse an Agent's consulSyncer.
"There can be only one."
2016-06-10 15:50:11 -04:00
Sean Chittenden
bc86e897ed Register two services each for clients and servers, http and rpc.
In order to give clients a fighting chance to talk to the right port,
differentiate RPC services from HTTP services by registering two
services with different tags.  This yields
`rpc.nomad-server.service.consul` and
`http.nomad-server.service.consul` which is immensely more useful to
clients attempting to bootstrap their world.
2016-06-10 15:50:11 -04:00