Prior to 2409f72 the code compared the modification index of a job to itself. Afterwards, the code compared the creation index of the job to itself. In either case there should never be a case of re-parenting of allocs causing the evaluation to trivially always result in false, which leads to unreclaimable memory.
Prior to this change allocations and evaluations for batch jobs were never garbage collected until the batch job was explicitly stopped. The new `batch_eval_gc_threshold` server configuration controls how often they are collected. The default threshold is `24h`.
When registering a job with a service and 'consul.allow_unauthenticated=false',
we scan the given Consul token for an acceptable policy or role with an
acceptable policy, but did not scan for an acceptable service identity (which
is backed by an acceptable virtual policy). This PR updates our consul token
validation to also accept a matching service identity when registering a service
into Consul.
Fixes#15902
Implement a metric for RPC requests with labels on the identity, so that
administrators can monitor the source of requests within the cluster. This
changeset demonstrates the change with the new `ACL.WhoAmI` RPC, and we'll wire
up the remaining RPCs once we've threaded the new pre-forwarding authentication
through the all.
Note that metrics are measured after we forward but before we return any
authentication error. This ensures that we only emit metrics on the server that
actually serves the request. We'll perform rate limiting at the same place.
Includes telemetry configuration to omit identity labels.
In #15417 we added a new `Authenticate` method to the server that returns an
`AuthenticatedIdentity` struct. This changeset implements this method for a
small number of RPC endpoints that together represent all the various ways in
which RPCs are sent, so that we can validate that we're happy with this
approach.
* Add config elements
* Wire in snapshot configuration to raft
* Add hot reload of raft config
* Add documentation for new raft settings
* Add changelog
This PR adds support for configuring `proxy.upstreams[].config` for
Consul Connect upstreams. This is an opaque config value to Nomad -
the data is passed directly to Consul and is unknown to Nomad.
The command line flag parsing and the HTTP header parsing for CSI secrets
incorrectly split at more than one '=' rune, making it impossible to use secrets
that included that rune.
In #15605 we fixed the bug where the presense of "stale" query parameter
was mean to imply stale, even if the value of the parameter was "false"
or malformed. In parsing, we missed the case where the slice of values
would be nil which lead to a failing test case that was missed because
CI didn't run against the original PR.
This PR removes usages of `consul/sdk/testutil/retry`, as part of the
ongoing effort to remove use of any non-API module from Consul.
There is one remanining usage in the helper/freeport package, but that
will get removed as part of #15589
* command: fixup job multi-stop test
This PR refactors the StopCommand test that runs 10 jobs and then
passes them all to one invokation of 'job stop'.
* test: swap use of assert for must
* test: cleanup job files we create
* command: fixup job stop failure tests
Now that JobStop works on concurrent jobs, the error messages are
different.
* cleanup: use multiple post scripts
This change add the RPC ACL binding rule handlers. These handlers
are responsible for the creation, updating, reading, and deletion
of binding rules.
The write handlers are feature gated so that they can only be used
when all federated servers are running the required version.
The HTTP API handlers and API SDK have also been added where
required. This allows the endpoints to be called from the API by users
and clients.
This PR is a continuation of #14917, where we missed the ipv6 cases.
Consul auto-inserts tagged_addresses for keys
- lan_ipv4
- wan_ipv4
- lan_ipv6
- wan_ipv6
even though the service registration coming from Nomad does not contain such
elements. When doing the differential between services Nomad expects to be
registered vs. the services actually registered into Consul, we must first
purge these automatically inserted tagged_addresses if they do not exist in
the Nomad view of the Consul service.
Currently CRUD code that operates on SSO auth methods does not return created or updated object upon creation/update. This is bad UX and inconsistent behavior compared to other ACL objects like roles, policies or tokens.
This PR fixes it.
Relates to #13120
This PR adds trace logging around the differential done between a Nomad service
registration and its corresponding Consul service registration, in an effort
to shed light on why a service registration request is being made.
During unusual outage recovery scenarios on large clusters, a backlog of
millions of evaluations can appear. In these cases, the `eval delete` command can
put excessive load on the cluster by listing large sets of evals to extract the
IDs and then sending larges batches of IDs. Although the command's batch size
was carefully tuned, we still need to be JSON deserialize, re-serialize to
MessagePack, send the log entries through raft, and get the FSM applied.
To improve performance of this recovery case, move the batching process into the
RPC handler and the state store. The design here is a little weird, so let's
look a the failed options first:
* A naive solution here would be to just send the filter as the raft request and
let the FSM apply delete the whole set in a single operation. Benchmarking with
1M evals on a 3 node cluster demonstrated this can block the FSM apply for
several minutes, which puts the cluster at risk if there's a leadership
failover (the barrier write can't be made while this apply is in-flight).
* A less naive but still bad solution would be to have the RPC handler filter
and paginate, and then hand a list of IDs to the existing raft log
entry. Benchmarks showed this blocked the FSM apply for 20-30s at a time and
took roughly an hour to complete.
Instead, we're filtering and paginating in the RPC handler to find a page token,
and then passing both the filter and page token in the raft log. The FSM apply
recreates the paginator using the filter and page token to get roughly the same
page of evaluations, which it then deletes. The pagination process is fairly
cheap (only abut 5% of the total FSM apply time), so counter-intuitively this
rework ends up being much faster. A benchmark of 1M evaluations showed this
blocked the FSM apply for 20-30ms at a time (typical for normal operations) and
completes in less than 4 minutes.
Note that, as with the existing design, this delete is not consistent: a new
evaluation inserted "behind" the cursor of the pagination will fail to be
deleted.