fingerprint: initial fingerprint of Vault/Consul should be periodic (#25102)

In #24526 we updated the Consul and Vault fingerprints so that they are no
longer periodic. This fixed a problem that cluster admins reported where rolling
updates of Vault or Consul would cause a thundering herd of fingerprint updates
across the whole cluster.

But if Consul/Vault is not available during the initial fingerprint, it will
never get fingerprinted again. This is challenging for cluster updates and black
starts because the implicit service startup ordering may require
reloads. Instead, have the fingerprinter run periodically but mark that it has
made its first successful fingerprint of all Consul/Vault clusters. At that
point, we can skip further periodic updates. The `Reload` method will reset the
mark and allow the subsequent fingerprint to run normally.
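
The mark-and-skip behavior described above can be sketched roughly as follows.
This is a minimal illustration, not the code in this commit: the
`initialFingerprinter` type, the `probeFunc` callback, and `errUnavailable`
are hypothetical names, and the probe stands in for whatever query the real
fingerprinter makes against each Consul/Vault cluster.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errUnavailable is a hypothetical sentinel used by the example probe.
var errUnavailable = errors.New("cluster unavailable")

// probeFunc is a hypothetical stand-in for querying one Consul or Vault cluster.
type probeFunc func(cluster string) error

// initialFingerprinter is an illustrative type (an assumption for this sketch).
// It is driven periodically but does no work once every cluster has been
// fingerprinted successfully in a single pass.
type initialFingerprinter struct {
	mu       sync.Mutex
	clusters []string
	probe    probeFunc
	done     bool // set after the first fully successful pass
}

func newInitialFingerprinter(clusters []string, probe probeFunc) *initialFingerprinter {
	return &initialFingerprinter{clusters: clusters, probe: probe}
}

// Fingerprint is meant to be called on every periodic tick. It reports whether
// it actually probed the clusters, plus the first probe error (if any).
func (f *initialFingerprinter) Fingerprint() (ran bool, err error) {
	f.mu.Lock()
	defer f.mu.Unlock()

	if f.done {
		return false, nil // already fingerprinted everything; skip this tick
	}
	for _, c := range f.clusters {
		if e := f.probe(c); e != nil && err == nil {
			err = e // remember the first failure but keep probing the rest
		}
	}
	if err == nil {
		f.done = true // every cluster answered, so stop doing periodic work
	}
	return true, err
}

// Reload clears the mark so the next tick fingerprints again.
func (f *initialFingerprinter) Reload() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.done = false
}

func main() {
	up := false
	f := newInitialFingerprinter([]string{"default"}, func(string) error {
		if !up {
			return errUnavailable
		}
		return nil
	})

	ran, err := f.Fingerprint() // cluster down: runs, fails, will retry
	fmt.Println("tick 1:", ran, err)

	up = true
	ran, err = f.Fingerprint() // cluster up: runs, succeeds, sets the mark
	fmt.Println("tick 2:", ran, err)

	ran, err = f.Fingerprint() // mark set: skipped
	fmt.Println("tick 3:", ran, err)

	f.Reload()
	ran, err = f.Fingerprint() // mark cleared by Reload: runs again
	fmt.Println("tick 4:", ran, err)
}
```

In this sketch a failed pass leaves the mark unset, so a client that starts
before Consul/Vault simply retries on the next tick rather than giving up
after the first attempt.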

Fixes: https://github.com/hashicorp/nomad/issues/25097
Ref: https://github.com/hashicorp/nomad/pull/24526
Ref: https://github.com/hashicorp/nomad/issues/24049
Tim Gross
2025-02-13 14:26:04 -05:00
committed by GitHub
parent c2298e0999
commit 8c57fd5eb0
7 changed files with 167 additions and 38 deletions

@@ -4,9 +4,8 @@ Documentation=https://nomadproject.io/docs/
Wants=network-online.target
After=network-online.target
-# When using Nomad with Consul it is not necessary to start Consul first. These
-# lines start Consul before Nomad as an optimization to avoid Nomad logging
-# that Consul is unavailable at startup.
+# When using Nomad with Consul you should start Consul first, so that running
+# allocations using Consul are restored correctly during startup.
#Wants=consul.service
#After=consul.service