nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-01 16:05:42 +03:00

Author	SHA1	Message	Date
Chris Roberts	eeec603975	command: prevent early exit from graceful shutdown (#26023 ) While waiting for the agent to leave during a graceful shutdown the wait can be interrupted immediately if another signal is received. It is common that while waiting a `SIGPIPE` is received from journald causing the wait to end early. This results in the agent not finishing the leave process and reporting an error when the process has stopped. Instead of allowing any signal to interrupt the wait, the signal is checked for a `SIGPIPE` and if matched will continue waiting.	2025-06-12 08:56:55 -07:00
Juanadelacuesta	9288a3141a	func and docs: Use the config from the client and not from the agent that is already parsed. Add the breaking change to the release notes	2025-04-30 10:53:02 +02:00
Juanadelacuesta	949571e313	func: read the config from the agent, don't reparse	2025-04-24 05:01:53 +02:00
Juanadelacuesta	46343ee56e	func: use the client's configured drain deadline to calculate the graceful timeout when terminating an agent	2025-04-23 23:59:50 +02:00
Juanadelacuesta	c91f24681d	style: add changelog	2025-04-23 23:28:54 +02:00
Juana De La Cuesta	9778a31e29	Update command/agent/command.go Co-authored-by: Tim Gross <tgross@hashicorp.com>	2025-04-23 23:18:09 +02:00
Juana De La Cuesta	39b3d63172	Update command/agent/command.go Co-authored-by: Tim Gross <tgross@hashicorp.com>	2025-04-23 23:18:02 +02:00
Juana De La Cuesta	313f430fdd	Update command/agent/command.go Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2025-04-23 23:17:36 +02:00
Juanadelacuesta	b47b962439	fix: use a timer instead of a time.After to avoid memory leaks	2025-04-23 16:26:46 +02:00
Juanadelacuesta	38c27b7e7f	fix: make the distinction for os.Interrupt	2025-04-23 16:20:04 +02:00
Juanadelacuesta	adf038b495	fix: correct the logic for LeaveOnTerm or LeaveOnInt depending on the incoming signal	2025-04-23 16:03:12 +02:00
Juanadelacuesta	b375974bc3	style: add comments	2025-04-23 15:47:37 +02:00
Juanadelacuesta	c5c4272aee	func: force agent return if there is an error on reload	2025-04-23 15:14:48 +02:00
Arian van Putten	d28af58cbb	agent: implement sd-notify reload correctly (#25636) First of all, we should not send the unix time, but the monotonic time. Second of all, the RELOADING= and MONOTONIC_USEC= fields should be sent in a single message, not two separate messages. From the man page of [systemd.service](https://www.freedesktop.org/software/systemd/man/latest/systemd.service.html#Type=) > notification message via sd_notify(3) that contains the "RELOADING=1" field in combination with "MONOTONIC_USEC=" set to the current monotonic time (i.e. CLOCK_MONOTONIC in clock_gettime(2)) in μs, formatted as decimal string. [sd_notify](https://www.freedesktop.org/software/systemd/man/latest/sd_notify.html) now has code samples of the protocol to clarify. Without these changes, if you'd set Type=notify-reload on the agent's systemd unit, systemd would kill the service due to the service not responding to reload correctly.	2025-04-14 11:38:56 -04:00
Nikita Eliseev	76fb3eb9a1	rpc: added configuration for yamux session (#25466 ) Fixes: https://github.com/hashicorp/nomad/issues/25380	2025-04-02 10:58:23 -04:00
James Rasell	61b2b9d3d0	agent: Improve retry joiner code with small refactor. (#25422) The agent retry joiner implementation had different parameters to control its execution for agents running in server and client mode. The agent would set up individual joiners depending on the agent mode, making the object parameter overhead unnecessary. This change removes the excess configuration options for the joiner, reducing code complexity slightly and hopefully making future modifications in this area easier to make.	2025-03-18 15:55:52 +00:00
Michael Smithhisler	5c4d0e923d	consul: Remove legacy token based authentication workflow (#25217 )	2025-03-05 15:38:11 -05:00
James Rasell	7268053174	vault: Remove legacy token based authentication workflow. (#25155) The legacy workflow for Vault whereby servers were configured using a token to provide authentication to the Vault API has now been removed. This change also removes the workflow where servers were responsible for deriving Vault tokens for Nomad clients. The deprecated Vault config options used by the Nomad agent have all been removed except for "token" which is still in use by the Vault Transit keyring implementation. Job specification authors can no longer use the "vault.policies" parameter and should instead use "vault.role" when not using the default workload identity. --------- Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>	2025-02-28 07:40:02 +00:00
Matt Keeler	833e240597	Upgrade to using hashicorp/go-metrics@v0.5.4 (#24856) * Upgrade to using hashicorp/go-metrics@v0.5.4 This also requires bumping the dependencies for: * memberlist * serf * raft * raft-boltdb * (and indirectly hashicorp/mdns due to the memberlist or serf update) Unlike some other HashiCorp products, Nomad's root module is currently expected to be consumed by others. This means that it needs to be treated more like our libraries and upgrade to hashicorp/go-metrics by utilizing its compat packages. This allows those importing the root module to control the metrics module used via build tags.	2025-01-31 15:22:00 -05:00
Daniel Bennett	49c147bcd7	dynamic host volumes: change env vars, fixup auto-delete (#24943 ) * plugin env: DHV_HOST_PATH->DHV_VOLUMES_DIR * client config: host_volumes_dir * plugin env: add namespace+nodepool * only auto-delete after error saving client state on initial create	2025-01-27 10:36:53 -06:00
Michael Schurter	63dacd2d6e	update vault token warning from 1.9->1.10 (#24884 ) Fixes #24847	2025-01-17 10:56:06 -08:00
James Rasell	63ea13be77	agent: Ensure logger set up method is public. (#24886 ) This is needed by a Nomad Enterprise code path.	2025-01-17 13:47:06 +00:00
James Rasell	753f752cdd	agent: remove unused log filter and unrequired library. (#24873 ) The Nomad agent used a log filter to ensure logs were written at the expected level. Since the use of hclog this is not required, as hclog acts as the gate keeper and filter for logging. All log writers accept messages from hclog which has already done the filtering.	2025-01-17 07:51:27 +00:00
James Rasell	1ae9785f9b	agent: Fix a bug where all syslog lines are notice when using JSON (#24865 ) The agent syslog write handler was unable to handle JSON log lines correctly, meaning all syslog entries when using JSON log format showed as NOTICE level. This change adds a new handler to the Nomad agent which can parse JSON log lines and correctly understand the expected log level entry. The change also removes the use of a filter from the default log format handler. This is not needed as the logs are fed into the syslog handler via hclog, which is responsible for level filtering.	2025-01-16 07:23:08 +00:00
James Rasell	8d201a82fd	agent: Fixed a bug where syslog error messages were marked as notice. (#24820) The mapping between Nomad log level identifiers and syslog priorities did not handle the error level string correctly.	2025-01-15 08:02:53 +00:00
Charlie Voiselle	30ab8897d2	deps: Switch from mitchellh/cli to hashicorp/cli (#19321 ) Co-authored-by: James Rasell <jrasell@hashicorp.com>	2024-12-19 15:41:11 +00:00
Daniel Bennett	46a39560bb	dynamic host volumes: fingerprint client plugins (#24589 )	2024-12-19 09:25:54 -05:00
guifran001	1c44521543	client: Add a preferred address family option for network-interface (#23389 ) to prefer ipv4 or ipv6 when deducing IP from network interface Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>	2024-07-12 15:30:38 -05:00
Tim Gross	54fc146432	agent: add support for sdnotify protocol (#20528) Nomad agents expect to receive `SIGHUP` to reload their configuration. The signal handler for this is installed fairly late in agent startup, after the client or server components are up and running. This means that configuration management tools can potentially reload the configuration before the agent can handle it, causing the agent to crash. We don't want to allow configuration reload during client or server component startup, because it would significantly complicate initialization. Instead, we'll implement the systemd notify protocol. This causes systemd to block sending configuration reload signals until the agent is actually ready. Users can still bypass this by sending signals directly. Note that there are several Go libraries that implement the sdnotify protocol, but most are part of much larger projects which would create a lot of dependabot burden. The bits of the protocol we need are extremely simple to implement in just a couple of functions. For non-Linux or non-systemd Linux systems, this feature is a no-op. In future work we could potentially implement service notification for Windows as well. Fixes: https://github.com/hashicorp/nomad/issues/3885	2024-05-03 13:42:07 -04:00
James Rasell	facc3e8013	agent: allow configuration of in-memory telemetry sink. (#20166 ) This change adds configuration options for setting the in-memory telemetry sink collection and retention durations. This sink backs the metrics JSON API and previously had hard-coded default values. The new options are particularly useful when running development or debug environments, where metrics collection is desired at a fast and granular rate.	2024-03-25 15:00:18 +00:00
Seth Hoenig	05937ab75b	exec2: add client support for unveil filesystem isolation mode (#20115) * exec2: add client support for unveil filesystem isolation mode This PR adds support for a new filesystem isolation mode, "Unveil". The mode introduces an "alloc_mounts" directory where tasks have a user-owned directory structure whose entries are bind mounts into the real alloc directory structure. This enables a task driver to use landlock (and maybe the real unveil on openbsd one day) to isolate a task to the task owned directory structure, providing sandboxing. * actually create alloc-mounts-dir directory * fix doc strings about alloc mount dir paths	2024-03-13 08:24:17 -05:00
James Rasell	41555b6370	cli: Fix minor help formatting issue in agent command. (#19743 )	2024-01-17 12:18:00 +00:00
Mike Nomitch	31f4296826	Adds support for failures before warning to Consul service checks (#19336 ) Adds support for failures before warning and failures before critical to the automatically created Nomad client and server services in Consul	2023-12-14 11:33:31 -08:00
Luiz Aoqui	099ee06a60	Revert "deps: update go-metrics to v0.5.3 (#19190 )" (#19374 ) * Revert "deps: update go-metrics to v0.5.3 (#19190)" This reverts commit `ddb060d8b3`. * changelog: add entry for #19374	2023-12-08 08:46:55 -05:00
Luiz Aoqui	c624dc2121	config: fix loading Vault token from env var (#19349 ) The `defaultVault` variable is a pointer to the Vault configuration named `default`. Initially, this variable points to the Vault configuration that is used to load CLI flag values, but after those are merged with the default and config file values the pointer reference must be updated before mutating the config with environment variable values.	2023-12-07 11:56:53 -05:00
Luiz Aoqui	27d2ad1baf	cli: add `-dev-consul` and `-dev-vault` agent mode (#19327 ) The `-dev-consul` and `-dev-vault` flags add default identities and configuration to the Nomad agent to connect and use the workload identity integration with Consul and Vault.	2023-12-07 11:51:20 -05:00
Luiz Aoqui	ddb060d8b3	deps: update go-metrics to v0.5.3 (#19190 ) Update `go-metrics` to v0.5.3 to pick https://github.com/hashicorp/go-metrics/pull/146.	2023-11-28 12:37:57 -05:00
Adriano Caloiaro	f66eb83fc0	Add `go-netaddrs` support to `retry_join` (#18745 )	2023-11-15 10:07:18 -05:00
Tim Gross	9d075c44b2	config: remove old Vault/Consul config blocks from parser (#18997 ) Remove the now-unused original configuration blocks for Consul and Vault from the agent configuration parsing. When the agent needs to refer to a Consul or Vault block it will always be for a specific cluster for the task/service (or the default cluster for the agent's own use). This is third of three changesets for this work. Fixes: https://github.com/hashicorp/nomad/issues/18947 Ref: https://github.com/hashicorp/nomad/pull/18991 Ref: https://github.com/hashicorp/nomad/pull/18994	2023-11-08 09:30:08 -05:00
Tim Gross	8f8265fa6d	add deprecation warning for Vault/Consul token usage (#18863 ) Submitting a Consul or Vault token with a job is deprecated in Nomad 1.7 and intended for removal in Nomad 1.9. Add a deprecation warning to the CLI when the user passes in the appropriate flag or environment variable. Nomad agents will no longer need a Vault token when configured with workload identity, and we'll ignore Vault tokens in the agent config after Nomad 1.9. Log a warning at agent startup. Ref: https://github.com/hashicorp/nomad/issues/15617 Ref: https://github.com/hashicorp/nomad/issues/15618	2023-10-26 10:46:02 -04:00
Michael Schurter	a806363f6d	OpenID Configuration Discovery Endpoint (#18691 ) Added the [OIDC Discovery](https://openid.net/specs/openid-connect-discovery-1_0.html) `/.well-known/openid-configuration` endpoint to Nomad, but it is only enabled if the `server.oidc_issuer` parameter is set. Documented the parameter, but without a tutorial trying to actually _use_ this will be very hard. I intentionally did not use https://github.com/hashicorp/cap for the OIDC configuration struct because it's built to be a compliant OIDC provider. Nomad is not trying to be compliant initially because compliance to the spec does not guarantee it will actually satisfy the requirements of third parties. I want to avoid the problem where in an attempt to be standards compliant we ship configuration parameters that lock us in to a certain behavior that we end up regretting. I want to add parameters and behaviors as there's a demonstrable need. Users always have the escape hatch of providing their own OIDC configuration endpoint. Nomad just needs to know the Issuer so that the JWTs match the OIDC configuration. There's no reason the actual OIDC configuration JSON couldn't live in S3 and get served directly from there. Unlike JWKS the OIDC configuration should be static, or at least change very rarely. This PR is just the endpoint extracted from #18535. The `RS256` algorithm still needs to be added in hopes of supporting third parties such as [AWS IAM OIDC Provider](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_create_oidc.html). Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2023-10-20 17:11:41 -07:00
James Rasell	1ffdd576bb	agent: add config option to enable file and line log detail. (#18768 )	2023-10-16 15:59:16 +01:00
Tim Gross	5001bf4547	consul: use constant instead of "default" literal (#18611 ) Use the constant `structs.ConsulDefaultCluster` instead of the string literal "default", as we've done for Vault.	2023-09-28 16:50:21 -04:00
Luiz Aoqui	868aba57bb	vault: update identity name to start with `vault_` (#18591 ) * vault: update identity name to start with `vault_` In the original proposal, workload identities used to derive Vault tokens were expected to be called just `vault`. But in order to support multiple Vault clusters it is necessary to associate identities with specific Vault cluster configuration. This commit implements a new proposal to have Vault identities named as `vault_<cluster>`.	2023-09-27 15:53:28 -03:00
Juana De La Cuesta	124272c050	server: Add reporting option to agent (#18572 ) * func: add reporting option to agent * func: add test for merge and fix comments * Update config_ce.go * Update config_ce.go * Update config_ce.go * fix: add reporting config to default configuration and update to use must over require * Update command/agent/config_parse.go Co-authored-by: Luiz Aoqui <luiz@hashicorp.com> * Update nomad/structs/config/reporting.go Co-authored-by: Luiz Aoqui <luiz@hashicorp.com> * Update nomad/structs/config/reporting.go Co-authored-by: Luiz Aoqui <luiz@hashicorp.com> * style: rename license and reporting config * fix: use default function instead of empty struct --------- Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2023-09-27 00:11:32 +02:00
Tim Gross	3ee6c31241	ACLs: allow/deny/default config for Consul/Vault clusters by namespace (#18425 ) In Nomad Enterprise when multiple Vault/Consul clusters are configured, cluster admins can control access to clusters for jobs via namespace ACLs, similar to how we've done so for node pools. This changeset updates the ACL configuration structs, but doesn't wire them up.	2023-09-08 11:37:20 -04:00
Tim Gross	a8bad048b6	config: parsing support for multiple Consul clusters in agent config (#18255 ) Add the plumbing we need to accept multiple Consul clusters in Nomad agent configuration, to support upcoming Nomad Enterprise features. The `consul` blocks are differentiated by a new `name` field, and if the `name` is omitted it becomes the "default" Consul configuration. All blocks with the same name are merged together, as with the existing behavior. As with the `vault` block, we're still using HCL1 for parsing configuration and the `Decode` method doesn't parse multiple blocks differentiated only by a field name without a label. So we've had to add an extra parsing pass, similar to what we've done for HCL1 jobspecs. This also revealed a subtle bug in the `vault` block handling of extra keys when there are multiple `vault` blocks, which I've fixed here. For now, all existing consumers will use the "default" Consul configuration, so there's no user-facing behavior change in this changeset other than the contents of the agent self API. Ref: https://github.com/hashicorp/team-nomad/issues/404	2023-08-18 15:25:16 -04:00
Tim Gross	74b796e6d0	config: parsing support for multiple Vault clusters in agent config (#18224 ) Add the plumbing we need to accept multiple Vault clusters in Nomad agent configuration, to support upcoming Nomad Enterprise features. The `vault` blocks are differentiated by a new `name` field, and if the `name` is omitted it becomes the "default" Vault configuration. All blocks with the same name are merged together, as with the existing behavior. Unfortunately we're still using HCL1 for parsing configuration and the `Decode` method doesn't parse multiple blocks differentiated only by a field name without a label. So we've had to add an extra parsing pass, similar to what we've done for HCL1 jobspecs. For now, all existing consumers will use the "default" Vault configuration, so there's no user-facing behavior change in this changeset other than the contents of the agent self API. Ref: https://github.com/hashicorp/team-nomad/issues/404	2023-08-17 14:10:32 -04:00
hashicorp-copywrite[bot]	a9d61ea3fd	Update copyright file headers to BUSL-1.1	2023-08-10 17:27:29 -05:00
Luiz Aoqui	5db9e64cdd	node pool: node pool upsert on multiregion node register (#17503) When registering a node with a new node pool in a non-authoritative region we can't create the node pool because this new pool will not be replicated to other regions. This commit modifies the node registration logic to only allow automatic node pool creation in the authoritative region. In non-authoritative regions, the client is registered, but the node pool is not created. The client is kept in the `initializing` status until its node pool is created in the authoritative region and replicated to the client's region.	2023-06-13 11:28:28 -04:00


269 Commits