Add a standalone section to the Consul integration docs showing how to configure
both the Consul agent and the workload to take advantage of Consul DNS. Include
a reference to the new transparent proxy feature as well.
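A rough sketch of what the new section could show; the addresses, ports, and group name here are placeholders (Consul's DNS interface defaults to port 8600):

```hcl
# Consul agent configuration: expose DNS on a well-known port and
# forward non-.consul queries to an upstream resolver (placeholder).
ports {
  dns = 53
}
recursors = ["172.16.0.53"]
```

```hcl
# Nomad job: point the workload's resolver at the Consul agent's DNS
# listener (placeholder address) so that .consul names resolve.
group "web" {
  network {
    mode = "bridge"

    dns {
      servers = ["172.16.0.10"]
    }
  }
}
```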
Fixes: https://github.com/hashicorp/nomad/issues/18305
The docs for ephemeral disk migration use the term "best effort" without
outlining the requirements or the cases under which the migration can
fail. Update the docs to make it obvious that ephemeral disk migration is
subject to data loss.
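For context, the relevant job spec block looks roughly like this (group name and size are placeholders); even with `migrate = true` the copy is best effort and data may be lost, for example when the source node is unreachable:

```hcl
group "cache" {
  ephemeral_disk {
    migrate = true  # best-effort copy of allocation data to the replacement allocation
    sticky  = true
    size    = 500   # MB
  }
}
```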
Fixes: https://github.com/hashicorp/nomad/issues/20355
When the `client.servers` block is parsed, we split the port from the
address. This does not correctly handle IPv6 addresses when they are in URL
format (wrapped in brackets), which we require to disambiguate the port and
address.
Fix the parser to correctly split out the port and handle a missing port value
for IPv6. Update the documentation to make the URL format requirement clear.
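For example, with the fix a client configuration like the following (placeholder addresses, default RPC port 4647) parses correctly:

```hcl
client {
  enabled = true

  # IPv6 addresses must be wrapped in brackets (URL format) so the port
  # can be disambiguated from the address; the port itself is optional.
  servers = [
    "[2001:db8::10]:4647",
    "[2001:db8::11]",
    "198.51.100.10:4647",
  ]
}
```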
Fixes: https://github.com/hashicorp/nomad/issues/20310
Update the service mesh integration docs to explain how Consul needs to be
configured for transparent proxy. Update the walkthrough to assume that
`transparent_proxy` mode is the best approach, and move the manually-configured
`upstreams` to a separate section for users who don't want to use Consul DNS.
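A minimal sketch of the direction the walkthrough takes, with placeholder service names; with `transparent_proxy` the workload reaches upstreams by Consul DNS name instead of through an `upstreams` block:

```hcl
service {
  name = "count-dashboard"
  port = "9002"

  connect {
    sidecar_service {
      proxy {
        # no upstreams block needed; the application dials
        # count-api.virtual.consul directly via Consul DNS
        transparent_proxy {}
      }
    }
  }
}
```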
Ref: https://github.com/hashicorp/nomad/pull/20175
Ref: https://github.com/hashicorp/nomad/pull/20241
This PR adds a job mutator which injects constraints on job task groups
that make use of bridge networking. Creating a bridge network makes use of the
CNI plugins: bridge, firewall, host-local, loopback, and portmap. Starting
with Nomad 1.5, these plugins are fingerprinted on each node, so when a job
needs them we can ensure it is scheduled only on nodes where they are
available.
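The injected constraint is roughly equivalent to adding something like the following for each plugin; this is a sketch assuming the fingerprinted attributes follow the `plugins.cni.version.<name>` pattern, and the exact operator and version bound are defined by the mutator:

```hcl
constraint {
  attribute = "${attr.plugins.cni.version.bridge}"
  operator  = "is_set"
}
```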
Versions of Nomad and Consul that were known not to be compatible are no longer
supported in general. Update the Consul compatibility matrix to match.
This change adds configuration options for setting the in-memory
telemetry sink collection and retention durations. This sink backs
the metrics JSON API and previously had hard-coded default values.
The new options are particularly useful when running development or
debug environments, where metrics collection is desired at a fast
and granular rate.
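Assuming the new option names are `in_memory_collection_interval` and `in_memory_retention_period`, a debug-friendly agent configuration might look like:

```hcl
telemetry {
  # sample the in-memory sink frequently and keep data longer than the
  # previous hard-coded defaults while debugging
  in_memory_collection_interval = "1s"
  in_memory_retention_period    = "10m"
}
```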
The `nomad operator debug` command saves a CPU profile for each interval, and
names these files based on the interval.
The same function takes a goroutine profile, heap profile, etc., but is missing
the logic to interpolate the file name with the interval. This results in the
operator debug command making potentially many expensive profile requests, and
then overwriting the data. Update the command to save every profile it scrapes,
and number them similarly to the existing CPU profile.
Additionally, the command flags `-pprof-interval` and `-pprof-duration` were
validated backwards, which meant we always coerced `-pprof-interval` to match
`-pprof-duration`, so only a single profile was ever taken at the start of the
bundle. Correct the check and change the defaults to more sensible values.
Fixes: https://github.com/hashicorp/nomad/issues/20151
Our documentation has a hidden assumption that users know that federation
replication requires ACLs to be enabled and bootstrapped. Add notes at some of
the places users are likely to look for it.
A separate follow-up PR to the federation tutorial should point to the ACL
multi-region tutorial as well.
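For example, the notes could point at a non-authoritative region's server configuration, which needs ACLs enabled and a replication token before federation replication works (the region name and token are placeholders):

```hcl
server {
  enabled              = true
  authoritative_region = "us-east"
}

acl {
  enabled           = true
  replication_token = "REPLACE_WITH_REPLICATION_TOKEN"
}
```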
Fixes: https://github.com/hashicorp/nomad/issues/20128
Add support for further configuring `gateway.ingress.service` blocks to bring
this block up-to-date with currently available Consul API fields (except for
namespace and admin partition, which will need to be handled in a separate
PR). These fields are sent to Consul as part of the job endpoint submission hook
for Connect gateways.
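A hedged sketch of the expanded block; the listener/service shape already exists, and the new pass-through fields shown here (`max_connections`, `max_pending_requests`, `max_concurrent_requests`) are assumptions about field names mirroring the Consul API:

```hcl
gateway {
  ingress {
    listener {
      port     = 8080
      protocol = "http"

      service {
        name  = "web"
        hosts = ["example.com"]

        # assumed new fields, forwarded to Consul on job submission
        max_connections         = 1024
        max_pending_requests    = 512
        max_concurrent_requests = 256
      }
    }
  }
}
```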
Co-authored-by: Horacio Monsalvo <horacio.monsalvo@southworks.com>
Metrics tools that "pull" metrics, such as Prometheus, have a configurable
interval for how frequently they scrape metrics. This should be greater or equal
to the Nomad `telemetry.collection_interval` to avoid re-scraping metrics that
cannot have been updated in that interval.
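For example, with the agent configured as in this sketch, a Prometheus `scrape_interval` of at least 10s avoids pulling unchanged data:

```hcl
telemetry {
  collection_interval        = "10s"
  prometheus_metrics         = true
  publish_allocation_metrics = true
  publish_node_metrics       = true
}
```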
Fixes: https://github.com/hashicorp/nomad/issues/20055
Nomad has always placed an extremely high priority on backward
compatibility. We have always aimed to support N-2 major releases and have
usually gone above and beyond that.
The new https://www.hashicorp.com/long-term-support policy also mentions
that N-2 is what we have always supported, so it's probably time for our
docs to reflect that reality.
CNI plugins may set DNS configuration, but this isn't threaded through to the
task configuration so that we can write it to the `/etc/resolv.conf` file as
needed. Add the `AllocNetworkStatus` to the alloc hook resources so it is
accessible from the taskrunner. Any DNS entries provided by the user will
override these values.
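For reference, a user-supplied `dns` block like the sketch below (placeholder values) still wins over anything the CNI plugin reports:

```hcl
group "app" {
  network {
    mode = "bridge"

    dns {
      servers  = ["10.0.0.53"]        # overrides CNI-provided nameservers
      searches = ["internal.example"]
    }
  }
}
```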
Fixes: https://github.com/hashicorp/nomad/issues/11102
This PR changes the example for the client config option "fingerprint.denylist"
to include all of the cloud environment fingerprinters. Each one makes an HTTP
request with a 2-second timeout to a metadata endpoint that does not exist when
you are not running in that particular cloud. When run serially on startup,
this results in an 8-second wait where nothing useful is happening.
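The updated example looks roughly like this, listing the cloud environment fingerprinters shipped with Nomad:

```hcl
client {
  options {
    # skip cloud metadata probes that each wait 2 seconds when the
    # node is not running in that cloud
    "fingerprint.denylist" = "env_aws,env_gce,env_azure,env_digitalocean"
  }
}
```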
Closes #16727
When Nomad is configured with `verify_https_client=false`, endpoints that
do not require an ACL token can be accessed without any other type of
authentication. Expand the docs to mention this effect.
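The configuration in question is the agent's `tls` block, for example:

```hcl
tls {
  http = true
  rpc  = true

  # with this set to false, HTTP endpoints that don't require an ACL
  # token accept requests without any client authentication
  verify_https_client = false
}
```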
Add a new configuration option to the task's `volume_mount` block to give fine-grained control over the SELinux "z" label.
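A hedged sketch of the new option, assuming the field is named `selinux_label`; the volume and destination are placeholders:

```hcl
volume_mount {
  volume        = "data"
  destination   = "/srv/data"
  selinux_label = "z"   # assumed field name for the SELinux shared-content label
}
```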
* Update website/content/docs/job-specification/volume_mount.mdx
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
* fix: typo
* func: make volume mount verification happen even on mounts with no volume
---------
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Script checks don't support Consul's `success_before_passing`, `failures_before_critical`, or `failures_before_warning` fields because script checks are run by Nomad, not by Consul.
The new `nomad setup vault -check` command can be used to retrieve
information about the changes required before a cluster is migrated from
the deprecated legacy authentication flow with Vault to use only
workload identities.
Even with the new workload identity-based flow, the Nomad servers still
need the `acl = "write"` permission in order to revoke service identity
tokens.
Some users with batch workloads or short-lived prestart tasks want to derive a
Vault token, use it, and then allow it to expire without requiring a constant
refresh. Add the `vault.allow_token_expiration` field, which works only with the
Workload Identity workflow and not the legacy workflow.
When set to true, this disables the client's renewal loop in the
`vault_hook`. When Vault revokes the token lease, the token will no longer be
valid. The client will also now automatically detect if the Vault auth
configuration does not allow renewals and will disable the renewal loop
automatically.
Note this should only be used when a secret is requested from Vault once at the
start of a task or in a short-lived prestart task. Long-running tasks should
never set `allow_token_expiration=true` if they obtain Vault secrets via
`template` blocks, as the Vault token will expire and the template runner will
continue to make failing requests to Vault until the `vault_retry` attempts are
exhausted.
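A sketch of the intended use in a short-lived prestart task; the task name, driver, and image are placeholders:

```hcl
task "fetch-secret" {
  driver = "docker"

  lifecycle {
    hook = "prestart"
  }

  vault {
    # the token is used once at startup and then allowed to lapse;
    # the client skips the renewal loop for this task
    allow_token_expiration = true
  }

  config {
    image = "example/seed-secrets:1.0"
  }
}
```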
Fixes: https://github.com/hashicorp/nomad/issues/8690
When transitioning from the legacy token-based workflow to the new JWT
workflow for Vault, the previous code would instantiate a no-op Vault client
if the server configuration had a `default_identity` block.
This no-op client returned an error when some of its operations were
called, such as `LookupToken` and `RevokeTokens`. The original intention
was that, in the new JWT workflow, none of these methods should be
called, so returning an error could help surface potential bugs.
But the `RevokeTokens` and `MarkForRevocation` methods _are_ called even
in the JWT flow. When a leadership transition happens, the new server
looks for unused Vault accessors from state and tries to revoke them.
Similarly, the `RevokeTokens` method is called every time the
`Node.UpdateStatus` and `Node.UpdateAlloc` RPCs are made by clients, as
the Nomad server tries to find unused Vault tokens for the node/alloc.
Since the new JWT flow does not require Nomad servers to contact Vault,
calls to `RevokeTokens` and `MarkForRevocation` cannot complete when no
Vault token is available, so this commit changes the logic to use the
no-op Vault client whenever no token is configured. It also updates the
no-op client so these methods no longer return an error but instead just
log, so operators are made aware that there are Vault tokens created by
Nomad that have not been force-expired.
When migrating an existing cluster to the new workload identity based
flow, Nomad operators must first upgrade the Nomad version without
removing any of the existing Vault configuration. Removing that
configuration too early can prevent Nomad servers from managing and
cleaning up existing Vault tokens during a leadership transition and
node or alloc updates.
Operators must also resubmit all jobs with a `vault` block so they are
updated with an `identity` for Vault. Skipping this step may cause
allocations to fail if their Vault token expires (if, for example, the
Nomad client stops running for longer than TTL/2) or if they are
rescheduled, since the new client will try to follow the legacy flow,
which will fail if the Nomad server configuration for Vault has already
been updated to remove the Vault address and token.
Add support for Consul Enterprise admin partitions. We added fingerprinting in
https://github.com/hashicorp/nomad/pull/19485. This PR adds a `consul.partition`
field. The expectation is that most users will create a mapping of Nomad node
pool to Consul admin partition. But we'll also create an implicit constraint for
the fingerprinted value.
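A sketch of the new field (the partition name is a placeholder); the fingerprinted value also becomes an implicit constraint:

```hcl
group "api" {
  consul {
    partition = "team-a"
  }
}
```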
Fixes: https://github.com/hashicorp/nomad/issues/13139