nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-06 18:35:44 +03:00

Author	SHA1	Message	Date
Michael Smithhisler	5c4d0e923d	consul: Remove legacy token based authentication workflow (#25217 )	2025-03-05 15:38:11 -05:00
Michael Smithhisler	f2b761f17c	disconnected: removes deprecated disconnect fields (#25284 ) The group level fields stop_after_client_disconnect, max_client_disconnect, and prevent_reschedule_on_lost were deprecated in Nomad 1.8 and replaced by field in the disconnect block. This change removes any logic related to those deprecated fields. --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2025-03-05 14:46:02 -05:00
James Rasell	7268053174	vault: Remove legacy token based authentication workflow. (#25155 ) The legacy workflow for Vault whereby servers were configured using a token to provide authentication to the Vault API has now been removed. This change also removes the workflow where servers were responsible for deriving Vault tokens for Nomad clients. The deprecated Vault config options used byi the Nomad agent have all been removed except for "token" which is still in use by the Vault Transit keyring implementation. Job specification authors can no longer use the "vault.policies" parameter and should instead use "vault.role" when not using the default workload identity. --------- Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>	2025-02-28 07:40:02 +00:00
Daniel Bennett	a036b75aef	api: new dispatch endpoint sends body as Payload (#24381 ) this opens up dispatching parameterized jobs by systems that do not allow modifying what http request body they send e.g. these two things are equal: POST '{"Payload": "'"$(base64 <<< "hello")"'"}' /v1/job/my-job/dispatch POST 'hello' /v1/job/my-job/dispatch/payload	2024-11-07 10:12:29 -06:00
Martijn Vegter	6236f354a5	consul: add support for service weight (#24186 )	2024-10-25 11:21:38 -04:00
Seth Hoenig	f1ce127524	jobspec: add a chown option to artifact block (#24157 ) * jobspec: add a chown option to artifact block This PR adds a boolean 'chown' field to the artifact block. It indicates whether the Nomad client should chown the downloaded files and directories to be owned by the task.user. This is useful for drivers like raw_exec and exec2 which are subject to the host filesystem user permissions structure. Before, these drivers might not be able to use or manage the downloaded artifacts since they would be owned by the root user on a typical Nomad client configuration. * api: no need for pointer of chown field	2024-10-11 11:30:27 -05:00
Phil Renaud	e206993d49	Feature: Golden Versions (#24055 ) * TaggedVersion information in structs, rather than job_endpoint (#23841) * TaggedVersion information in structs, rather than job_endpoint * Test for taggedVersion description length * Some API plumbing * Tag and Untag job versions (#23863) * Tag and Untag at API level on down, but am I unblocking the wrong thing? * Code and comment cleanup * Unset methods generally now I stare long into the namespace abyss * Namespace passes through with QueryOptions removed from a write requesting struct * Comment and PR review cleanup * Version back to VersionStr * Generally consolidate unset logic into apply for version tagging * Addressed some PR comments * Auth check and RPC forwarding * uint64 instead of pointer for job version after api layer and renamed copy * job tag command split into apply and unset * latest-version convenience handling moved to CLI command level * CLI tests for tagging/untagging * UI parts removed * Add to job table when unsetting job tag on latest version * Vestigial no more * Compare versions by name and version number with the nomad history command (#23889) * First pass at passing a tagname and/or diff version to plan/versions requests * versions API now takes compare_to flags * Job history command output can have tag names and descriptions * compare_to to diff-tag and diff-version, plus adding flags to history command * 0th version now shows a diff if a specific diff target is requested * Addressing some PR comments * Simplify the diff-appending part of jobVersions and hide None-type diffs from CLI * Remove the diff-tag and diff-version parts of nomad job plan, with an eye toward making them a new top-level CLI command soon * Version diff tests * re-implement JobVersionByTagName * Test mods and simplification * Documentation for nomad job history additions * Prevent pruning and reaping of TaggedVersion jobs (#23983) tagged versions should not count against JobTrackedVersions i.e. new job versions being inserted should not evict tagged versions and GC should not delete a job if any of its versions are tagged Co-authored-by: Daniel Bennett <dbennett@hashicorp.com> --------- Co-authored-by: Daniel Bennett <dbennett@hashicorp.com> * [ui] Version Tags on the job versions page (#24013) * Timeline styles and their buttons modernized, and tags added * styled but not yet functional version blocks * Rough pass at edit/unedit UX * Styles consolidated * better UX around version tag crud, plus adapter and serializers * Mirage and acceptance tests * Modify percy to not show time-based things --------- Co-authored-by: Daniel Bennett <dbennett@hashicorp.com> * Job revert command and API endpoint can take a string version tag name (#24059) * Job revert command and API endpoint can take a string version tag name * RevertOpts as a signature-modified alternative to Revert() * job revert CLI test * Version pointers in endpoint tests * Dont copy over the tag when a job is reverted to a version with a tag * Convert tag name to version number at CLI level * Client method for version lookup by tag * No longer double-declaring client * [ui] Add tag filter to the job versions page (#24064) * Rough pass at the UI for version diff dropdown * Cleanup and diff fetching via adapter method * TaggedVersion now VersionTag (#24066) --------- Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>	2024-09-25 19:59:16 -04:00
nicoche	ffcb72bfe3	api: Add Notes field to service checks (#22397 ) Co-authored-by: Piotr Kazmierczak <470696+pkazmierczak@users.noreply.github.com>	2024-06-10 16:59:49 +02:00
Phil Renaud	e8b77fcfa0	[ui] Jobspec UI block: Descriptions and Links (#18292 ) * Hacky but shows links and desc * markdown * Small pre-test cleanup * Test for UI description and link rendering * JSON jobspec docs and variable example job get UI block * Jobspec documentation for UI block * Description and links moved into the Title component and made into Helios components * Marked version upgrade * Allow links without a description and max description to 1000 chars * Node 18 for setup-js * markdown sanitization * Ui to UI and docs change * Canonicalize, copy and diff for job.ui * UI block added to testJob for structs testing * diff test * Remove redundant reset * For readability, changing the receiving pointer of copied job variables * TestUI endpiont conversion tests * -require +must * Nil check on Links * JobUIConfig.Links as pointer --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2024-05-22 15:00:45 -04:00
Tim Gross	c9fd93c772	connect: support `volume_mount` blocks for sidecar task overrides (#20575 ) Users can override the default sidecar task for Connect workloads. This sidecar task might need access to certificate stores on the host. Allow adding the `volume_mount` block to the sidecar task override. Also fixes a bug where `volume_mount` blocks would not appear in plan diff outputs. Fixes: https://github.com/hashicorp/nomad/issues/19786	2024-05-14 12:49:37 -04:00
Tim Gross	bdf3ff301e	jobspec: add support for destination partition to `upstream` block (#20167 ) Adds support for specifying a destination Consul admin partition in the `upstream` block. Fixes: https://github.com/hashicorp/nomad/issues/19785	2024-03-22 16:15:22 -04:00
Tim Gross	10dd738a03	jobspec: update `gateway.ingress.service` Consul API fields (#20176 ) Add support for further configuring `gateway.ingress.service` blocks to bring this block up-to-date with currently available Consul API fields (except for namespace and admin partition, which will need be handled under a different PR). These fields are sent to Consul as part of the job endpoint submission hook for Connect gateways. Co-authored-by: Horacio Monsalvo <horacio.monsalvo@southworks.com>	2024-03-22 13:50:48 -04:00
Juana De La Cuesta	20cfbc82d3	Introduces `Disconnect` block into the `TaskGroup` configuration (#19886 ) This PR is the first on two that will implement the new Disconnect block. In this PR the new block is introduced to be backwards compatible with the fields it will replace. For more information refer to this RFC and this ticket.	2024-02-19 16:41:35 +01:00
Tim Gross	0935f443dc	vault: support allowing tokens to expire without refresh (#19691 ) Some users with batch workloads or short-lived prestart tasks want to derive a Vaul token, use it, and then allow it to expire without requiring a constant refresh. Add the `vault.allow_token_expiration` field, which works only with the Workload Identity workflow and not the legacy workflow. When set to true, this disables the client's renewal loop in the `vault_hook`. When Vault revokes the token lease, the token will no longer be valid. The client will also now automatically detect if the Vault auth configuration does not allow renewals and will disable the renewal loop automatically. Note this should only be used when a secret is requested from Vault once at the start of a task or in a short-lived prestart task. Long-running tasks should never set `allow_token_expiration=true` if they obtain Vault secrets via `template` blocks, as the Vault token will expire and the template runner will continue to make failing requests to Vault until the `vault_retry` attempts are exhausted. Fixes: https://github.com/hashicorp/nomad/issues/8690	2024-01-10 14:49:02 -05:00
Mike Nomitch	31f4296826	Adds support for failures before warning to Consul service checks (#19336 ) Adds support for failures before warning and failures before critical to the automatically created Nomad client and server services in Consul	2023-12-14 11:33:31 -08:00
Juana De La Cuesta	cf539c405e	Add a new parameter to avoid starting a replacement for lost allocs (#19101 ) This commit introduces the parameter preventRescheduleOnLost which indicates that the task group can't afford to have multiple instances running at the same time. In the case of a node going down, its allocations will be registered as unknown but no replacements will be rescheduled. If the lost node comes back up, the allocs will reconnect and continue to run. In case of max_client_disconnect also being enabled, if there is a reschedule policy, an error will be returned. Implements issue #10366 Co-authored-by: Dom Lavery <dom@circleci.com> Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2023-12-06 12:28:42 +01:00
James Rasell	0f0b9a1a3c	action: add job action name validation (#19145 )	2023-11-22 08:02:49 +00:00
Michael Schurter	c4ae91f8be	Fix WorkloadIdentity.TTL handling, jobspec2 testing, and hcl1 vs 2 parsing (#19024 ) * make the little dots consistent * don't trim delimiter as that over matches * test jobspec2 package * copy api/WorkloadIdentity.TTL -> structs * test ttl parsing * fix hcl1 v 2 parsing mismatch * make jobspec(1) tests match jobspec2 tests	2023-11-08 09:01:16 -08:00
Michael Schurter	e49ca3c431	identity: Implement `change_mode` (#18943 ) * identity: support change_mode and change_signal wip - just jobspec portion * test struct * cleanup some insignificant boogs * actually implement change mode * docs tweaks * add changelog * test identity.change_mode operations * use more words in changelog * job endpoint tests * address comments from code review --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2023-11-01 09:41:11 -05:00
Phil Renaud	8902afe651	Nomad Actions (#18794 ) * Scaffolding actions (#18639) * Task-level actions for job submissions and retrieval * FIXME: Temporary workaround to get ember dev server to pass exec through to 4646 * Update api/tasks.go Co-authored-by: Tim Gross <tgross@hashicorp.com> * Update command/agent/job_endpoint.go Co-authored-by: Tim Gross <tgross@hashicorp.com> * Diff and copy implementations * Action structs get their own file, diff updates to behave like our other diffs * Test to observe actions changes in a version update * Tests migrated into structs/diff_test and modified with PR comments in mind * APIActionToSTructsAction now returns a new value * de-comment some plain parts, remove unused action lookup * unused param in action converter --------- Co-authored-by: Tim Gross <tgross@hashicorp.com> * New endpoint: job/:id/actions (#18690) * unused param in action converter * backing out of parse_job level and moved toward new endpoint level * Adds taskName and taskGroupName to actions at job level * Unmodified job mock actions tests * actionless job test * actionless job test * Multi group multi task actions test * HTTP method check for GET, cleaner errors in job_endpoint_test * decomment * Actions aggregated at job model level (#18733) * Removal of temporary fix to proxy to 4646 * Run Action websocket endpoint (#18760) * Working demo for review purposes * removal of cors passthru for websockets * Remove job_endpoint-specific ws handlers and aimed at existing alloc exec handlers instead * PR comments adressed, no need for taskGroup pass, better group and task lookups from alloc * early return in action validate and removed jobid from req args per PR comments * todo removal, we're checking later in the rpc * boolean style change on tty * Action CLI command (#18778) * Action command init and stuck-notes * Conditional reqpath to aim at Job action endpoint * De-logged * General CLI command cleanup, observe namespace, pass action as string, get random alloc w group adherence * tab and varname cleanup * Remove action param from Allocations().Exec calls * changelog * dont nil-check acl --------- Co-authored-by: Tim Gross <tgross@hashicorp.com>	2023-10-20 13:05:55 -04:00
Seth Hoenig	83720740f5	core: plumbing to support numa aware scheduling (#18681 ) * core: plumbing to support numa aware scheduling * core: apply node resources compatibility upon fsm rstore Handle the case where an upgraded server dequeus an evaluation before a client triggers a new fingerprint - which would be needed to cause the compatibility fix to run. By running the compat fix on restore the server will immediately have the compatible pseudo topology to use. * lint: learn how to spell pseudo	2023-10-19 15:09:30 -05:00
Tim Gross	d0957eb109	Consul: agent config updates for WI (#18774 ) This changeset makes two changes: * Removes the `consul.use_identity` field from the agent configuration. This behavior is properly covered by the presence of `consul.service_identity` / `consul.task_identity` blocks. * Adds a `consul.task_auth_method` and `consul.service_auth_method` fields to the agent configuration. This allows the cluster administrator to choose specific Consul Auth Method names for their environment, with a reasonable default.	2023-10-17 14:42:14 -04:00
Tim Gross	5001bf4547	consul: use constant instead of "default" literal (#18611 ) Use the constant `structs.ConsulDefaultCluster` instead of the string literal "default", as we've done for Vault.	2023-09-28 16:50:21 -04:00
Luiz Aoqui	868aba57bb	vault: update identity name to start with `vault_` (#18591 ) * vault: update identity name to start with `vault_` In the original proposal, workload identities used to derive Vault tokens were expected to be called just `vault`. But in order to support multiple Vault clusters it is necessary to associate identities with specific Vault cluster configuration. This commit implements a new proposal to have Vault identities named as `vault_<cluster>`.	2023-09-27 15:53:28 -03:00
James Rasell	d923fc554d	consul/connect: add new fields to Consul Connect upstream block (#18430 ) Co-authored-by: Horacio Monsalvo <horacio.monsalvo@southworks.com>	2023-09-11 16:02:52 +01:00
Tim Gross	7cdd592809	jobspec: support `cluster` field for Vault block (#18408 ) This field supports the upcoming ENT-only multiple Vault clusters feature. The job validation and mutation hooks will come in a separate PR. Ref: https://github.com/hashicorp/team-nomad/issues/404	2023-09-07 10:15:28 -04:00
Tim Gross	7863d7bcbb	jobspec: support `cluster` field for Consul and Service blocks (#18409 ) This field supports the upcoming ENT-only multiple Consul clusters feature. The job validation and mutation hooks will come in a separate PR. Ref: https://github.com/hashicorp/team-nomad/issues/404	2023-09-07 09:48:49 -04:00
Андрей Неустроев	3e61b3a37d	Add multiple times in periodic jobs (#17858 )	2023-08-22 15:42:31 -04:00
Piotr Kazmierczak	9fa39eb829	jobspec: add nomad_service field and identity block (#18239 ) This PR introduces updates to the jobspec required for workload identity support for services. --------- Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2023-08-21 20:07:47 +02:00
Luiz Aoqui	196213c451	jobspec: add `role` to `vault` (#18257 )	2023-08-18 15:29:02 -04:00
hashicorp-copywrite[bot]	a9d61ea3fd	Update copyright file headers to BUSL-1.1	2023-08-10 17:27:29 -05:00
Gerard Nguyen	9e98d694a6	feature: Add new field render_templates on restart block (#18054 ) This feature is necessary when user want to explicitly re-render all templates on task restart. E.g. to fetch all new secrets from Vault, even if the lease on the existing secrets has not been expired.	2023-07-28 11:53:32 -07:00
Ville Vesilehto	2c463bb038	chore(lint): use Go stdlib variables for HTTP methods and status codes (#17968 )	2023-07-26 15:28:09 +01:00
grembo	6f04b91912	Add `disable_file` parameter to job's `vault` stanza (#13343 ) This complements the `env` parameter, so that the operator can author tasks that don't share their Vault token with the workload when using `image` filesystem isolation. As a result, more powerful tokens can be used in a job definition, allowing it to use template stanzas to issue all kinds of secrets (database secrets, Vault tokens with very specific policies, etc.), without sharing that issuing power with the task itself. This is accomplished by creating a directory called `private` within the task's working directory, which shares many properties of the `secrets` directory (tmpfs where possible, not accessible by `nomad alloc fs` or Nomad's web UI), but isn't mounted into/bound to the container. If the `disable_file` parameter is set to `false` (its default), the Vault token is also written to the NOMAD_SECRETS_DIR, so the default behavior is backwards compatible. Even if the operator never changes the default, they will still benefit from the improved behavior of Nomad never reading the token back in from that - potentially altered - location.	2023-06-23 15:15:04 -04:00
Luiz Aoqui	4f7c38b2a7	node pools: namespace integration (#17562 ) Add structs and fields to support the Nomad Pools Governance Enterprise feature of controlling node pool access via namespaces. Nomad Enterprise allows users to specify a default node pool to be used by jobs that don't specify one. In order to accomplish this, it's necessary to distinguish between a job that explicitly uses the `default` node pool and one that did not specify any. If the `default` node pool is set during job canonicalization it's impossible to do this, so this commit allows a job to have an empty node pool value during registration but sets to `default` at the admission controller mutator. In order to guarantee state consistency the state store validates that the job node pool is set and exists before inserting it.	2023-06-16 16:30:22 -04:00
Tim Gross	2d059bbf22	node pools: add `node_pool` field to job spec (#17379 ) This changeset only adds the `node_pool` field to the jobspec, and ensures that it gets picked up correctly as a change. Without the rest of the implementation landed yet, the field will be ignored.	2023-06-01 16:08:55 -04:00
Tim Gross	2aa3c746c4	logs: fix missing allocation logs after update to Nomad 1.5.4 (#17087 ) When the server restarts for the upgrade, it loads the `structs.Job` from the Raft snapshot/logs. The jobspec has long since been parsed, so none of the guards around the default value are in play. The empty field value for `Enabled` is the zero value, which is false. This doesn't impact any running allocation because we don't replace running allocations when either the client or server restart. But as soon as any allocation gets rescheduled (ex. you drain all your clients during upgrades), it'll be using the `structs.Job` that the server has, which has `Enabled = false`, and logs will not be collected. This changeset fixes the bug by adding a new field `Disabled` which defaults to false (so that the zero value works), and deprecates the old field. Fixes #17076	2023-05-04 16:01:18 -04:00
Tim Gross	30bc456f03	logs: allow disabling log collection in jobspec (#16962 ) Some Nomad users ship application logs out-of-band via syslog. For these users having `logmon` (and `docker_logger`) running is unnecessary overhead. Allow disabling the logmon and pointing the task's stdout/stderr to /dev/null. This changeset is the first of several incremental improvements to log collection short of full-on logging plugins. The next step will likely be to extend the internal-only task driver configuration so that cluster administrators can turn off log collection for the entire driver. --- Fixes: #11175 Co-authored-by: Thomas Weber <towe75@googlemail.com>	2023-04-24 10:00:27 -04:00
Seth Hoenig	2c44cbb001	api: enable support for setting original job source (#16763 ) * api: enable support for setting original source alongside job This PR adds support for setting job source material along with the registration of a job. This includes a new HTTP endpoint and a new RPC endpoint for making queries for the original source of a job. The HTTP endpoint is /v1/job/<id>/submission?version=<version> and the RPC method is Job.GetJobSubmission. The job source (if submitted, and doing so is always optional), is stored in the job_submission memdb table, separately from the actual job. This way we do not incur overhead of reading the large string field throughout normal job operations. The server config now includes job_max_source_size for configuring the maximum size the job source may be, before the server simply drops the source material. This should help prevent Bad Things from happening when huge jobs are submitted. If the value is set to 0, all job source material will be dropped. * api: avoid writing var content to disk for parsing * api: move submission validation into RPC layer * api: return an error if updating a job submission without namespace or job id * api: be exact about the job index we associate a submission with (modify) * api: reword api docs scheduling * api: prune all but the last 6 job submissions * api: protect against nil job submission in job validation * api: set max job source size in test server * api: fixups from pr	2023-04-11 08:45:08 -05:00
hashicorp-copywrite[bot]	f005448366	[COMPLIANCE] Add Copyright and License Headers	2023-04-10 15:36:59 +00:00
Horacio Monsalvo	5957880112	connect: add meta on ConsulSidecarService (#16705 ) Co-authored-by: Sol-Stiep <sol.stiep@southworks.com>	2023-03-30 16:09:28 -04:00
Alessio Perugini	365ccf4377	Allow configurable range of Job priorities (#16084 )	2023-02-17 09:23:13 -05:00
Jorge Marey	340ad2db58	Rename fields on proxyConfig (#15541 ) * Change api Fields for expose and paths * Add changelog entry * changelog: add deprecation notes about connect fields * api: minor style tweaks --------- Co-authored-by: Seth Hoenig <shoenig@duck.com>	2023-01-30 09:31:16 -06:00
Piotr Kazmierczak	949a6f60c7	renamed stanza to block for consistency with other projects (#15941 )	2023-01-30 15:48:43 +01:00
Charlie Voiselle	52a254ba22	template: error on missing key (#15141 ) * Support error_on_missing_value for templates * Update docs for template stanza	2022-11-04 13:23:01 -04:00
Piotr Kazmierczak	34e4b080f6	template: custom change_mode scripts (#13972 ) This PR adds the functionality of allowing custom scripts to be executed on template change. Resolves #2707	2022-08-24 17:43:01 +02:00
Luiz Aoqui	934bafb922	template: use pointer values for gid and uid (#14203 ) When a Nomad agent starts and loads jobs that already existed in the cluster, the default template uid and gid was being set to 0, since this is the zero value for int. This caused these jobs to fail in environments where it was not possible to use 0, such as in Windows clients. In order to differentiate between an explicit 0 and a template where these properties were not set we need to use a pointer.	2022-08-22 16:25:49 -04:00
Michael Schurter	01648e615a	client: fix data races in config handling (#14139 ) Before this change, Client had 2 copies of the config object: config and configCopy. There was no guidance around which to use where (other than configCopy's comment to pass it to alloc runners), both are shared among goroutines and mutated in data racy ways. At least at one point I think the idea was to have `config` be mutable and then grab a lock to overwrite `configCopy`'s pointer atomically. This would have allowed alloc runners to read their config copies in data race safe ways, but this isn't how the current implementation worked. This change takes the following approach to safely handling configs in the client: 1. `Client.config` is the only copy of the config and all access must go through the `Client.configLock` mutex 2. Since the mutex only protects the config pointer itself and not fields inside the Config struct: all config mutation must be done on a copy of the config, and then Client's config pointer is overwritten while the mutex is acquired. Alloc runners and other goroutines with the old config pointer will not see config updates. 3. Deep copying is implemented on the Config struct to satisfy the previous approach. The TLS Keyloader is an exception because it has its own internal locking to support mutating in place. An unfortunate complication but one I couldn't find a way to untangle in a timely fashion. 4. To facilitate deep copying I made an internally backward incompatible API change: our `helper/funcs` used to turn containers (slices and maps) with 0 elements into nils. This probably saves a few memory allocations but makes it very easy to cause panics. Since my new config handling approach uses more copying, it became very difficult to ensure all code that used containers on configs could handle nils properly. Since this code has caused panics in the past, I fixed it: nil containers are copied as nil, but 0-element containers properly return a new 0-element container. No more "downgrading to nil!"	2022-08-18 16:32:04 -07:00
Piotr Kazmierczak	c4be2c6078	cleanup: replace TypeToPtr helper methods with pointer.Of (#14151 ) Bumping compile time requirement to go 1.18 allows us to simplify our pointer helper methods.	2022-08-17 18:26:34 +02:00
Seth Hoenig	47d44d62bb	cleanup: consul mesh gateway type need not be pointer This PR changes the use of structs.ConsulMeshGateway to value types instead of via pointers. This will help in a follow up PR where we cleanup a lot of custom comparison code with helper functions instead.	2022-08-13 11:26:58 -05:00

1 2 3 4 5

209 Commits