The agent syslog write handler was unable to handle JSON log
lines correctly, meaning all syslog entries showed as NOTICE
level when the JSON log format was in use.
This change adds a new handler to the Nomad agent which can
parse JSON log lines and correctly determine the log level of
each entry.
The change also removes the use of a filter from the default log
format handler. This is not needed as the logs are fed into the
syslog handler via hclog, which is responsible for level
filtering.
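
As a rough sketch of the idea (the `priorityForJSONLine` helper and the
level-to-priority mapping below are illustrative assumptions, not the actual
handler; the `@level` key follows hclog's JSON output):

```go
package main

import (
	"encoding/json"
	"fmt"
	"log/syslog"
)

// levelToPriority maps an hclog level name to a syslog priority. The
// mapping here is illustrative rather than Nomad's actual implementation.
func levelToPriority(level string) syslog.Priority {
	switch level {
	case "trace", "debug":
		return syslog.LOG_DEBUG
	case "info":
		return syslog.LOG_INFO
	case "warn":
		return syslog.LOG_WARNING
	case "error":
		return syslog.LOG_ERR
	default:
		return syslog.LOG_NOTICE
	}
}

// priorityForJSONLine inspects a JSON-formatted log line and returns the
// syslog priority to use when forwarding it.
func priorityForJSONLine(line []byte) syslog.Priority {
	var entry struct {
		Level string `json:"@level"`
	}
	if err := json.Unmarshal(line, &entry); err != nil {
		// not valid JSON: fall back to the previous behaviour
		return syslog.LOG_NOTICE
	}
	return levelToPriority(entry.Level)
}

func main() {
	line := []byte(`{"@level":"error","@message":"agent failed to start"}`)
	fmt.Println(priorityForJSONLine(line) == syslog.LOG_ERR) // true
}
```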
We can reduce the volume specification boilerplate many users will need by
setting a default capability on a dynamic host volume when none is set. The
default capability will allow using the volume in read/write mode on its node,
with no further restrictions except those that might be set in the jobspec.
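
A minimal sketch of the defaulting, with illustrative type and field names
rather than the real structs:

```go
package main

import "fmt"

// HostVolumeCapability is a stand-in for the real capability type; the
// field names and mode strings here are assumptions for illustration.
type HostVolumeCapability struct {
	AttachmentMode string // e.g. "file-system"
	AccessMode     string // e.g. "single-node-writer"
}

type HostVolume struct {
	Name                  string
	RequestedCapabilities []*HostVolumeCapability
}

// applyDefaultCapability fills in a read/write, single-node capability
// when the volume spec doesn't provide one.
func applyDefaultCapability(v *HostVolume) {
	if len(v.RequestedCapabilities) == 0 {
		v.RequestedCapabilities = []*HostVolumeCapability{{
			AttachmentMode: "file-system",
			AccessMode:     "single-node-writer",
		}}
	}
}

func main() {
	v := &HostVolume{Name: "scratch"}
	applyDefaultCapability(v)
	fmt.Println(v.RequestedCapabilities[0].AccessMode) // single-node-writer
}
```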
Enterprise governance checks happen after dynamic host volumes are placed, so if
node pool governance is active and you don't set a node pool or node ID for a
volume, it's possible to get a placement that fails node pool governance even
though there might be other nodes in the cluster that would be valid placements.
Move the node pool governance for host volumes into the placement path, so that
we check a specific node pool when a node pool or node ID is set, but otherwise
filter candidate nodes by node pool.
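
A sketch of the placement-path filtering described above; `Node` and
`filterByNodePool` are stand-ins, not the real scheduler code:

```go
package main

import "fmt"

// Node is a stand-in for the server's node struct; only the fields
// needed for this sketch are included.
type Node struct {
	ID       string
	NodePool string
}

// filterByNodePool keeps only candidate nodes whose pool is allowed by
// governance. When the volume pins a specific node pool (or a node ID
// that implies one), the allowed set contains just that pool.
func filterByNodePool(candidates []*Node, allowed map[string]bool) []*Node {
	out := []*Node{}
	for _, n := range candidates {
		if allowed[n.NodePool] {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	nodes := []*Node{
		{ID: "a", NodePool: "default"},
		{ID: "b", NodePool: "prod"},
	}
	fmt.Println(len(filterByNodePool(nodes, map[string]bool{"prod": true}))) // 1
}
```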
This changeset is the CE version of ENT/2200.
Ref: https://hashicorp.atlassian.net/browse/NET-11549
Ref: https://github.com/hashicorp/nomad-enterprise/pull/2200
Update dynamic host volume validation and update logic to allow for changes to
the node pool and plugin ID. If the client's node pool changes we'll sync up the
correct node pool for the volumes already placed on that client. We'll also
allow the plugin ID to be changed so that new versions of a plugin can support
the same volume over time.
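
A hedged sketch of the update validation; the struct fields and the exact rule
set (rejecting node ID changes while permitting node pool and plugin ID
changes) are assumptions for illustration:

```go
package main

import (
	"errors"
	"fmt"
)

// HostVolume is a minimal stand-in for the server's volume struct.
type HostVolume struct {
	ID       string
	NodeID   string
	NodePool string
	PluginID string
}

// validateUpdate sketches the rules described above: plugin ID and node
// pool may change across updates, while moving a placed volume to a
// different node is rejected.
func validateUpdate(existing, updated *HostVolume) error {
	if existing == nil {
		return nil // new volume, nothing to compare against
	}
	if updated.NodeID != "" && updated.NodeID != existing.NodeID {
		return errors.New("cannot change node ID of an existing volume")
	}
	// node pool and plugin ID changes are allowed
	return nil
}

func main() {
	existing := &HostVolume{ID: "vol1", NodeID: "n1", NodePool: "default", PluginID: "mkdir-v1"}
	updated := &HostVolume{ID: "vol1", NodeID: "n1", NodePool: "prod", PluginID: "mkdir-v2"}
	fmt.Println(validateUpdate(existing, updated)) // <nil>
}
```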
The nightly runs for E2E have been failing the recently added dynamic host
volumes tests for a number of reasons:
* Adding timing logs to the tests shows that it can take over 5s (the original
test timeout) for the client fingerprint to show up on the client. This seems
like a lot, but it appears to be host-dependent because it's much faster locally.
Extend the timeout and leave in the timing logs so that we can keep an eye on
this problem in the future.
* The register test doesn't wait for the dispatched job to complete, and the
dispatched job was actually broken when TLS was in use because we weren't using
the Task API socket. Fix the jobspec for the dispatched job and add waiting
for the dispatched allocation to be marked complete before checking for the
volume on the server.
I've also changed both the mounter jobs to batch workloads, so that we don't
have to wait 10s for the deployment to complete.
In #24694 we did a major refactoring of the E2E Terraform configuration. After
deploying a cluster this morning, I noticed a few moved/removed files were not
reflected in the .gitignore files. This changeset updates the .gitignore to have
no unstaged files after applying.
When using the register workflow, `capacity_max` is ignored and so is likely
unset. If the volume is later updated, the check we had for valid updates
assumes that the value was previously set. Only perform this check if the value
is set.
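
A small sketch of the fix, with illustrative parameter names:

```go
package main

import "fmt"

// validateCapacityUpdate sketches the change: skip the comparison when the
// existing volume never set capacity_max (the register workflow leaves it
// at zero).
func validateCapacityUpdate(existingCapacityMaxBytes, updatedCapacityBytes int64) error {
	if existingCapacityMaxBytes == 0 {
		return nil // capacity_max was never set, so there's nothing to check
	}
	if updatedCapacityBytes > existingCapacityMaxBytes {
		return fmt.Errorf("updated capacity %d exceeds capacity_max %d",
			updatedCapacityBytes, existingCapacityMaxBytes)
	}
	return nil
}

func main() {
	fmt.Println(validateCapacityUpdate(0, 1<<30))     // <nil>: no max was set
	fmt.Println(validateCapacityUpdate(1<<20, 1<<30)) // error: exceeds max
}
```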
We changed the list of access modes available for dynamic host volumes in #24705
but neglected to change them in the API package. Update the API package to
match.
Ref: https://github.com/hashicorp/nomad/pull/24705
* func: move infra provisioning to a module and remove providers
* func: update paths
* func: update more paths
* func: update path inside bootstrap script
* style: remove debug prints on bootstrap scripts
* Delete e2e/terraform/csi/input/volume-efs.hcl
* fix: update keys path to use module path instead of root
* fix: add missing headers
* fix: update keys directory inside provision-nomad
* style: format hcl files
* Update compute.tf
* Update e2e/terraform/main.tf
Co-authored-by: Tim Gross <tgross@hashicorp.com>
* Update e2e/terraform/provision-infra/compute.tf
Co-authored-by: Tim Gross <tgross@hashicorp.com>
* fix: update more paths
* fix: fmt hcl files
* func: final paths revision for running e2e locally
* fix: make path of certs relative to module for the bootstrap
* func: final paths revision for running e2e locally
* Update network.tf
* fix: fix typo and add success message
* fix: remove the test name from token to avoid long names and use name for vol to avoid collisions
* func: unify the uploads folder
* func: make the uploads file one per cluster
* func: Add outputs with all data necessary to connect to the cluster
* fix: make nomad token a sensitive output
* Update bootstrap-nomad.sh
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
In anticipation of having quotas for dynamic host volumes, we want the user
experience of the storage limits to feel integrated with the other resource
limits. This is currently prevented by reusing the `Resources` type instead of
having a specific type for `QuotaResources`.
Update the quota limit/usage types to use a `QuotaResources` type that includes
a new storage resources quota block. The wire format for the two types is
compatible, so we can migrate the existing variables limit in the FSM.
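
As a sketch, the new quota shape might look roughly like this; the struct and
field names are assumptions, not the Enterprise implementation:

```go
package main

import "fmt"

// QuotaStorageResources and QuotaResources are sketches; the real
// Enterprise field names may differ.
type QuotaStorageResources struct {
	VariablesMB   int // existing variables limit, migrated into the new block
	HostVolumesMB int // new limit for dynamic host volume capacity
}

type QuotaResources struct {
	CPU      int
	MemoryMB int
	Storage  *QuotaStorageResources
}

func main() {
	limit := QuotaResources{
		CPU:      1000,
		MemoryMB: 2048,
		Storage:  &QuotaStorageResources{VariablesMB: 10, HostVolumesMB: 1000},
	}
	fmt.Printf("%+v %+v\n", limit, *limit.Storage)
}
```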
Also fixes improper parallelism in the quota init test, where we change the
working directory to avoid file write conflicts; this breaks when multiple
tests are executed in the same process.
Ref: https://github.com/hashicorp/nomad-enterprise/pull/2096
The output of `GetDynamicHostVolumes` is a slice, but that slice is constructed
by iterating over a map and isn't sorted. Sort the output in the test to
eliminate a test flake.
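
For illustration, the usual way to make such a comparison deterministic (the
stub type and ordering field are assumptions):

```go
package main

import (
	"fmt"
	"sort"
)

type hostVolumeStub struct{ ID string }

func main() {
	// Map iteration order is randomized in Go, so a slice built from a map
	// must be sorted before a test can compare it deterministically.
	got := []hostVolumeStub{{ID: "vol-b"}, {ID: "vol-a"}}
	sort.Slice(got, func(i, j int) bool { return got[i].ID < got[j].ID })
	fmt.Println(got[0].ID, got[1].ID) // vol-a vol-b
}
```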
Some comment cleanups as we're wrapping up dynamic host volumes work:
* We're not going to implement mount_options for host volumes, as dynamic
host volumes don't have the equivalent of the stage/publish phase that CSI
volumes do. Users who want that sort of thing can pass mount options in the
`parameter` field during volume create/register.
* The scheduler feasibility check prevents a dynamic host volume from being
claimed by a job in the wrong namespace, but the comment incorrectly identifies
that code path as only being about the race between fingerprint and delete.
Update the comment to make the intent clear so that we don't accidentally
remove this behavior in the future.
* Update who-uses-nomad.mdx
Our new contract with Roblox states that we can't mention anywhere on our sites that they use us.
* Update who-uses-nomad.mdx
Edited the sentence above the companies list to more accurately reflect them.
Also added Target to the list with a link to their case study.
Initial end-to-end tests for dynamic host volumes. This includes tests for two
workflows:
* One where a dynamic host volume is created by a plugin and then mounted by a job.
* Another where a dynamic host volume is created out-of-band and registered by a
job, then mounted by another job.
This changeset also moves the existing `volumes` E2E test package to the
better-named `volume_mounts`.
Ref: https://hashicorp.atlassian.net/browse/NET-11551
When we register a volume without a plugin, we need to send a client RPC so that
the node fingerprint can be updated. The registered volume also needs to be
written to client state so that we can restore the fingerprint after a restart.
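
A toy sketch of that flow, with hypothetical types standing in for the
client's real state store and fingerprint plumbing:

```go
package main

import "fmt"

// registeredVolume and clientState are toy stand-ins; the real client
// persists volumes in its state database alongside allocations.
type registeredVolume struct {
	ID       string
	Name     string
	HostPath string
}

type clientState struct {
	volumes map[string]registeredVolume
}

// register sketches the client RPC handler: persist the volume, then
// update the node fingerprint so the servers can see it.
func (s *clientState) register(v registeredVolume, fingerprint func(registeredVolume)) {
	s.volumes[v.ID] = v
	fingerprint(v)
}

// restore sketches a client restart: volumes are read back from state
// and re-added to the fingerprint without any plugin involvement.
func (s *clientState) restore(fingerprint func(registeredVolume)) {
	for _, v := range s.volumes {
		fingerprint(v)
	}
}

func main() {
	s := &clientState{volumes: map[string]registeredVolume{}}
	report := func(v registeredVolume) { fmt.Println("fingerprinted", v.Name) }
	s.register(registeredVolume{ID: "v1", Name: "data", HostPath: "/srv/data"}, report)
	s.restore(report) // fingerprinted again after a "restart"
}
```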
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
tl;dr - runtime code is fine but tests should match reality
The Nomad Client Agent is the only consumer of the
`Node.Derive{SI,Vault}Token` RPCs, therefore tests of the RPCs should
match Nomad Client behavior.
- DeriveVaultToken code: a9ee66a6ef/client/client.go (L2904-L2917)
- DeriveSIToken code: a9ee66a6ef/client/client.go (L2988-L2997)
Both of those client code paths include the Node SecretID in both the
request's SecretID field and the embedded `QueryOptions.AuthToken` field.
This patch updates server tests to match that behavior. The tests pass
either way.
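
A sketch of the request shape the tests now mirror; the structs here are
illustrative stand-ins for the real request and query-options types:

```go
package main

import "fmt"

// queryOptions and deriveTokenRequest are illustrative stand-ins for the
// real QueryOptions and Derive{Vault,SI}TokenRequest types.
type queryOptions struct {
	AuthToken string
}

type deriveTokenRequest struct {
	NodeID   string
	SecretID string
	AllocID  string
	queryOptions
}

func main() {
	nodeSecret := "node-secret-uuid"
	// mirror the client: the node secret goes in the request body and in
	// the embedded query options' AuthToken
	req := deriveTokenRequest{
		NodeID:       "node-1",
		SecretID:     nodeSecret,
		AllocID:      "alloc-1",
		queryOptions: queryOptions{AuthToken: nodeSecret},
	}
	fmt.Println(req.SecretID == req.AuthToken) // true
}
```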
When the Nomad client restarts and restores allocations, the network namespace
for an allocation may exist but no longer be correctly configured. For example,
if the host is rebooted and the task was a Docker task using a pause container,
the network namespace may be recreated by the docker daemon.
When we restore an allocation, use the CNI "check" command to verify that any
existing network namespace matches the expected configuration. This requires CNI
plugins of at least version 1.2.0 to avoid a bug in older plugin versions that
would cause the check to fail.
If the check fails, destroy the network namespace and try to recreate it from
scratch once. If that fails in the second pass, fail the restore so that the
allocation can be recreated (rather than silently having networking fail).
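
A sketch of that restore flow under an assumed `networkManager` interface
standing in for the real CNI wiring; the names and the final re-check are
illustrative, not the actual client code:

```go
package main

import (
	"errors"
	"fmt"
)

// networkManager is a hypothetical interface standing in for the CNI
// wiring; the real code runs the CNI plugins' check/del/add commands
// through the configured CNI library.
type networkManager interface {
	Check(allocID string) error
	Destroy(allocID string) error
	Create(allocID string) error
}

// restoreNetwork sketches the flow: verify the existing namespace, and if
// the check fails, tear it down and rebuild it exactly once before failing
// the restore.
func restoreNetwork(nm networkManager, allocID string) error {
	if err := nm.Check(allocID); err == nil {
		return nil // existing namespace still matches the expected config
	}
	if err := nm.Destroy(allocID); err != nil {
		return fmt.Errorf("failed to destroy stale network namespace: %w", err)
	}
	if err := nm.Create(allocID); err != nil {
		return fmt.Errorf("failed to recreate network namespace: %w", err)
	}
	if err := nm.Check(allocID); err != nil {
		// fail the restore so the allocation is recreated rather than
		// silently running with broken networking
		return fmt.Errorf("network namespace failed check after recreate: %w", err)
	}
	return nil
}

// fakeNM is a trivial in-memory implementation for the example below.
type fakeNM struct{ checkErr error }

func (f *fakeNM) Check(string) error   { return f.checkErr }
func (f *fakeNM) Destroy(string) error { return nil }
func (f *fakeNM) Create(string) error  { f.checkErr = nil; return nil }

func main() {
	nm := &fakeNM{checkErr: errors.New("netns missing expected bridge")}
	fmt.Println(restoreNetwork(nm, "alloc-1")) // <nil> after one rebuild
}
```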
This should fix the gap left by #24650 for Docker task drivers and any other
drivers with the `MustInitiateNetwork` capability.
Fixes: https://github.com/hashicorp/nomad/issues/24292
Ref: https://github.com/hashicorp/nomad/pull/24650
Adds an additional check in the Keyring.Delete RPC to make sure we're not
trying to delete a key that's been used to encrypt a variable. It also adds a
-force flag for the CLI/API to sidestep that check.
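
A sketch of the guard, with the variable iteration standing in for the real
state-store query and the types as illustrative stand-ins:

```go
package main

import "fmt"

// variableMeta is a stand-in for stored variable metadata; only the key ID
// used to encrypt it matters for this sketch.
type variableMeta struct {
	Path  string
	KeyID string
}

// checkKeyUnused sketches the new guard in the Keyring.Delete RPC: refuse
// to delete a key that still encrypts a variable, unless force was set.
func checkKeyUnused(keyID string, vars []variableMeta, force bool) error {
	if force {
		return nil
	}
	for _, v := range vars {
		if v.KeyID == keyID {
			return fmt.Errorf("key %q is in use by variable %q; use -force to delete anyway",
				keyID, v.Path)
		}
	}
	return nil
}

func main() {
	vars := []variableMeta{{Path: "nomad/jobs/web", KeyID: "key-1"}}
	fmt.Println(checkKeyUnused("key-1", vars, false)) // error: key in use
	fmt.Println(checkKeyUnused("key-1", vars, true))  // <nil>
}
```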
The recent change to collection via a "one-shot" Docker API call
did not update the stream boolean argument. This results in the
PreCPUStats values being zero and therefore breaking the CPU
calculations which rely on this data. The base fix is to update
the passed boolean parameter to match the desired non-streaming
behaviour. The non-streaming API call correctly returns the
PreCPUStats data which can be seen in the added unit test.
The most recent change also modified the behaviour of the
collectStats goroutine, so that any error encountered results
in the routine exiting. In the event this was a transient
error, the container will continue to run, but no stats will
be collected until the task is stopped and replaced. This PR
reverts that behaviour, so that an error encountered during a
stats collection run is logged and the collection process
continues, using a backoff between attempts.
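
A sketch of the restored collection loop; the function signature, backoff
values, and collection interval are illustrative, not the driver's actual
code:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// collectStats sketches the restored behaviour: a collection error is
// logged and retried with a backoff instead of terminating the goroutine.
// The collect function stands in for the one-shot (non-streaming) Docker
// stats API call.
func collectStats(ctx context.Context, collect func() error, logf func(string, ...any)) {
	backoff := time.Second
	const maxBackoff = 30 * time.Second

	for {
		select {
		case <-ctx.Done():
			return
		default:
		}

		if err := collect(); err != nil {
			logf("failed to collect stats, retrying: %v", err)
			time.Sleep(backoff)
			if backoff < maxBackoff {
				backoff *= 2
			}
			continue
		}
		backoff = time.Second // reset after a successful collection
		time.Sleep(time.Second)
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()

	calls := 0
	collect := func() error {
		calls++
		if calls == 1 {
			return errors.New("transient daemon error")
		}
		return nil
	}
	collectStats(ctx, collect, func(format string, args ...any) {
		fmt.Printf(format+"\n", args...)
	})
}
```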
a node can have only one volume with a given name.
the scheduler prevents duplicates, but can only
do so after the server knows about the volume.
this change prevents duplicates even when multiple
concurrent creates are issued faster than the
fingerprint/heartbeat interval.
users may still modify an existing volume, but only
if they set the `id` in the volume spec and
re-issue `nomad volume create`.
if a *static* vol is added to config with a name
already in use by a dynamic volume, the
dynamic volume takes precedence, but a warning is logged.
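
a rough sketch of the server-side name check, with stand-in types rather
than the real structs:

```go
package main

import "fmt"

// hostVolume is a minimal stand-in for the server's volume struct.
type hostVolume struct {
	ID     string
	Name   string
	NodeID string
}

// checkDuplicateName sketches the guard: a create for a volume name that
// already exists on the target node is rejected, unless the request
// carries the existing volume's id (i.e. it's an update).
func checkDuplicateName(req hostVolume, existingOnNode []hostVolume) error {
	for _, v := range existingOnNode {
		if v.Name == req.Name && v.ID != req.ID {
			return fmt.Errorf("volume %q already exists on node %s as %s",
				req.Name, v.NodeID, v.ID)
		}
	}
	return nil
}

func main() {
	existing := []hostVolume{{ID: "1111", Name: "data", NodeID: "node-1"}}
	fmt.Println(checkDuplicateName(hostVolume{Name: "data", NodeID: "node-1"}, existing))             // error
	fmt.Println(checkDuplicateName(hostVolume{ID: "1111", Name: "data", NodeID: "node-1"}, existing)) // <nil>
}
```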