nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-06 18:35:44 +03:00

Author	SHA1	Message	Date
Piotr Kazmierczak	2556ff9a0e	deps: update generate.sh script to msgpack v2 (#20186 )	2024-03-22 16:56:29 +01:00
Tim Gross	15162917c1	cni: fix regression in falling back to DNS owned by `dockerd` (#20189 ) In #20007 we fixed a bug where the DNS configuration set by CNI plugins was not threaded through to the task configuration. This resulted in a regression where a DNS override set by `dockerd` was not respected for `bridge` mode networking. Our existing handling of CNI DNS incorrectly assumed that the DNS field would be empty, when in fact it contains a single empty DNS struct. Handle this case correctly by checking whether the DNS struct we get back from CNI has any nameservers, and ignore it if it doesn't. Expand test coverage of this case. Fixes: https://github.com/hashicorp/nomad/issues/20174	2024-03-22 10:54:16 -04:00
Seth Hoenig	c36db1b005	drivers/testutil: set full filepath for envs when using unveil fs isolation (#20187 )	2024-03-22 09:46:17 -05:00
Michael Schurter	23e4b7c9d2	Upgrade go-msgpack to v2 (#20173 ) Replaces #18812 Upgraded with: ``` find . -name '*.go' -exec sed -i s/"github.com\/hashicorp\/go-msgpack\/codec"/"github.com\/hashicorp\/go-msgpack\/v2\/codec/" '{}' ';' find . -name '*.go' -exec sed -i s/"github.com\/hashicorp\/net-rpc-msgpackrpc"/"github.com\/hashicorp\/net-rpc-msgpackrpc\/v2/" '{}' ';' go get go get -v -u github.com/hashicorp/raft-boltdb/v2 go get -v github.com/hashicorp/serf@5d32001edfaa18d1c010af65db707cdb38141e80 ``` see https://github.com/hashicorp/go-msgpack/releases/tag/v2.1.0 for details	2024-03-21 11:44:23 -07:00
Luiz Aoqui	b5573b7470	docs: fix `invoke_scheduler` metrics (#20172 )	2024-03-21 10:57:30 -04:00
Tim Gross	7b9bce2d08	config: fix `client.template` config merging with defaults (#20165 ) When loading the client configuration, the user-specified `client.template` block was not properly merged with the default values. As a result, if the user set any `client.template` field, all the other fields defaulted to their zero values instead of the documented defaults. This changeset: * Adds the missing `Merge` method for the client template config and ensures it's called. * Makes a single source of truth for the default template configuration, instead of two different constructors. * Extends the tests to cover the merge of a partial block better. Fixes: https://github.com/hashicorp/nomad/issues/20164	2024-03-20 10:18:56 -04:00
Juana De La Cuesta	56bf253474	Add docs for disconnected block (#20147 ) Expand the job settings to include the disconnect block and mark the fields it will replace as deprecated.	2024-03-20 10:08:16 +01:00
Charlie Voiselle	7b27bc344b	[refactor] Move task directory destroy logic from alloc_dir.go to task_dir.go (#20006 ) * Move task directory destroy logic from alloc_dir to task_dir * Update errors to wrap error cause * Use constants for file permissions * Make multierror handling consistent. * Make helpers for directory creation * Move mount dir unlink to task_dir Unlink method * Make constant for file mode 710 Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2024-03-19 13:49:09 -04:00
Tim Gross	dc39c20e66	docs: make recommendation for collection interval vs scrape interval (#20056 ) Metrics tools that "pull" metrics, such as Prometheus, have a configurable interval for how frequently they scrape metrics. This should be greater or equal to the Nomad `telemetry.collection_interval` to avoid re-scraping metrics that cannot have been updated in that interval. Fixes: https://github.com/hashicorp/nomad/issues/20055	2024-03-19 08:56:29 -04:00
Charlie Voiselle	4cc90c0b22	Fix LeadershipTransfer tests (#20154 ) Multi-node Nomad clusters under test must use the RPC calls to bootstrap the ACL subsystem. The original implementation of the testcluster tried naively supersizing the single node behavior in TestRaftRemovePeer, which mutates the single node's state directly. In the case of a multi-node cluster, the RPC calls are necessary to ensure that the data is replicated via Raft to all of the cluster members.	2024-03-18 18:03:23 -04:00
Tim Gross	c4253470a0	autopilot: add `operator autopilot health` command (#20156 ) Add a command line operation that reports Enterprise autopilot data from the `/operator/autopilot/health` API. I've pulled this feature out of @lindleywhite's PR in the Enterprise repo. Ref: https://github.com/hashicorp/nomad-enterprise/pull/1394 Co-authored-by: Lindley <lindley@hashicorp.com>	2024-03-18 14:46:18 -04:00
Tim Gross	5138c1c82f	autopilot: add Enterprise health information to API endpoint (#20153 ) Add information about autopilot health to the `/operator/autopilot/health` API in Nomad Enterprise. I've pulled the CE changes required for this feature out of @lindleywhite's PR in the Enterprise repo. A separate PR will include a new `operator autopilot health` command that can present this information at the command line. Ref: https://github.com/hashicorp/nomad-enterprise/pull/1394 Co-authored-by: Lindley <lindley@hashicorp.com>	2024-03-18 11:38:17 -04:00
Tim Gross	1cbddfa8ce	acl: remove unused nil ACL object handling (#20150 ) As of #18754 which shipped in Nomad 1.7, we no longer need to nil-check the object returned by `ResolveACL` if there's no error return, because in the case where ACLs are disabled we return a special "ACLs disabled" ACL object. Checking nil is not a bug but should be discouraged because it opens us up to future bugs that would bypass ACLs. While working on an unrelated feature @lindleywhite discovered that we missed removing the nil check from several endpoints with our semgrep linter. This changeset fixes that. Co-Author: Lindley <lindley@hashicorp.com>	2024-03-18 10:04:51 -04:00
Tim Gross	695bb7ffcf	docs: improve wording around autoconfiguration via Consul (#20139 ) Fixes: https://github.com/hashicorp/nomad/issues/20132	2024-03-15 08:44:58 -04:00
Tim Gross	db195726a5	cli: add options to help string for `acl policy info` (#20138 ) Fixes: https://github.com/hashicorp/nomad/issues/20117	2024-03-15 08:44:50 -04:00
Juana De La Cuesta	ff72248c86	func: add new picker dependency (#20029 ) This commit introduces the new options for reconciling a reconnecting allocation and its replacement: best score (current implementation), keep original, keep replacement, and keep the one that has run the longest time. It is achieved by adding a new dependency to the allocReconciler that calls the corresponding function depending on the task group's disconnect strategy. For more detailed information, refer to the new stanza for disconnected clients RFC. It resolves 15144	2024-03-15 13:42:08 +01:00
Tim Gross	13617eee4b	template: improve internal documentation around shutdown (#20134 ) While investigating a report around possible consul-template shutdown issues, which didn't bear fruit, I found that some of the logic around template runner shutdown is unintuitive. * Add some doc strings to the places where someone might think we should be obviously stopping the runner or returning early. * Mark context argument for `Poststart`, `Stop`, and `Update` hooks as unused. No functional code changes.	2024-03-14 15:33:32 -04:00
Amir Abbas	40b8f17717	Support insecure flag on artifact (#20126 )	2024-03-14 10:59:20 -05:00
Seth Hoenig	bb54d16e4a	exec2: setup RPC plumbing for dynamic workload users (#20129 ) And pass the dynamic users pool from the client into the hook.	2024-03-13 14:06:52 -05:00
Seth Hoenig	05937ab75b	exec2: add client support for unveil filesystem isolation mode (#20115 ) * exec2: add client support for unveil filesystem isolation mode This PR adds support for a new filesystem isolation mode, "Unveil". The mode introduces an "alloc_mounts" directory where tasks have a user-owned directory structure made of bind mounts into the real alloc directory structure. This enables a task driver to use landlock (and maybe the real unveil on openbsd one day) to isolate a task to the task owned directory structure, providing sandboxing. * actually create alloc-mounts-dir directory * fix doc strings about alloc mount dir paths	2024-03-13 08:24:17 -05:00
Piotr Kazmierczak	428103ba12	Merge pull request #20122 from hashicorp/post-1.7.6-release Post 1.7.6 release	2024-03-12 12:24:41 +01:00
Piotr Kazmierczak	ec76b7768f	Merge release 1.7.6 files	2024-03-12 12:06:26 +01:00
hc-github-team-nomad-core	472ba2c740	Prepare for next release	2024-03-12 12:04:04 +01:00
hc-github-team-nomad-core	46182c2a83	Generate files for 1.7.6 release	2024-03-12 12:04:04 +01:00
carrychair	5f5b34db0e	remove repetitive words (#20110 ) Signed-off-by: carrychair <linghuchong404@gmail.com>	2024-03-11 08:52:08 +00:00
Seth Hoenig	286dce7a2a	exec2: add a client.users configuration block (#20093 ) * exec: add a client.users configuration block For now just add min/max dynamic user values; soon we can also absorb the "user.denylist" and "user.checked_drivers" options from the deprecated client.options map. * give the no-op pool implementation a better name * use explicit error types to make referencing them cleaner in tests * use import alias to not shadow package name	2024-03-08 16:02:32 -06:00
Giovanni Avelar	26a27bb12c	cli: add -json option on jobs status command (#18925 )	2024-03-08 16:03:52 -05:00
Luke Kysow	9c3bbd191a	Bump consul-template to 0.37.2 (#20105 )	2024-03-08 14:56:35 -05:00
Tim Gross	ac366521f2	deps: upgrade protobuf lib to 1.33.0 (#20100 ) Although Nomad is not vulnerable to CVE-2024-24786 because it's configured to discard unknown messages during unmarshaling, we should upgrade so that third-party vulnerability scanners don't detect the vulnerable version and complain. Also update go1.22.1 changelog entry to include CVEs	2024-03-08 10:55:55 -05:00
Seth Hoenig	2c1f5daad7	more test refactoring (#20092 ) * tests: swap testify for test in client/config * tests: swap testify for test in logmon/	2024-03-07 11:04:16 -06:00
Michael Schurter	3193ac204f	docs: skipping a major release is fine (#20075 ) Nomad has always placed an extremely high priority on backward compatibility. We have always aimed to support N-2 major releases and usually gone above and beyond that. The new https://www.hashicorp.com/long-term-support policy also mentions that N-2 is what we have always supported, so it's probably time for our docs to reflect that reality.	2024-03-06 08:57:12 -08:00
Michael Schurter	82fe2b5df6	docs: fix s/port-plan-failure (#20079 ) Fixes #20070	2024-03-06 08:56:31 -08:00
Seth Hoenig	55b0795866	build: upgrade to go1.22 (#20066 ) * build: upgrade to go1.22 * add cl * build: use codecgen from go-msgpack v1.1.5+base32 and stringer 0.18.0 for compatibility with go1.22 * ci: update golangci-lint to 1.56.2 * build: update hclogvet for go1.22 * build: bump to go1.22.1	2024-03-06 09:54:04 -06:00
Seth Hoenig	67554b8f91	exec2: implement dynamic workload users taskrunner hook (#20069 ) * exec2: implement dynamic workload users taskrunner hook This PR implements a TR hook for allocating dynamic workload users from a pool managed by the Nomad client. This adds a new task driver Capability, DynamicWorkloadUsers - which a task driver must indicate in order to make use of this feature. The client config plumbing is coming in a followup PR - in the RFC we realized having a client.users block would be nice to have, with some additional unrelated options being moved from the deprecated client.options config. * learn to spell	2024-03-06 09:34:27 -06:00
Mark Johnston	3e7191ccb7	Fix wording of ACL error message (#20071 ) When creating a job ACL, you must supply a job ID if you supply a namespace. If you try to give a namespace without a job ID, the error states "JobACL.JobID without Namespace" instead.	2024-03-05 16:49:28 -08:00
Phil Renaud	7820df53ca	[ui] Percy Stabilization (#20061 ) * Some actions and inline-chart stabilization * Weird little semicolon, are you my undoing?	2024-03-05 08:49:58 -05:00
Seth Hoenig	57bd39061b	exec2: implement a dynamic users pool (#20065 ) * exec2: implement a dynamic users pool This PR adds an implementation of a Pool from which dynamic users can be allocated on behalf of tasks making use of an upcoming feature of Nomad client (dynamic users). A task hook and client plumbing, etc. will be in follow up PRs. * no need for randomness assertion	2024-03-05 07:35:20 -06:00
Seth Hoenig	06a4fcb7d5	build: update the actions/checkout version (#20067 )	2024-03-04 13:01:38 -06:00
Soren L. Hansen	96acddbc13	Avoid NPE in nomad/command/job_restart.go (#20049 ) stopAlloc() checks if an allocation represents a system job like this: ``` if alloc.Job.Type == api.JobTypeSystem { ... } ``` This caused the cli to crash: ``` ==> 2024-02-29T08:45:53+01:00: Restarting 2 allocations 2024-02-29T08:45:54+01:00: Rescheduling allocation "6a9da11a" for group "redacted-group" panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x2 addr=0x20 pc=0x10686affc] goroutine 36 [running]: github.com/hashicorp/nomad/command.(*JobRestartCommand).stopAlloc(0x14000b11040, {0x14000996dc0?, 0x0?}) github.com/hashicorp/nomad/command/job_restart.go:968 +0x25c github.com/hashicorp/nomad/command.(*JobRestartCommand).handleAlloc(0x14000b11040, {0x14000996dc0?, 0x0?}) github.com/hashicorp/nomad/command/job_restart.go:868 +0x34 github.com/hashicorp/nomad/command.(*JobRestartCommand).Run.(*JobRestartCommand).Run.func1.func2() github.com/hashicorp/nomad/command/job_restart.go:392 +0x28 github.com/hashicorp/go-multierror.(*Group).Go.func1() github.com/hashicorp/go-multierror@v1.1.1/group.go:23 +0x60 created by github.com/hashicorp/go-multierror.(*Group).Go in goroutine 1 github.com/hashicorp/go-multierror@v1.1.1/group.go:20 +0x84 ``` Attaching a debugger revealed that `alloc.Job` was set, but `alloc.Job.Type` was nil. After guarding the `.Type` check with a `alloc.Job.Type != nil`, it still crashed. This time, `alloc.Job` was nil. I was scrambling to get the job running again, so I didn't have the opportunity to find out why those values were nil, but this change ensures the CLI does not crash in these situations. Fixes #20048	2024-03-01 08:07:28 -06:00
Seth Hoenig	a66f7ba888	ci: update macos runners to macos-14 (apple silicon) (#20054 )	2024-02-29 14:31:59 -06:00
Seth Hoenig	4d83733909	tests: swap testify for test in more places (#20028 ) * tests: swap testify for test in plugins/csi/client_test.go * tests: swap testify for test in testutil/ * tests: swap testify for test in host_test.go * tests: swap testify for test in plugin_test.go * tests: swap testify for test in utils_test.go * tests: swap testify for test in scheduler/ * tests: swap testify for test in parse_test.go * tests: swap testify for test in attribute_test.go * tests: swap testify for test in plugins/drivers/ * tests: swap testify for test in command/ * tests: fixup some test usages * go: run go mod tidy * windows: cpuset test only on linux	2024-02-29 12:11:35 -06:00
Phil Renaud	c2fe51bf11	Fixes an issue where shift+num would not open an eval on the evaluations index table (#20047 )	2024-02-29 11:25:52 -06:00
James Rasell	8f3f2a8c5c	docs: fix autoscaler variable ACL policy example. (#20050 )	2024-02-29 15:44:29 +00:00
Jeff Boruszak	57af1cdcbf	docs: Consul Admin partition example (#20022 )	2024-02-28 09:04:04 -06:00
James Rasell	dfda021aaf	docs: add autoscaler ACL policy requirements. (#20041 ) Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2024-02-28 14:19:38 +00:00
Soren L. Hansen	14280e0820	Prevent NPE when service lacks identity (#19987 ) Fixes a null pointer exception if `Alloc.SignIdentities` was called for any service and any service lacked an identity. Fixes #19986	2024-02-22 09:01:06 -05:00
Luiz Aoqui	cce72cddfd	docs: add Autoscaler `query_window_offset` config (#19942 )	2024-02-20 17:01:30 -05:00
Michael Schurter	b3a4c80f8c	docs: fix s/envoy-bootstrap-error redirect (#20015 ) And cleanup whitespace	2024-02-20 12:26:26 -08:00
Mike Nomitch	18e5e168f4	Remove accidental console log in namespace test setup (#19874 )	2024-02-20 11:42:11 -08:00
Tim Gross	45b2c34532	cni: add DNS set by CNI plugins to task configuration (#20007 ) CNI plugins may set DNS configuration, but this isn't threaded through to the task configuration so that we can write it to the `/etc/resolv.conf` file as needed. Add the `AllocNetworkStatus` to the alloc hook resources so they're accessible from the taskrunner. Any DNS entries provided by the user will override these values. Fixes: https://github.com/hashicorp/nomad/issues/11102	2024-02-20 10:17:27 -05:00
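
The two CNI DNS commits in this log (#20007 and its follow-up #20189) describe a precedence rule: user-provided DNS entries override CNI values, and a DNS struct returned by CNI is ignored when it has no nameservers, so the runtime default (for example the one owned by `dockerd`) still applies. A minimal Go sketch of that rule follows. The `DNSConfig` type and the `pickDNS` function are hypothetical names invented for illustration; they are not Nomad's actual types or API.

```go
package main

import "fmt"

// DNSConfig is a hypothetical stand-in for a task's DNS settings.
// It is not the type Nomad uses internally.
type DNSConfig struct {
	Servers  []string
	Searches []string
	Options  []string
}

// pickDNS is a hypothetical helper that applies the precedence described
// in the commit messages:
//  1. DNS set by the user always wins.
//  2. DNS from CNI is used only if it carries at least one nameserver.
//  3. Otherwise nil is returned, meaning "leave DNS to the runtime".
func pickDNS(user, cni *DNSConfig) *DNSConfig {
	if user != nil {
		return user
	}
	// CNI may hand back a non-nil but empty struct; treat that as unset.
	if cni != nil && len(cni.Servers) > 0 {
		return cni
	}
	return nil
}

func main() {
	emptyCNI := &DNSConfig{}
	realCNI := &DNSConfig{Servers: []string{"10.0.0.2"}}
	user := &DNSConfig{Servers: []string{"1.1.1.1"}}

	fmt.Println(pickDNS(nil, emptyCNI) == nil)    // empty CNI struct is ignored
	fmt.Println(pickDNS(nil, realCNI) == realCNI) // CNI nameservers are used
	fmt.Println(pickDNS(user, realCNI) == user)   // user config overrides CNI
}
```

The point of the sketch is the `len(cni.Servers) > 0` check: testing only for a nil or absent value would accept the empty struct and mask the runtime's own DNS configuration, which is the regression #20189 describes.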


25668 Commits