When accessing a region running a version of Nomad without node pools, an
error was thrown because the request was handled by the nodes endpoint, which
fails because it assumes `pools` is a node ID.
When a request is made to an RPC service that doesn't exist (for example, a
cross-region request from a newer version of Nomad to an older version that
doesn't implement the endpoint), the application should return an empty list
as well.
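As a rough sketch of that behaviour (the helper names and error-string matching below are assumptions for illustration, not Nomad's actual code), the forwarding path can translate an "endpoint not found" RPC failure into an empty result:

```go
package api

import "strings"

// isUnknownRPC reports whether an RPC error looks like the remote region
// simply does not implement the requested endpoint (for example, an older
// Nomad version). Matching on the error string is an assumption made for
// this illustration, not Nomad's actual helper.
func isUnknownRPC(err error) bool {
	if err == nil {
		return false
	}
	msg := err.Error()
	return strings.Contains(msg, "rpc: can't find service") ||
		strings.Contains(msg, "rpc: can't find method")
}

// listNodePools returns an empty list instead of an error when the remote
// region predates the node pools endpoint.
func listNodePools(call func() ([]string, error)) ([]string, error) {
	pools, err := call()
	if err != nil {
		if isUnknownRPC(err) {
			return []string{}, nil
		}
		return nil, err
	}
	return pools, nil
}
```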
The upgrade path to Nomad 1.6.0 requires canonicalizing the namespace in
order to set the default scheduler configuration values.
The previous implementation only canonicalized on namespace upsert
operations, which works for recently created namespaces since those Raft
transactions are reapplied on upgrade. But for older namespaces restored from
a snapshot, the code path did not canonicalize them, leaving the scheduler
configuration set to `nil`.
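A minimal sketch of the shape of the fix, using simplified stand-in types rather than Nomad's actual structs: canonicalization has to run on the snapshot-restore path as well, so namespaces that only exist in an old snapshot also get their defaults filled in.

```go
package state

// SchedulerConfiguration and Namespace are simplified stand-ins for the real
// structs; the field names and the "binpack" default are assumptions for
// illustration only.
type SchedulerConfiguration struct {
	SchedulerAlgorithm string
}

type Namespace struct {
	Name            string
	SchedulerConfig *SchedulerConfiguration
}

// Canonicalize fills in defaults for fields that older snapshots leave unset.
func (n *Namespace) Canonicalize() {
	if n.SchedulerConfig == nil {
		n.SchedulerConfig = &SchedulerConfiguration{
			SchedulerAlgorithm: "binpack", // hypothetical default
		}
	}
}

// restoreNamespace mimics the snapshot-restore path: calling Canonicalize
// here, and not only on upsert, is what closes the upgrade gap described
// above for namespaces that only exist in an old snapshot.
func restoreNamespace(ns *Namespace) *Namespace {
	ns.Canonicalize()
	return ns
}
```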
The CSI specification says that we "SHOULD" send no more than one in-flight
request per *volume* at a time, with an allowance for losing state
(e.g. leadership transitions) which the plugins "SHOULD" handle gracefully. We
mostly successfully serialize node and controller RPCs for the same volume,
except when Nomad clients are lost. (See also
https://github.com/container-storage-interface/spec/issues/512)
These concurrency requirements in the spec fall short because Storage Provider
APIs aren't necessarily safe to call concurrently on the same host even for
_different_ volumes. For example, concurrently attaching AWS EBS volumes to an
EC2 instance results in a race for device names, which results in failure to
attach (because the device name is taken already and the API call fails) and
confused results when releasing claims. So in practice many CSI plugins rely on
k8s-specific sidecars for serializing storage provider API calls globally. As a
result, we have to be much more conservative about concurrency in Nomad than the
spec allows.
This changeset includes four major changes to fix this:
* Add a serializer method to the CSI volume RPC handler. When the RPC handler
makes a destructive CSI Controller RPC, we send the RPC through this serializer
so that only one RPC is sent at a time; any other RPCs in flight will block
(see the sketch after this list).
* Ensure that requests go to the same controller plugin instance whenever
possible by sorting the plugin instances and picking the lowest client ID.
* Ensure that requests go to _healthy_ plugin instances only.
* Ensure that requests for controllers can go to a controller on any _live_
node, not just ones eligible for scheduling (which CSI controllers don't care
about)
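A minimal sketch of the serializer idea from the first bullet, using a hypothetical type and method name rather than Nomad's actual RPC handler: destructive controller RPCs funnel through a single lock so only one is in flight at a time and the rest block until it completes.

```go
package volumes

import (
	"context"
	"sync"
)

// controllerSerializer is an illustrative stand-in for the serializer
// described above; the real RPC handler in Nomad differs, this only
// demonstrates the idea.
type controllerSerializer struct {
	mu sync.Mutex
}

// serializedControllerRPC runs fn while holding the lock, so destructive
// controller RPCs are sent one at a time and any other callers block until
// the in-flight RPC completes.
func (s *controllerSerializer) serializedControllerRPC(ctx context.Context, fn func(context.Context) error) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	// Honor cancellation that happened while we were waiting for the lock.
	if err := ctx.Err(); err != nil {
		return err
	}
	return fn(ctx)
}
```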
Fixes: #15415
* Boot the user off the job if it gets deleted
* de-yoink
* watching the job watcher
* Unload record so history.back has to refire a (failing) request
* Acceptance tests for boot-out and notification
* e2e: add tests for using private registry with podman driver
This PR adds e2e tests that stand up a private Docker registry and have a
podman task run a container from an image in that private registry.
Tests
- user:password set in task config
- auth_soft_fail works for public images when auth is set in driver
- credentials helper is set in driver auth config
- config auth.json file is set in driver auth config
* packer: use nomad-driver-podman v0.5.0
* e2e: eliminate unnecessary chmod
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
* cr: no need to install nomad twice
* cl: no need to install docker twice
---------
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
Before this commit, the `cpu_total_compute` option was only used for
fingerprinting, but not for CPU stats on nodes or tasks. This meant that if
auto-detection failed, setting `cpu_total_compute` didn't resolve the issue.
This was most noticeable on ARM64, where auto-detection always failed.
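A minimal sketch of the intended behaviour, with hypothetical names standing in for Nomad's real fingerprinting and stats code: the configured `cpu_total_compute` override should win everywhere total compute is needed, not only during fingerprinting.

```go
package main

import "fmt"

// clientConfig is a hypothetical stand-in for the client configuration;
// CPUTotalCompute mirrors the cpu_total_compute option described above.
type clientConfig struct {
	CPUTotalCompute uint64
}

// detectTotalCompute stands in for auto-detection, which can fail
// (returning 0) on some platforms such as ARM64.
func detectTotalCompute() uint64 {
	return 0 // pretend detection failed
}

// totalCompute is used for both fingerprinting and node/task CPU stats,
// so the configured override takes effect in all of those places.
func totalCompute(cfg *clientConfig) uint64 {
	if cfg.CPUTotalCompute > 0 {
		return cfg.CPUTotalCompute
	}
	return detectTotalCompute()
}

func main() {
	cfg := &clientConfig{CPUTotalCompute: 4000}
	fmt.Println(totalCompute(cfg)) // 4000, even though detection failed
}
```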
ACL permissions for the search endpoints are done in three passes. The
first (the `sufficientSearchPerms` method) is for performance and coarsely
rejects requests based on the passed-in context parameter if the user has no
permissions to any object in that context. The second (the
`filteredSearchContexts` method) filters out contexts based on whether the user
has permissions either to the requested namespace or again by context (to catch
the "all" context). Finally, when iterating over the objects available, we do
the usual filtering in the iterator.
Internal testing found several bugs in this filtering:
* CSI plugins can be searched by any authenticated user.
* Variables can be searched if the user has `job:read` permissions to the
variable's namespace instead of `variable:list`.
* Variables cannot be searched by wildcard namespace.
This is an information leak of the plugin names and variable paths, which we
don't consider privileged information but intend to protect anyway.
This changeset fixes these bugs by ensuring CSI plugins are filtered in the
first- and second-pass ACL filters, and changes variables to check
`variable:list` in the second-pass filter unless the wildcard namespace is
passed (at which point we fall back to filtering in the iterator).
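A simplified sketch of the second-pass filter described above, with hypothetical helper names rather than Nomad's actual ACL API: variables require `variable:list` in the requested namespace unless the wildcard namespace is requested, in which case the decision is deferred to the per-object iterator.

```go
package search

// aclChecker is a hypothetical interface standing in for Nomad's ACL object;
// the method names here are illustrative only.
type aclChecker interface {
	AllowVariableList(namespace string) bool
	AllowPluginRead() bool
}

// filterSearchContexts keeps only the contexts the caller is allowed to
// search in the requested namespace. The wildcard namespace defers variable
// filtering to the per-object iterator, as described above.
func filterSearchContexts(acl aclChecker, namespace string, contexts []string) []string {
	allowed := make([]string, 0, len(contexts))
	for _, ctx := range contexts {
		switch ctx {
		case "variables":
			if namespace == "*" || acl.AllowVariableList(namespace) {
				allowed = append(allowed, ctx)
			}
		case "plugins":
			// CSI plugins must be filtered here too, not exposed to every
			// authenticated user.
			if acl.AllowPluginRead() {
				allowed = append(allowed, ctx)
			}
		default:
			allowed = append(allowed, ctx)
		}
	}
	return allowed
}
```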
Fixes: CVE-2023-3300
Fixes: #17906
An ACL policy with a block that has no label generates unexpected results.
For example, a policy such as this:
```
namespace {
  policy = "read"
}
```
is applied to a namespace called `policy` instead of the documented
behaviour of applying it to the `default` namespace.
This happens because of the way HCL1 decodes blocks. Since it doesn't know
whether a block is expected to have a label, it applies the `key` tag to the
content of the block; in the example above, the first key is `policy`, so it
sets that as the `namespace` block label.
Since this happens internally in the HCL decoder, it's not possible to detect
the problem externally.
Fixing the problem inside the decoder is challenging because the JSON and HCL
parsers generate different ASTs, which makes it impossible to differentiate a
JSON tree from an invalid HCL tree within the decoder.
The fix in this commit consists of manually parsing the policy after decoding
to clear labels that were not set in the file. This allows the validation
rules to consistently catch and return any errors, regardless of whether the
invalid policy is written in HCL or JSON.
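A rough sketch of that approach, simplified from the real change and using hypothetical type names: after decoding, re-parse the source with the HCL1 parser and clear any label the decoder filled in for a block that has no label in the source.

```go
package acl

import (
	"github.com/hashicorp/hcl"
	"github.com/hashicorp/hcl/hcl/ast"
)

// NamespacePolicy is a hypothetical stand-in for the decoded policy block.
type NamespacePolicy struct {
	Name   string `hcl:",key"`
	Policy string
}

// clearUnsetLabels re-parses the raw policy and blanks out any namespace
// label the decoder invented from the block contents rather than from an
// actual label in the source, so later validation can reject the policy.
// This only sketches the HCL side; the real fix also has to cope with the
// different AST produced for JSON input.
func clearUnsetLabels(raw string, namespaces []*NamespacePolicy) error {
	root, err := hcl.Parse(raw)
	if err != nil {
		return err
	}
	list, ok := root.Node.(*ast.ObjectList)
	if !ok {
		return nil
	}
	// Filter strips the matched "namespace" key, so a labelled block keeps
	// exactly one key (the label) and an unlabelled block keeps none.
	for i, item := range list.Filter("namespace").Items {
		if len(item.Keys) == 0 && i < len(namespaces) {
			namespaces[i].Name = ""
		}
	}
	return nil
}
```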
This adds a quick smoke test of our binaries to verify we haven't exceeded
the maximum GLIBC version (2.17) during linking, which would break our ability
to execute on EL7 machines.
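As a hedged sketch of what such a smoke test can look like (the real check likely lives in a CI script; this standalone Go version is only illustrative), scan the dynamic symbol table for GLIBC symbol versions newer than 2.17:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"regexp"
	"strconv"
)

// maxMinor is the newest GLIBC 2.x symbol version we allow; EL7 ships 2.17.
const maxMinor = 17

// checkBinary shells out to objdump and fails if the binary references any
// GLIBC_2.x symbol version newer than 2.17. It is a sketch of the kind of
// smoke test described above, not the script actually used in CI.
func checkBinary(path string) error {
	out, err := exec.Command("objdump", "-T", path).CombinedOutput()
	if err != nil {
		return fmt.Errorf("objdump failed: %w", err)
	}
	re := regexp.MustCompile(`GLIBC_2\.(\d+)`)
	for _, m := range re.FindAllSubmatch(out, -1) {
		minor, _ := strconv.Atoi(string(m[1]))
		if minor > maxMinor {
			return fmt.Errorf("%s requires GLIBC_2.%d, newer than 2.%d", path, minor, maxMinor)
		}
	}
	return nil
}

func main() {
	if err := checkBinary(os.Args[1]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```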
We cannot set a user for raw_exec tasks, because doing so does not work with
the 0700 root-owned client data directory that we set up in the e2e cluster in
accordance with the Nomad hardening guide.