Commit Graph

25517 Commits

Author SHA1 Message Date
James Rasell
a3a03dff78 acl: ensure auth method configs are correctly and fully hashed. (#19677) 2024-01-09 14:03:26 +00:00
dependabot[bot]
f3bc9c7c41 chore(deps): bump github.com/docker/docker (#19672) 2024-01-09 08:24:20 +00:00
Tim Gross
a399f16a31 docs: describe cgroup controller requirements (#19493)
Nomad can only use cgroups to control resource requirements if all the cgroups
controllers are actually enabled. Add this to our requirements documentation as
well as the impacted `exec` and `java` task drivers.
2024-01-08 10:01:14 -05:00
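Note on a399f16a31: the requirement described above can be checked by hand on a cgroups v2 host. The following is a minimal sketch, not Nomad's own check. The `requiredControllers` list and the `missingControllers` helper are assumptions made for illustration; the authoritative list is in the Nomad requirements documentation.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// requiredControllers is an illustrative list (an assumption), not taken
// from the Nomad documentation.
var requiredControllers = []string{"cpuset", "cpu", "io", "memory", "pids"}

// missingControllers returns the entries of required that do not appear in
// the space-separated contents of a cgroup.controllers file.
func missingControllers(contents string, required []string) []string {
	available := make(map[string]bool)
	for _, name := range strings.Fields(contents) {
		available[name] = true
	}
	var missing []string
	for _, name := range required {
		if !available[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// On cgroups v2 the root hierarchy lists its available controllers here.
	data, err := os.ReadFile("/sys/fs/cgroup/cgroup.controllers")
	if err != nil {
		fmt.Println("could not read cgroup.controllers:", err)
		return
	}
	if missing := missingControllers(string(data), requiredControllers); len(missing) > 0 {
		fmt.Println("missing controllers:", strings.Join(missing, ", "))
		return
	}
	fmt.Println("all listed controllers are available")
}
```

The parsing is kept separate from the file read so that it can be exercised on any platform with a plain string.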
am-ak
7dc82f233f [DOCS] Update docker.mdx (#19657)
Removed info regarding development of Nomad
2024-01-08 14:32:57 +00:00
James Rasell
fbea8d1051 server: Fix panic when validating non-service reschedule block. (#19652) 2024-01-08 14:14:00 +00:00
Shantanu Gadgil
6bbd3b0cec reschedule is at group level (#19653)
Co-authored-by: James Rasell <jrasell@hashicorp.com>
2024-01-08 10:54:52 +00:00
dependabot[bot]
398b5000c1 chore(deps): bump github.com/hashicorp/go-plugin from 1.4.10 to 1.6.0 (#19646)
Co-authored-by: James Rasell <jrasell@hashicorp.com>
2024-01-08 08:26:34 +00:00
James Rasell
ff2d0d6453 cli: Fix dummy FSM create to ensure snapshot state command works. (#19630)
The Nomad state store function was recently updated to validate
certain parameters, fixing a panic condition. This change meant
dummy FSM used for the snapshot state command was always failing
this validation and the command no longer worked.

This change adds the required parameter to pass validation and
therefore makes the CLI command functional again.
2024-01-05 16:00:24 +00:00
Marvin Chin
be8575a8a2 Fix server shutdown not waiting for worker run completion (#19560)
* Move group into a separate helper module for reuse

* Add shutdownCh to worker

The shutdown channel is used to signal that worker has stopped.

* Make server shutdown block on workers' shutdownCh

* Fix waiting for eval broker state change blocking indefinitely

There was a race condition in the GenericNotifier between the
Run and WaitForChange functions, where WaitForChange blocks
trying to write to a full unsubscribeCh, but the Run function never
reads from the unsubscribeCh as it has already stopped.

This commit fixes it by unblocking if the notifier has been stopped.

* Bound the amount of time server shutdown waits on worker completion

* Fix lostcancel linter error

* Fix worker test using unexpected worker constructor

* Add changelog

---------

Co-authored-by: Marvin Chin <marvinchin@users.noreply.github.com>
2024-01-05 08:45:07 -06:00
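Note on be8575a8a2: the race described above (a sender blocked on `unsubscribeCh` after the reader has stopped) is commonly avoided by selecting on a stop channel alongside the send. The sketch below shows that general pattern only. The `notifier` type, its `stopCh` field, and the `unsubscribe` method are hypothetical names and are not Nomad's `GenericNotifier` implementation.

```go
package main

import "fmt"

// notifier is a hypothetical stand-in, not Nomad's GenericNotifier.
type notifier struct {
	unsubscribeCh chan chan struct{}
	stopCh        chan struct{} // closed once the run loop has exited
}

// unsubscribe tries to hand ch to the run loop. Selecting on stopCh as well
// means the send cannot block forever once the loop has stopped.
func (n *notifier) unsubscribe(ch chan struct{}) bool {
	select {
	case n.unsubscribeCh <- ch:
		return true
	case <-n.stopCh:
		return false
	}
}

func main() {
	n := &notifier{
		unsubscribeCh: make(chan chan struct{}), // unbuffered, and nobody is reading
		stopCh:        make(chan struct{}),
	}
	close(n.stopCh) // simulate the run loop having already stopped
	fmt.Println("delivered:", n.unsubscribe(make(chan struct{})))
}
```

Without the `stopCh` case, the call in `main` would block indefinitely, which matches the symptom the commit message describes.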
James Rasell
5a00440b06 api: Fix operator snapshot API streaming. (#19608) 2024-01-05 14:33:39 +00:00
dependabot[bot]
37af843b01 chore(deps): bump github.com/opencontainers/runc from 1.1.8 to 1.1.10 (#19289) 2024-01-05 09:57:54 +00:00
dependabot[bot]
c2e6d8aee2 build(deps): bump github.com/containerd/containerd from 1.6.18 to 1.6.26 (#19531) 2024-01-05 09:29:14 +00:00
James Rasell
f3ed406b0f state: ensure the job submission table is persisted and restored. (#19605) 2024-01-05 08:12:27 +00:00
James Rasell
2abbd7e485 cli: fix operator snapshot save help output examples. (#19606) 2024-01-05 07:43:12 +00:00
Phil Renaud
a5881963dd Error message typo fix: Filed to Failed (#19611) 2024-01-04 21:56:23 -05:00
Phil Renaud
16876697a1 [ui] Adds group-name tooltips to deploying and steady-state job panels (#19601)
* Adds group-name tooltips to deploying and steady-state job panels

* Default tooltip text for mirage edge cases
2024-01-04 13:10:37 -05:00
Phil Renaud
75b830ef04 [ui] Changelog for multi-line variables (#19600)
* Changelog for multi-line variables

* Multi-entry changelog
2024-01-04 12:00:50 -05:00
Seth Hoenig
4b3ee77d6b docs: update raw_exec driver docs and 1.7 upgrade notes (#19598) 2024-01-04 08:26:46 -06:00
Seth Hoenig
ccfb13a72d e2e: add test for raw_exec memory_max configuration (#19596)
* e2e: add test for raw_exec memory_max configuration

* docs: note raw_exec supports memory_max in resources documentation
2024-01-04 08:25:56 -06:00
Piotr Kazmierczak
aa197cf824 e2e: pass Nomad address to Consul WI test (#19603) 2024-01-04 08:52:39 +01:00
Phil Renaud
89cceebb91 [ui] Multi-line variable values and helios upgrades generally (#19544)
* Multi-line variable values and helios upgrades generally

* Variables page titles and actions restyle

* Hacky fix to keyboard shortcut otherwise bumping space on shift

* Related entities heliosified

* Namespace and path fields heliosed

* Paths table heliosified

* Variable view table

* Fixups after design discussion

* Monospaced editing

* De-commented template placeholder

* Acceptance tests updated for helios components across variables

* Tests helios'd in variable-form-test

* PR suggestions
2024-01-03 15:54:22 -05:00
Marvin Chin
d75293d2ab Add OOM detection for exec driver (#19563)
* Add OomKilled field to executor proto format

* Teach linux executor to detect and report OOMs

* Teach exec driver to propagate OOMKill information

* Fix data race

* use tail /dev/zero to create oom condition

* use new test framework

* minor tweaks to executor test

* add cl entry

* remove type conversion

---------

Co-authored-by: Marvin Chin <marvinchin@users.noreply.github.com>
Co-authored-by: Seth Hoenig <shoenig@duck.com>
2024-01-03 09:50:27 -06:00
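Note on d75293d2ab: one way an OOM kill can be observed on cgroups v2 is through the `oom_kill` counter in a cgroup's `memory.events` file. The sketch below parses that counter from the file's text. It is an assumption that this resembles what the executor does; the commit message does not say which mechanism is used, and `oomKillCount` is a hypothetical helper.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// oomKillCount extracts the oom_kill counter from the text of a cgroups v2
// memory.events file. It returns 0 if the key is absent.
func oomKillCount(events string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(events))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 2 && fields[0] == "oom_kill" {
			return strconv.ParseUint(fields[1], 10, 64)
		}
	}
	return 0, sc.Err()
}

func main() {
	// Sample contents in the "key value" layout used by memory.events.
	sample := "low 0\nhigh 0\nmax 12\noom 1\noom_kill 1\n"
	count, err := oomKillCount(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("task was OOM killed:", count > 0)
}
```

A caller would compare the counter after the task exits against zero (or against an earlier reading) to decide whether to report an OOM kill.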
Tim Gross
f2630add91 acl: remove timestamps from WhoAmI response (#19578)
In Nomad 1.7 we updated our JWT library to go-jose, but this changed the wire
format of the embedded struct we have in the `IdentityClaims` struct that we
return as part of the `WhoAmI` RPC response. This wasn't originally intended to
be sent over the wire but other changes in Nomad 1.5+ added a caller to the
client. The library change causes a deserialization error on Nomad 1.5 and 1.6
clients, which prevents access to Nomad Variables and SD via template blocks.

Removed the incompatible fields from the response, which are unused by any
current caller. In a future version of Nomad, we'll likely remove the `WhoAmI`
callers from the client in favor of using the public keys the clients have to
check auth.

Fixes: https://github.com/hashicorp/nomad/issues/19555
2024-01-03 08:24:38 -05:00
James Rasell
91cba75f5c copywrite: fix and add copywrite config enterprise comments. (#19590)
Nomad CI checks for copywrite headers using multiple config files
for specific exemption paths. This means the top-level config file
does not take effect when running the copywrite script within
these sub-folders. Exempt files therefore need to be added to the
sub-config files, along with the top level.
2024-01-03 08:58:53 +00:00
Piotr Kazmierczak
a87aa71f55 e2e: fix typo in Consul e2e (#19589) 2024-01-03 09:34:38 +01:00
Tim Gross
e7ca2b51ad vault: ignore allow_unauthenticated config if identity is set (#19585)
When the server's `vault` block has a default identity, we don't check the
user's Vault token (and in fact, we warn them on job submit if they've provided
one). But the validation hook still checks for a token if
`allow_unauthenticated` is set to true. This is a misconfiguration but there's
no reason for Nomad not to do the expected thing here.

Fixes: https://github.com/hashicorp/nomad/issues/19565
2024-01-02 16:46:34 -05:00
Luiz Aoqui
cd8a03431c docs: add scale_in_protection to AWS Autoscaler (#19546)
Document new `scale_in_protection` configuration of the AWS ASG
Autoscaler target plugin.
2024-01-02 14:48:56 -05:00
Luiz Aoqui
0bef6f05a2 docs: add note about * namespace on autoscaling (#19547)
Explain the behaviour when the wildcard namespace value `*` is used to
configure the Nomad Autoscaler agent.
2024-01-02 14:48:20 -05:00
Matt Robenolt
656bb5cafa drivers/executor: set oom_score_adj for raw_exec (#19515)
* drivers/executor: set oom_score_adj for raw_exec

This might not be wholly true since I don't know all configurations of
Nomad, but in our use cases, we run some of our tasks as `raw_exec` for
reasons.

We observed that our tasks were running with `oom_score_adj = -1000`,
which prevents them from being OOM'd. This value is being inherited from
the nomad agent parent process, as configured by systemd.

Similar to #10698, we also were shocked to have this value inherited
down to every child process and believe that we should also set this
value to 0 explicitly.

I have no idea if there are other paths that might leverage this or
other ways that `raw_exec` can manifest, but this is how I was able to
observe and fix in one of our configurations.

In production, we have been running our tasks wrapped in a script that
does: `echo 0 > /proc/self/oom_score_adj` to avoid this issue.

* drivers/executor: minor cleanup of setting oom adjustment

* e2e: add test for raw_exec oom adjust score

* e2e: set oom score adjust to -999

* cl: add cl

---------

Co-authored-by: Seth Hoenig <shoenig@duck.com>
2024-01-02 13:35:09 -06:00
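Note on 656bb5cafa: the shell workaround quoted above, `echo 0 > /proc/self/oom_score_adj`, has a direct Go equivalent. This is a minimal sketch of that write, assuming Linux; `setOOMScoreAdj` is a hypothetical helper and not the function added by the commit.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// setOOMScoreAdj writes score to the given oom_score_adj file. The path is a
// parameter so the function can also be pointed at another process's entry.
func setOOMScoreAdj(path string, score int) error {
	// The kernel accepts values from -1000 (never kill) to 1000.
	if score < -1000 || score > 1000 {
		return fmt.Errorf("oom_score_adj %d out of range [-1000, 1000]", score)
	}
	return os.WriteFile(path, []byte(strconv.Itoa(score)), 0o644)
}

func main() {
	// Linux only: reset the current process's adjustment to the default of 0,
	// so children no longer inherit a protective value from the parent.
	if err := setOOMScoreAdj("/proc/self/oom_score_adj", 0); err != nil {
		fmt.Println("could not set oom_score_adj:", err)
	}
}
```

Raising the value is generally permitted for an unprivileged process, while lowering it may require additional privileges, so the error path in `main` matters in practice.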
Seth Hoenig
c06f804cea build: make copywrite thing happy (#19577) 2024-01-02 13:33:45 -06:00
Luiz Aoqui
7eecca65ec docs: add autoscaler AWS retry_attempts config (#19549)
Document the Nomad Autoscaler AWS target plugin config `retry_attempts`.
2024-01-02 14:08:10 -05:00
Luiz Aoqui
56b1bf3240 docs: add policy_id and target_name metric labels (#19551) 2024-01-02 14:06:37 -05:00
Luiz Aoqui
1694e69b77 docs: clarify the behaviour of lower_bound and upper_bound (#19552) 2024-01-02 14:06:07 -05:00
hc-github-team-es-release-engineering
a4ecc2fbc8 Merge pull request #19283 from hashicorp/RELENG-960-EOY-license-fixes
[DO NOT MERGE UNTIL EOY] update year in LICENSE and copywrite files
2024-01-02 09:38:54 -08:00
Seth Hoenig
23e5ffbfd0 build: bump setup-golang action version to v2 (#19568) 2024-01-02 09:41:50 -06:00
Luiz Aoqui
09731442e4 docs: add node_pool autoscaler node selector (#19548)
Document the `node_pool` node selector configuration.
2024-01-02 10:19:58 -05:00
Piotr Kazmierczak
bb3d2227a2 e2e: add a test for checking default WI Consul workflow for services and tasks (#19500) 2024-01-02 16:02:32 +01:00
James Rasell
76ba3e10e7 docs: add Nomad Autoscaler HA configuration details. (#19010)
Co-authored-by: Piotr Kazmierczak <470696+pkazmierczak@users.noreply.github.com>
2023-12-27 08:00:07 +00:00
Mike Nomitch
dd15bdff9c Adds vault role to JWT claims if specified in jobspec (#19535) 2023-12-20 15:51:34 -08:00
Piotr Kazmierczak
84115d732d docs: correct Nomad Autoscaler example link in HA vars documentation (#19537) 2023-12-20 16:26:35 +01:00
Phil Renaud
005147f850 [ui] Mask token secret when logged in (#19529)
* Sign-in page now hides token secret by default (toggleable) and updates components to Helios

* General helios-ification

* All the notifications get dismissal buttons

* token-details grid for spacing
2023-12-20 10:04:53 -05:00
Phil Renaud
e26c2e243c [ui] node eligibility taken into consideration when clients list filtered to "ready" (#18607)
* node eligibility taken into consideration when clients list filtered to 'ready'

* A working draft of complex positive querying

* tags and filter badge

* CompositeStatus -> Status

* Buttons within a Helios SegmentedGroup

* Convert the other dropdowns to helios on clients index

* A bunch of client index test fixes

* Remaining clients list acceptance tests for State facet modified
2023-12-19 16:40:56 -05:00
Luiz Aoqui
e4e70b086a ci: run linter in ./api package (#19513) 2023-12-19 15:59:47 -05:00
Luiz Aoqui
95766aaa1b docs: add Submission parameter to job update (#19516) 2023-12-19 10:09:16 -05:00
Luiz Aoqui
859606a54a consul: fix parsing of service.cluster field (#19510) 2023-12-19 09:55:41 -05:00
dependabot[bot]
b2f640346d build(deps): bump golang.org/x/crypto from 0.14.0 to 0.17.0 (#19514) 2023-12-19 11:17:48 +00:00
Etienne Bruines
f18d5c7c32 docs: fix migration to workload identity links (#19508)
Fixes #19507
2023-12-18 21:27:38 -05:00
Luiz Aoqui
dfce76e511 ui: fix AllocationRow for job without action (#19505)
The allocation table header sometimes conditionally renders the
`Actions` table column, but the allocation row would render it
unconditionally, resulting in broken tables when rendering allocations
for jobs without actions, where rows had more columns than the header.

Also fix the conditional class for the deployments allocation table to
read `length` from the right value.
2023-12-18 11:30:20 -05:00
Phil Renaud
7a87049eab Merge pull request #18823 from Sanskar531/ui-logs-disabled-message
UI: Show message for when log collection is disabled
2023-12-18 09:20:51 -05:00
Sanskar Gauchan
e0e8357661 Merge branch 'hashicorp:main' into ui-logs-disabled-message 2023-12-16 10:49:26 +11:00