Commit Graph

4420 Commits

Author SHA1 Message Date
Tim Gross
30bc456f03 logs: allow disabling log collection in jobspec (#16962)
Some Nomad users ship application logs out-of-band via syslog. For these users
having `logmon` (and `docker_logger`) running is unnecessary overhead. Allow
disabling the logmon and pointing the task's stdout/stderr to /dev/null.

This changeset is the first of several incremental improvements to log
collection short of full-on logging plugins. The next step will likely be to
extend the internal-only task driver configuration so that cluster
administrators can turn off log collection for the entire driver.

---

Fixes: #11175

Co-authored-by: Thomas Weber <towe75@googlemail.com>
2023-04-24 10:00:27 -04:00
Tim Gross
47b28a6468 docs: fix keyring path in install docs (#16946) 2023-04-20 16:20:39 -04:00
Luiz Aoqui
e169847a64 docs: add missing field Capabilities to Namespace API (#16931) 2023-04-19 08:14:36 -07:00
Luiz Aoqui
ba364fc400 docs: add missing API field JobACL and fix workload identity headers (#16930) 2023-04-19 08:12:58 -07:00
Chris van Meer
c7c8bd4ce2 Updates to the UI block (#16328)
1. On the Consul address, following the recommendation for the HTTPS
   API on port 8501.
2. Add the hint to use HEX values for the colors.
2023-04-18 18:28:17 -07:00
Tim Gross
bf07321eb1 license: show Terminated field in license get command (#16892) 2023-04-17 09:01:43 -04:00
Tim Gross
c3002db815 client: allow drain_on_shutdown configuration (#16827)
Adds a new configuration to clients to optionally allow them to drain their
workloads on shutdown. The client sends the `Node.UpdateDrain` RPC targeting
itself and then monitors the drain state as seen by the server until the drain
is complete or the deadline expires. If it loses connection with the server, it
will monitor local client status instead to ensure allocations are stopped
before exiting.
2023-04-14 15:35:32 -04:00
Michael Schurter
38d0a2f4ef docs: add node meta command docs (#16828)
* docs: add node meta command docs

Fixes #16758

* it helps if you actually add the files to git

* fix typos and examples vs usage
2023-04-12 15:29:33 -07:00
Tim Gross
6a90e8320f E2E: clarify drain -deadline and -force flag behaviors (#16868)
The `-deadline` and `-force` flag for the `nomad node drain` command only cause
the draining to ignore the `migrate` block's healthy deadline, max parallel,
etc. These flags don't have anything to do with the `kill_timeout` or
`shutdown_delay` options of the jobspec.

This changeset fixes the skipped E2E tests so that they validate the intended
behavior, and updates the docs for more clarity.
2023-04-12 15:27:24 -04:00
Tim Gross
504fdf0e43 docs: document signal handling (#16835)
Expand documentation about Nomad's signal handling behaviors, including removing
incorrect information about graceful client shutdowns.
2023-04-11 16:26:39 -04:00
Seth Hoenig
2c44cbb001 api: enable support for setting original job source (#16763)
* api: enable support for setting original source alongside job

This PR adds support for setting job source material along with
the registration of a job.

This includes a new HTTP endpoint and a new RPC endpoint for
making queries for the original source of a job. The
HTTP endpoint is /v1/job/<id>/submission?version=<version> and
the RPC method is Job.GetJobSubmission.

The job source (if submitted, and doing so is always optional), is
stored in the job_submission memdb table, separately from the
actual job. This way we do not incur overhead of reading the large
string field throughout normal job operations.

The server config now includes job_max_source_size for configuring
the maximum size the job source may be, before the server simply
drops the source material. This should help prevent Bad Things from
happening when huge jobs are submitted. If the value is set to 0,
all job source material will be dropped.

* api: avoid writing var content to disk for parsing

* api: move submission validation into RPC layer

* api: return an error if updating a job submission without namespace or job id

* api: be exact about the job index we associate a submission with (modify)

* api: reword api docs scheduling

* api: prune all but the last 6 job submissions

* api: protect against nil job submission in job validation

* api: set max job source size in test server

* api: fixups from pr
2023-04-11 08:45:08 -05:00
hashicorp-copywrite[bot]
f005448366 [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
Tim Gross
b8a472d692 ephemeral disk: migrate should imply sticky (#16826)
The `ephemeral_disk` block's `migrate` field allows for best-effort migration of
the ephemeral disk data to new nodes. The documentation says the `migrate` field
is only respected if `sticky=true`, but in fact if client ACLs are not set the
data is migrated even if `sticky=false`.

The existing behavior when client ACLs are disabled has existed since the early
implementation, so "fixing" that case now would silently break backwards
compatibility. Additionally, having `migrate` not imply `sticky` seems
nonsensical: it suggests that if we place on a new node we migrate the data but
if we place on the same node, we throw the data away!

Update so that `migrate=true` implies `sticky=true` as follows:

* The failure mode when client ACLs are enabled comes from the server not passing
  along a migration token. Update the server so that the server provides a
  migration token whenever `migrate=true` and not just when `sticky=true` too.
* Update the scheduler so that `migrate` implies `sticky`.
* Update the client so that we check for `migrate || sticky` where appropriate.
* Refactor the E2E tests to move them off the old framework and make the intention
  of the test more clear.
2023-04-07 16:33:45 -04:00
Tim Gross
37d1bfbd09 docs: remove reference to vSphere from CSI concepts docs (#16765)
The vSphere plugin is exclusive to k8s because it relies on k8s-APIs (and
crashes without them being present). Upstream unfortunately will not support
Nomad, so we shouldn't refer to it in our concept docs here.
2023-04-05 15:20:24 -04:00
James Rasell
4848bfadec cli: stream both stdout and stderr when following an alloc. (#16556)
This update changes the behaviour when following logs from an
allocation, so that both stdout and stderr files streamed when the
operator supplies the follow flag. The previous behaviour is held
when all other flags and situations are provided.

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2023-04-04 10:42:27 +01:00
Tim Gross
db2a4e579e docs: fix use of gpg to avoid teeing binary to terminal (#16767) 2023-04-03 10:54:21 -04:00
Tim Gross
d14f685f28 docs: fix install instructions for apt (#16764)
The workflow described in the docs for apt installation is deprecated. Update to
match the workflow described in the Tutorials and official packaging guide.
2023-04-03 10:06:59 -04:00
Daniel Bennett
e6da5c70dc Update enterprise licensing documentation (#16615)
updated various docs for new expiration behavior
and new command `nomad license inspect` to validate pre-upgrade
2023-03-30 16:40:19 -05:00
Horacio Monsalvo
5957880112 connect: add meta on ConsulSidecarService (#16705)
Co-authored-by: Sol-Stiep <sol.stiep@southworks.com>
2023-03-30 16:09:28 -04:00
Piotr Kazmierczak
3da1756dd0 acl: JWT changelog entry and typo fix 2023-03-30 09:40:11 +02:00
Piotr Kazmierczak
46ee09ca7f acl: JWT auth CLI (#16532) 2023-03-30 09:39:56 +02:00
Piotr Kazmierczak
ed068c0831 acl: JWT auth method 2023-03-30 09:39:56 +02:00
Max Fröhlich
cd4c3c9eb4 docs: mention Nomad Admission Control Proxy (#16702) 2023-03-28 15:18:26 -04:00
Tim Gross
43e25410d0 docs: clarify capabilities options for docker driver (#16693)
The `docker` driver cannot expand capabilities beyond the default set when the
task is a non-root user. Clarify this in the documentation of `allow_caps` and
update the `cap_add` and `cap_drop` to match the `exec` driver, which has more
clear language overall.
2023-03-28 13:32:08 -04:00
Tim Gross
cef87f054a docs: add notes about keyring to snapshot restore (#16663)
When cluster administrators restore from Raft snapshot, they also need to ensure the
keyring is in place. For on-prem users doing in-place upgrades this is less of a
concern but for typical cloud workflows where the whole host is replaced, it's
an important warning (at least until #14852 has been implemented).
2023-03-28 08:31:01 -04:00
Tim Gross
be0678bd84 docs: fix template retry attempts default documentation (#16667)
The configuration docs for `client.template.vault_retry`, `consul_retry`, and
`nomad_retry` incorrectly document the default number of attempts to be
unlimited (0). When we added these config blocks, we defaulted the fields to
`nil` for backwards compatibility, which causes them to fall back to the default
consul-template configuration values.
2023-03-28 08:27:06 -04:00
James Rasell
36825576b4 docs: fix-up legacy link in client config page. (#16678) 2023-03-28 09:32:34 +01:00
Tobias Birkefeld
39e7a7e01f docs: fix link of Read Stats API (#16673)
The former link results in a 404. Update the link to the correct developer docs.
2023-03-28 08:49:44 +01:00
ron-savoia
b84c455dbb docs: added section of needed ACL rules for Nomad UI (#16494) 2023-03-24 08:57:16 -04:00
Luiz Aoqui
fffdbdff06 cli: job restart command (#16278)
Implement the new `nomad job restart` command that allows operators to
restart allocations tasks or reschedule then entire allocation.

Restarts can be batched to target multiple allocations in parallel.
Between each batch the command can stop and hold for a predefined time
or until the user confirms that the process should proceed.

This implements the "Stateless Restarts" alternative from the original
RFC
(https://gist.github.com/schmichael/e0b8b2ec1eb146301175fd87ddd46180).
The original concept is still worth implementing, as it allows this
functionality to be exposed over an API that can be consumed by the
Nomad UI and other clients. But the implementation turned out to be more
complex than we initially expected so we thought it would be better to
release a stateless CLI-based implementation first to gather feedback
and validate the restart behaviour.

Co-authored-by: Shishir Mahajan <smahajan@roblox.com>
2023-03-23 18:28:26 -04:00
James Rasell
39ec124bb8 docs: detail support for Nomad checks in service block. (#16598) 2023-03-22 09:27:58 +01:00
Suselz
5309325621 Update csi_plugin.mdx (#16584)
Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-03-21 16:16:18 +01:00
James Rasell
96740b5392 docs: remove Java and Scala SDKs from supported list. (#16555) 2023-03-20 15:35:02 +01:00
Juana De La Cuesta
cc110f4cc7 Add -json flag to quota inspect command (#16478)
* Added  and  flag to  command

* cli[style]: small refactor to avoid confussion with tmpl variable

* Update inspect.mdx

* cli: add changelog entry

* Update .changelog/16478.txt

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Update command/quota_inspect.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

---------

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-03-20 10:40:51 +01:00
Juana De La Cuesta
26b4fcc2c9 cli: add -json and -t flags to quota status command (#16485)
* cli: add json and t flags to quota status command

* cli: add entry to changelog

* Update command/quota_status.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

---------

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-03-20 10:39:56 +01:00
Juana De La Cuesta
151147b718 cli: Add json and -t flags to server members command (#16444)
* cli: Add  and  flags to server members

* Update website/content/docs/commands/server/members.mdx

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Update website/content/docs/commands/server/members.mdx

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* cli: update the server memebers tests to use must

* cli: add flags addition to changelog

---------

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-03-20 10:39:24 +01:00
Adam Pugh
cd8615d9ec Spelling update (#16553)
updated propogating to propagating
2023-03-20 09:24:41 +01:00
Piotr Kazmierczak
b95b105288 cli: nomad login command should not require a -type flag and should respect default auth method (#16504)
nomad login command does not need to know ACL Auth Method's type, since all
method names are unique. 

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-03-17 19:14:28 +01:00
James Rasell
57a3cbe264 docs: add binding-rule selector escape example on Windows PS (#16273) 2023-03-17 15:13:35 +01:00
Michael Schurter
282e3bcfcc Enable ACLs on E2E test clients (#16530)
* e2e: uniformly enable acls across all agents

* docs: clarify that acls should be set everywhere
2023-03-16 14:22:41 -07:00
Michael Schurter
46ae1025bb docs: dispatch_payload and jobs api docs had some weirdness (#16514)
* docs: dispatch_payload docs had some weirdness

Docs said "Examples" when there was only 1 example. Not sure what the
floating "to" in the description was for.

* docs: missing a heading level on jobs api docs
2023-03-16 09:42:46 -07:00
Anthony
362f752b87 Updated trial license link and wording 2023-03-14 09:31:06 -04:00
Juana De La Cuesta
eaf22f2f68 cli: Add -json and -t flags to namespace status command (#16442)
* cli: Add  and  flag to namespace status command

* Update command/namespace_status.go

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* cli: update tests for namespace status command to use must

---------

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-03-14 14:23:04 +01:00
Tim Gross
101e5d0225 docs: clarify migration behavior under nomad alloc stop (#16468) 2023-03-14 09:00:29 -04:00
Luiz Aoqui
f2bfbfaf03 acl: update job eval requirement to submit-job (#16463)
The job evaluate endpoint creates a new evaluation for the job which is
a write operation. This change modifies the necessary capability from
`read-job` to `submit-job` to better reflect this.
2023-03-13 17:13:54 -04:00
Dao Thanh Tung
f3a527b26f doc: Update nomad fmt doc to run against non-deprecated HCL2 jobspec only (#16435)
Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>
2023-03-13 15:26:27 -04:00
Tim Gross
d0ddd5e84b acl: prevent privilege escalation via workload identity
ACL policies can be associated with a job so that the job's Workload Identity
can have expanded access to other policy objects, including other
variables. Policies set on the variables the job automatically has access to
were ignored, but this includes policies with `deny` capabilities.

Additionally, when resolving claims for a workload identity without any attached
policies, the `ResolveClaims` method returned a `nil` ACL object, which is
treated similarly to a management token. While this was safe in Nomad 1.4.x,
when the workload identity token was exposed to the task via the `identity`
block, this allows a user with `submit-job` capabilities to escalate their
privileges.

We originally implemented automatic workload access to Variables as a separate
code path in the Variables RPC endpoint so that we don't have to generate
on-the-fly policies that blow up the ACL policy cache. This is fairly brittle
but also the behavior around wildcard paths in policies different from the rest
of our ACL polices, which is hard to reason about.

Add an `ACLClaim` parameter to the `AllowVariableOperation` method so that we
can push all this logic into the `acl` package and the behavior can be
consistent. This will allow a `deny` policy to override automatic access (and
probably speed up checks of non-automatic variable access).
2023-03-13 11:13:27 -04:00
Juana De La Cuesta
712c669657 cli: add -json and -t flag for alloc checks command (#16405)
* cli: add -json flag to alloc checks for completion

* CLI: Expand test to include testing the json flag for allocation checks

* Documentation: Add the checks command

* Documentation: Add example for alloc check command

* Update website/content/docs/commands/alloc/checks.mdx

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* CLI: Add template flag to alloc checks command

* Update website/content/docs/commands/alloc/checks.mdx

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* CLI: Extend test to include -t flag for alloc checks

* func: add changelog for added flags to alloc checks

* cli[doc]: Make usage section on alloc checks clearer

* Update website/content/docs/commands/alloc/checks.mdx

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>

* Delete modd.conf

* cli[doc]: add -t flag to command description for alloc checks

---------

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
Co-authored-by: Juanita De La Cuesta Morales <juanita.delacuestamorales@juanita.delacuestamorales-LHQ7X0QG9X>
2023-03-10 16:58:53 +01:00
Luiz Aoqui
4fdb5c477e cli: remove hard requirement on list-jobs (#16380)
Most job subcommands allow for job ID prefix match as a convenience
functionality so users don't have to type the full job ID.

But this introduces a hard ACL requirement that the token used to run
these commands have the `list-jobs` permission, even if the token has
enough permission to execute the basic command action and the user
passed an exact job ID.

This change softens this requirement by not failing the prefix match in
case the request results in a permission denied error and instead using
the information passed by the user directly.
2023-03-09 15:00:04 -05:00
Bryce Kalow
1dd320387a docs: update content-conformance package (#16412) 2023-03-09 12:47:46 -06:00