Commit Graph

4507 Commits

Author SHA1 Message Date
Tim Gross
acfb4e679a docs: expand pprof documentation on goroutine profiles (#18172) 2023-08-08 08:33:42 -04:00
Devashish Taneja
472693d642 server: add config to tune job versions retention. #17635 (#17939) 2023-08-07 14:47:40 -04:00
Tim Gross
902f640c80 docs: fix URL in agent pprof examples (#18142) 2023-08-03 16:05:53 -04:00
Karuppiah Natarajan
2fd508d4f1 docs: fix link for stopping an agent (#18130) 2023-08-02 11:51:45 -04:00
Tim Gross
4fb5bf9a16 cli: support wildcard namespace in alloc subcommands (#18095)
The alloc exec and filesystem/logs commands allow passing the `-job` flag to
select a random allocation. If the namespace for the command is set to `*`, the
RPC handler doesn't handle this correctly as it's expecting to query for a
specific job. Most commands handle this ambiguity by first verifying that only a
single object of the type in question exists (ex. a single node or job).

Update these commands so that when the `-job` flag is set we first verify
there's a single job that matches. This also allows us to extend the
functionality to allow for the `-job` flag to support prefix matching.

Fixes: #12097
2023-07-31 13:15:15 -04:00
Gunnar
76ebb3fe55 docs: added accessor info to Tuples in template.mdx (#18101) 2023-07-31 11:03:12 -04:00
Gerard Nguyen
9e98d694a6 feature: Add new field render_templates on restart block (#18054)
This feature is necessary when user want to explicitly re-render all templates on task restart.
E.g. to fetch all new secrets from Vault, even if the lease on the existing secrets has not been expired.
2023-07-28 11:53:32 -07:00
Luiz Aoqui
ee31916c3b cli: add help message for -consul-namespace (#18081)
Add missing help entry for the `-consul-namespace` flag in `nomad job
run`.
2023-07-28 10:22:59 -04:00
James Rasell
0a32d7ff5b docs: add allocation checks API documentation. (#18078) 2023-07-28 08:49:14 +01:00
Luiz Aoqui
55723e5a3b website: add Nomad Ops to Tools (#18006) 2023-07-24 11:32:54 -04:00
Luiz Aoqui
ce0f60fb68 metrics: report task memory_max value (#17938)
Add new `nomad.client.allocs.memory.max_allocated` metric to report the
value of the task `memory_max` resource value.
2023-07-19 16:50:12 -04:00
Nando
ca26673781 volume-status : show namespace the volume belongs to (#17911)
* volume-status : show namespace the volume belongs to
2023-07-19 16:36:51 -04:00
louievandyke
0d343f269a docs: updating to specify mTLS rpc endpoints (#17963) 2023-07-19 14:16:35 -04:00
Luiz Aoqui
54c45ed106 acl: fix parsing of policies with blocks w/o label
An ACL policy with a block without label generates unexpected results.
For example, a policy such as this:

```
namespace {
  policy = "read"
}
```

Is applied to a namespace called `policy` instead of the documented
behaviour of applying it to the `default` namespace.

This happens because of the way HCL1 decodes blocks. Since it doesn't
know if a block is expected to have a label it applies the `key` tag to
the content of the block and, in the example above, the first key is
`policy`, so it sets that as the `namespace` block label.

Since this happens internally in the HCL decoder it's not possible to
detect the problem externally.

Fixing the problem inside the decoder is challenging because the JSON
and HCL parsers generate different ASTs that makes impossible to
differentiate between a JSON tree from an invalid HCL tree within the
decoder.

The fix in this commit consists of manually parsing the policy after
decoding to clear labels that were not set in the file. This allows the
validation rules to consistently catch and return any errors, no matter
if the policy is an invalid HCL or JSON.
2023-07-19 10:38:08 -04:00
Seth Hoenig
1e7726ce93 docs: note windows requirement for workload identity (#17950)
Support for UDS sockets was added to Windows 10.
2023-07-14 12:51:25 -05:00
James Rasell
7f5d39fc02 docs: fix QEMU driver HCL plugin options example formatting. (#17930) 2023-07-14 10:22:45 +01:00
János Szathmáry
e53955bccc fix allowed metric values documentation for the Nomad APM plugin (#17928) 2023-07-13 18:25:40 -04:00
Seth Hoenig
8253ec86a2 docs: add plugin docs for pledge task driver (#17823)
* docs: add plugin docs for pledge task driver

Add pledge driver to the set of Community drivers.

* docs: cr feedback
2023-07-11 16:41:14 -05:00
Seth Hoenig
a4d0dcdc39 docs: update podman driver docs with v0.5.0 changes (#17824) 2023-07-11 13:51:25 -05:00
Seth Hoenig
80b9ff6436 docs: clarify using user on raw_exec driver (#17897) 2023-07-11 13:06:46 -05:00
Adrian Todorov
ef89b692d8 docs: clarify update stagger description and alternatives (#17896) 2023-07-11 13:53:21 -04:00
Seth Hoenig
01cb47be1b website: ignore user .env.* files (#17898) 2023-07-11 10:06:48 -05:00
Luiz Aoqui
99fb36e119 np: update docs and add test for nil lists (#17899)
Document and test that if a namespace does not provide an `allow` or
`deny` list than those are treated as `nil` and have a different
behaviour from an empty list (`[]string{}`).
2023-07-11 10:59:45 -04:00
Kévin Dunglas
79773a031f docs: fix typo in regex_replace.mdx (#17891) 2023-07-11 14:03:40 +01:00
Lance Haig
1541358ef3 Add the ability to customise the details of the CA (#17309)
Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-07-11 08:53:09 +01:00
Michael Schurter
5169950562 docs: v1.6.0 requires ipc_lock cap for mlock (#17881)
Fixes #17780
2023-07-10 11:53:07 -07:00
James Rasell
079f5d4d8d docs: detail Consul ACL token env var config option. (#17859) 2023-07-10 14:26:18 +01:00
Seth Hoenig
878e6b9cf4 website: use full registry name so it works with podman again (#17809) 2023-07-06 13:22:12 -05:00
am-ak
bb95009305 docs: fix broken link in security model docs (#17812)
correcting a broken link under "similar to consul" and correcting list formatting under "general mechanisms"
2023-07-06 10:01:36 -04:00
Patric Stout
ede662a828 metrics: add "total_ticks_count" for CPU metrics (#17579)
This counter tells you the total amount of ticks for that CPU
entry since the start of Nomad.
2023-07-05 10:28:55 -04:00
James Rasell
c883621a9e docs: fix up constraint jobspec HCL format. (#17795) 2023-07-04 13:33:46 +01:00
Tim Gross
63d6af6187 docs: clarify network topology requirements for clients (#17779)
The requirements for client-to-server and client-to-client topologies are not
well-documented in the production install requirements docs. Document that
clients make connections to servers (and not the other way around), and that
clients don't need to communicate with each other (with some exceptions).

Fixes: #17631
2023-06-30 10:46:29 -04:00
Tim Gross
fc611fc5f4 docs: clarify drain's -force flag behavior with system/CSI jobs (#17703)
If you use `nomad node drain -force`, the drain deadline is set to -1ns. If you
have not prevented system and CSI node plugin allocations from being drained
with `-ignore-system`, they will be immediately drained as well. This is
typically not safe for CSI node plugins.

Also fix some broken links.

Fixes: #17696
2023-06-23 16:38:11 -04:00
Luiz Aoqui
b7c2d65a0e build: add Docker image (#17017)
Co-authored-by: Daniel Kimsey <90741+dekimsey@users.noreply.github.com>
2023-06-23 15:57:09 -04:00
grembo
6f04b91912 Add disable_file parameter to job's vault stanza (#13343)
This complements the `env` parameter, so that the operator can author
tasks that don't share their Vault token with the workload when using 
`image` filesystem isolation. As a result, more powerful tokens can be used 
in a job definition, allowing it to use template stanzas to issue all kinds of 
secrets (database secrets, Vault tokens with very specific policies, etc.), 
without sharing that issuing power with the task itself.

This is accomplished by creating a directory called `private` within
the task's working directory, which shares many properties of
the `secrets` directory (tmpfs where possible, not accessible by
`nomad alloc fs` or Nomad's web UI), but isn't mounted into/bound to the
container.

If the `disable_file` parameter is set to `false` (its default), the Vault token
is also written to the NOMAD_SECRETS_DIR, so the default behavior is
backwards compatible. Even if the operator never changes the default,
they will still benefit from the improved behavior of Nomad never reading
the token back in from that - potentially altered - location.
2023-06-23 15:15:04 -04:00
Luiz Aoqui
f4c7182873 node pools: apply node pool scheduler configuration (#17598) 2023-06-21 20:31:50 -04:00
VishnuJin
102f73274b fingerprint: added windows os.build attribute to host fingerprint (#17576) 2023-06-21 10:53:50 -04:00
Luiz Aoqui
6c64847e1b np: scheduler configuration updates (#17575)
* jobspec: rename node pool scheduler_configuration

In HCL specifications we usually call configuration blocks `config`
instead of `configuration`.

* np: add memory oversubscription config

* np: make scheduler config ENT
2023-06-19 11:41:46 -04:00
Bruce Lok
8953e78dc4 fix typo peers.json (#17538) 2023-06-19 07:56:51 +01:00
Luiz Aoqui
4f7c38b2a7 node pools: namespace integration (#17562)
Add structs and fields to support the Nomad Pools Governance Enterprise
feature of controlling node pool access via namespaces.

Nomad Enterprise allows users to specify a default node pool to be used
by jobs that don't specify one. In order to accomplish this, it's
necessary to distinguish between a job that explicitly uses the
`default` node pool and one that did not specify any.

If the `default` node pool is set during job canonicalization it's
impossible to do this, so this commit allows a job to have an empty node
pool value during registration but sets to `default` at the admission
controller mutator.

In order to guarantee state consistency the state store validates that
the job node pool is set and exists before inserting it.
2023-06-16 16:30:22 -04:00
Tim Gross
6ea36f248e node pools: support node.pool constraint in scheduler (#17548)
Although most of the time jobs will be assigned to a single node pool, users may
want to set the node pool to "all" and then constraint to a subset of node
pools. Add support for setting a contraint like `${node.pool}`.
2023-06-16 13:31:46 -04:00
Tim Gross
2d7bead0ad docs: node pool specification (#17553) 2023-06-16 10:37:47 -04:00
Tim Gross
81f7e53e0d docs: fix broken link in variables spec page (#17554) 2023-06-15 15:57:00 -04:00
Tim Gross
288ff2f0c4 docs: add missing client.allocs metrics (#17540)
The docs were missing counter metrics emitted by the task runner around task
state changes.
2023-06-15 09:18:11 -04:00
Tim Gross
eee2315d5d docs: clarify node pool apply/delete behavior (#17529) 2023-06-14 15:58:53 -04:00
Tim Gross
068d0ea9af node pools: add pool as label on client metrics (#17528)
This changeset adds the node pool as a label anywhere we're already emitting
labels with additional information such as node class or ID about the client.
2023-06-14 15:58:38 -04:00
Tim Gross
0ac85db680 cli: fix missing -quiet flag for var init (#17526)
The `var init` command was intended to have support for a `-quiet` flag but it
was not documented and never parsed.
2023-06-14 14:52:46 -04:00
Tim Gross
6bd1ebed29 docs: note namespace apply/delete behaviors, fix metric (#17527)
This changeset includes some fixes to documentation discovered while working on
node pools, but we didn't want to include in the node pool PRs so they can get
backported easily:

* namespace apply/delete commands are forwarded to the authoritative region
* deleting a namespace requires there are no non-terminal jobs in any of the
  federated regions
* fixed a typo in the name of the `nomad.client.allocated.disk` metric
2023-06-14 14:52:06 -04:00
Tim Gross
0aeeaf1083 node pools: implement node pool init command (#17479)
Implement a `nomad node pool init` command that generates an example spec file
in either HCL or JSON format.
2023-06-13 14:51:29 -04:00
Luiz Aoqui
5db9e64cdd node pool: node pool upsert on multiregion node register (#17503)
When registering a node with a new node pool in a non-authoritative
region we can't create the node pool because this new pool will not be
replicated to other regions.

This commit modifies the node registration logic to only allow automatic
node pool creation in the authoritative region.

In non-authoritative regions, the client is registered, but the node
pool is not created. The client is kept in the `initialing` status until
its node pool is created in the authoritative region and replicated to
the client's region.
2023-06-13 11:28:28 -04:00