Commit Graph

3997 Commits

Author SHA1 Message Date
Luiz Aoqui
7edf1885c4 docs: fix link and add note about Nomad v1.3.0 on raft v3 upgrade (#12378) 2022-03-25 10:11:46 -04:00
dgotlieb
deedd790ce Add grpc and http2 listeners to gateway docs (#12367)
Stating at Nomad version 1.2.0 `grpc` and `http2` [protocols are supported](https://github.com/hashicorp/nomad/pull/11187)
2022-03-24 17:09:19 -04:00
Seth Hoenig
8e77776c20 Merge pull request #12274 from hashicorp/f-cgroupsv2
client: enable cpuset support for cgroups.v2
2022-03-24 14:22:54 -05:00
Seth Hoenig
c27af79add client: cgroups v2 code review followup 2022-03-24 13:40:42 -05:00
Tim Gross
a7e5df8381 csi: add -secret and -parameter flag to volume snapshot create (#12360)
Pass-through the `-secret` and `-parameter` flags to allow setting
parameters for the snapshot and overriding the secrets we've stored on
the CSI volume in the state store.
2022-03-24 10:29:50 -04:00
Seth Hoenig
5da1a31e94 client: enable support for cgroups v2
This PR introduces support for using Nomad on systems with cgroups v2 [1]
enabled as the cgroups controller mounted on /sys/fs/cgroups. Newer Linux
distros like Ubuntu 21.10 are shipping with cgroups v2 only, causing problems
for Nomad users.

Nomad mostly "just works" with cgroups v2 due to the indirection via libcontainer,
but not so for managing cpuset cgroups. Before, Nomad has been making use of
a feature in v1 where a PID could be a member of more than one cgroup. In v2
this is no longer possible, and so the logic around computing cpuset values
must be modified. When Nomad detects v2, it manages cpuset values in-process,
rather than making use of cgroup heirarchy inheritence via shared/reserved
parents.

Nomad will only activate the v2 logic when it detects cgroups2 is mounted at
/sys/fs/cgroups. This means on systems running in hybrid mode with cgroups2
mounted at /sys/fs/cgroups/unified (as is typical) Nomad will continue to
use the v1 logic, and should operate as before. Systems that do not support
cgroups v2 are also not affected.

When v2 is activated, Nomad will create a parent called nomad.slice (unless
otherwise configured in Client conifg), and create cgroups for tasks using
naming convention <allocID>-<task>.scope. These follow the naming convention
set by systemd and also used by Docker when cgroups v2 is detected.

Client nodes now export a new fingerprint attribute, unique.cgroups.version
which will be set to 'v1' or 'v2' to indicate the cgroups regime in use by
Nomad.

The new cpuset management strategy fixes #11705, where docker tasks that
spawned processes on startup would "leak". In cgroups v2, the PIDs are
started in the cgroup they will always live in, and thus the cause of
the leak is eliminated.

[1] https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html

Closes #11289
Fixes #11705 #11773 #11933
2022-03-23 11:35:27 -05:00
Tim Gross
879e137457 drainer: defer CSI plugins until last (#12324)
When a node is drained, system jobs are left until last so that
operators can rely on things like log shippers running even as their
applications are getting drained off. Include CSI plugins in this set
so that Controller plugins deployed as services can be handled as
gracefully as Node plugins that are running as system jobs.
2022-03-22 10:26:56 -04:00
Luiz Aoqui
eca4ac67f3 cli: display Raft version in server members (#12317)
The previous output of the `nomad server members` command would output a
column named `Protocol` that displayed the Serf protocol being currently
used by servers.

This is not a configurable option, so it holds very little value to
operators. It is also easy to confuse it with the Raft Protocol version,
which is configurable and highly relevant to operators.

This commit replaces the previous `Protocol` column with the new `Raft
Version`. It also updates the `-detailed` flag to be called `-verbose`
so it matches other commands. The detailed output now also outputs the
same information as the standard output with the addition of the
previous `Protocol` column and `Tags`.
2022-03-17 14:15:10 -04:00
Luiz Aoqui
81687c1ce5 api: add related evals to eval details (#12305)
The `related` query param is used to indicate that the request should
return a list of related (next, previous, and blocked) evaluations.

Co-authored-by: Jasmine Dahilig <jasmine@hashicorp.com>
2022-03-17 13:56:14 -04:00
Luiz Aoqui
dfe520a9d8 server: transfer leadership in case of error (#12293)
When a Nomad server becomes the Raft leader, it must perform several
actions defined in the establishLeadership function. If any of these
actions fail, Raft will think the node is the leader, but it will not
actually be able to act as a Nomad leader.

In this scenario, leadership must be revoked and transferred to another
server if possible, or the node should retry the establishLeadership
steps.
2022-03-17 11:10:57 -04:00
Tim Gross
d371f456dc docs: clarify restart inheritance and add examples (#12275)
Clarify the behavior of `restart` inheritance with respect to Connect
sidecar tasks. Remove incorrect language about the scheduler being
involved in restart decisions. Try to make the `delay` mode
documentation more clear, and provide examples of delay vs fail.
2022-03-14 15:49:08 -04:00
Luiz Aoqui
ebbbedd984 docs: initial docs for the new API features (#12094) 2022-03-14 10:58:42 -04:00
Luiz Aoqui
ddbbda6561 api: apply consistent behaviour of the reverse query parameter (#12244) 2022-03-11 19:44:52 -05:00
Luiz Aoqui
4a21dbcfaa docs: add namespace param to job parse API (#12258) 2022-03-10 16:35:07 -05:00
Tim Gross
a9d64b8e3e docs: add note about docker DNS config when using bridge mode (#12229)
The Docker DNS configuration options are not compatible with a
group-level network in `bridge` mode. Warn users about this in the
Docker task configuration docs.
2022-03-08 11:59:20 -05:00
Merlin Scholz
6707062b0d docs: elaborate on networking issues with firewalld (#12214) 2022-03-08 09:49:29 -05:00
Mike Nomitch
5c6f9a0570 Merge pull request #12192 from hashicorp/website/add-new-tools
Add openapi and caravan to tools page
2022-03-07 11:21:24 -08:00
Ignacio Torres Masdeu
d83ea30ff9 docs: fix examples for set_contains_all and set_contains_any (#12093) 2022-03-07 13:55:57 -05:00
Michael Schurter
7c21709e9e Merge pull request #12138 from jorgemarey/f-ns-meta
Add metadata to namespaces
2022-03-07 10:19:33 -08:00
Tim Gross
bc40222e3e csi: add pagination args to volume snapshot list (#12193)
The snapshot list API supports pagination as part of the CSI
specification, but we didn't have it plumbed through to the command
line.
2022-03-07 12:19:28 -05:00
Tim Gross
711a9d9a8f csi: volume snapshot list plugin option is required (#12197)
The RPC for listing volume snapshots requires a plugin ID. Update the
`volume snapshot list` command to find the specific plugin from the
provided prefix.
2022-03-07 09:58:29 -05:00
Michael Schurter
c5922f27d1 docs: add meta to namespace docs 2022-03-04 14:18:57 -08:00
Mike Nomitch
dbc646d7c0 Updated OpenAPI info on tools page
Co-authored-by: Derek Strickland <1111455+DerekStrickland@users.noreply.github.com>
2022-03-04 12:54:08 -08:00
Mike Nomitch
432293f72f Add openapi and caravan to tools page 2022-03-04 09:56:21 -06:00
James Rasell
180bc01d81 docs: add note regarding HCLv2 func and interpolation. 2022-03-04 12:06:25 +01:00
Michael Schurter
a9f1dbefe4 Merge pull request #10808 from hashicorp/f-curl
cli: add operator api command
2022-03-02 10:12:16 -08:00
Michael Schurter
3020b4e851 docs: add op api examples 2022-03-01 17:15:26 -08:00
Michael Schurter
a1000ee5b8 docs: add op api examples 2022-03-01 17:12:58 -08:00
Michael Schurter
ed95316bdf docs: add op api options 2022-03-01 16:43:53 -08:00
Ashlee M Boyer
2f14ceef05 docs: Fixing path for autoscaling/agent/source nav item (#12166) 2022-03-01 17:24:12 -05:00
Tim Gross
03a8d72dba CSI: implement support for topology (#12129) 2022-03-01 10:15:46 -05:00
Tim Gross
3fd968310d CSI: use HTTP headers for passing CSI secrets (#12144) 2022-03-01 08:47:01 -05:00
Tim Gross
05034a6cd0 docs: clarify that plugin commands are for CSI only (#12151) 2022-03-01 07:57:41 -05:00
Kevin Wang
636345a167 fix(website): hide version select on /plugins & /tools (#12145)
* fix(website/plugins): display version select

* fix: hide version select on `/tools` + `/plugins`
2022-02-28 12:44:08 -05:00
Jorge Marey
408a0edb17 Add metadata to namespaces 2022-02-27 09:09:10 +01:00
Michael Schurter
10b60d5983 docs: fix nav for op api 2022-02-25 16:21:14 -08:00
Seth Hoenig
3a6c824ea0 docs: clairfy advertise.rpc effect
The advertise.rpc config option is not intuitive. At first glance you'd
assume it works like advertise.http or advertise.serf, but it does not.

The current behavior is working as intended, but the documentation is
very hard to parse and doesn't draw a clear picture of what the setting
actually does.

Closes https://github.com/hashicorp/nomad/issues/11075
2022-02-25 16:02:29 -06:00
Michael Schurter
6bc962fe03 rename nomad curl to nomad operator api 2022-02-24 15:52:54 -08:00
Michael Schurter
99cbdf2afc cli: add curl command
Just a hackweek project at this point.
2022-02-24 15:52:54 -08:00
Zachary Shilton
a10af1bce7 chore: bump docs-page for code-block fix (#12117)
* chore: bump to latest docs-page

* fix: bump to react-consent-manager patch

* chore: bump to consent-manager with events dep

* chore: bump to stable consent-manager release
2022-02-24 15:34:54 -05:00
Luiz Aoqui
48184772b0 docs: add docs for the autoscaler on_error and on_check_error configuration (#12083) 2022-02-24 12:12:29 -05:00
Sander Mol
0ae76b1af4 add go-sockaddr templating support to nomad consul address (#12084) 2022-02-24 09:34:54 -05:00
Florian Apolloner
b84f70aef6 namespaces: allow enabling/disabling allowed drivers per namespace 2022-02-24 09:27:32 -05:00
Seth Hoenig
96a6f2c956 docs: emphasize snapshot before upgrading 2022-02-24 08:22:41 -06:00
Seth Hoenig
16efcf4e71 core: switch to go.etc.io/bbolt
This PR swaps the underlying BoltDB implementation from boltdb/bolt
to go.etc.io/bbolt.

In addition, the Server has a new configuration option for disabling
NoFreelistSync on the underlying database.

Freelist option: https://github.com/etcd-io/bbolt/blob/master/db.go#L81
Consul equivelent PR: https://github.com/hashicorp/consul/pull/11720
2022-02-23 14:26:41 -06:00
Tim Gross
7bcf0afd81 CSI: allow for concurrent plugin allocations (#12078)
The dynamic plugin registry assumes that plugins are singletons, which
matches the behavior of other Nomad plugins. But because dynamic
plugins like CSI are implemented by allocations, we need to handle the
possibility of multiple allocations for a given plugin type + ID, as
well as behaviors around interleaved allocation starts and stops.

Update the data structure for the dynamic registry so that more recent
allocations take over as the instance manager singleton, but we still
preserve the previous running allocations so that restores work
without racing.

Multiple allocations can run on a client for the same plugin, even if
only during updates. Provide each plugin task a unique path for the
control socket so that the tasks don't interfere with each other.
2022-02-23 15:23:07 -05:00
Charlie Voiselle
53d55ee98a Fixed scheduler config examples (#12049) 2022-02-23 12:58:29 -05:00
Mike Nomitch
950ccaf177 Merge pull request #12065 from hashicorp/docs-add-form-link
Adding link to interview form
2022-02-22 11:05:20 -08:00
Luiz Aoqui
a9407111aa docs: update link to mount in Docker task driver (#12101) 2022-02-22 13:39:49 -05:00
Michael Schurter
2411d3afd2 core: remove all traces of unused protocol version
Nomad inherited protocol version numbering configuration from Consul and
Serf, but unlike those projects Nomad has never used it. Nomad's
`protocol_version` has always been `1`.

While the code is effectively unused and therefore poses no runtime
risks to leave, I felt like removing it was best because:

1. Nomad's RPC subsystem has been able to evolve extensively without
   needing to increment the version number.
2. Nomad's HTTP API has evolved extensively without increment
   `API{Major,Minor}Version`. If we want to version the HTTP API in the
   future, I doubt this is the mechanism we would choose.
3. The presence of the `server.protocol_version` configuration
   parameter is confusing since `server.raft_protocol` *is* an important
   parameter for operators to consider. Even more confusing is that
   there is a distinct Serf protocol version which is included in `nomad
   server members` output under the heading `Protocol`. `raft_protocol`
   is the *only* protocol version relevant to Nomad developers and
   operators. The other protocol versions are either deadcode or have
   never changed (Serf).
4. If we were to need to version the RPC, HTTP API, or Serf protocols, I
   don't think these configuration parameters and variables are the best
   choice. If we come to that point we should choose a versioning scheme
   based on the use case and modern best practices -- not this 6+ year
   old dead code.
2022-02-18 16:12:36 -08:00